The only semi-official documentation about this comes with the LabVIEW GPU Toolkit. That thread also links to the knowledge base article on how to write your own GPU Toolkit extensions, and the archive at the end contains not only those semi-official headers but also some examples. This is as far as NI ever documented anything about this.
Short of getting user ag-huber-luebeck involved to share his findings about how he used all this, I think you will be hard pressed to find any other information. For my part, I have looked at it and tried to understand some of it, but that is as far as I got. Transferring large memory buffers between driver level and LabVIEW, in either direction, with as little performance loss as possible hasn't been a serious concern of mine so far, so I did not really look further into that part.
Rolf Kalbermatter
DEMO, Electronic and Mechanical Support department, room 36.LB00.390