Hi Michael,
I haven't done a lot of benchmarking on the various DAQ commands, but I might be able to shed a bit of light on the lower-level operation of some of them.
When you set up your hardware for a given task (SCAN_Setup, SCAN_Start), these commands call the driver, and the driver adds a bit of overhead to each of these calls. As a result, using these commands to "configure" the onboard registers takes the amount of time you are seeing.
Since the PCI-6014 has a DMA channel (it also works with interrupts), the card will automatically transfer data to PC memory. All the AI-Read is doing is copying data from the PC memory buffer that the DAQ board transfers into, to the LabVIEW or application data buffer. This command depends heavily on the state of the buffer, and it is a blocking call (it holds the driver until it gets what it was called for). As an example, if you tell it to read 1000 samples (Scans to Read parameter) and your PC memory buffer only has 10 samples, the command will wait until all 1000 samples are available and then transfer them.
What you can do to speed up your acquisition call (AI_Read) is to use the Scan Backlog parameter to monitor how much data is in the buffer and, on the next iteration of the loop, read only that amount of data. This means that the AI Read will not have to wait for data to fill the buffer; it will already be present. The command is then essentially a copy command. Copying memory will invariably take longer than writing to a register. Even if you are copying data in C code from one buffer to another (in PC memory), it will still have a decent amount of overhead compared with what we would expect.
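To illustrate the difference between a fixed-size blocking read and a backlog-sized read, here is a minimal sketch in Python. It is only a simulation with a made-up clock, not NI-DAQ code: the class SimulatedAcquisition, its methods, and the numbers (1 kHz sample rate, 10 ms of other work per loop iteration) are all assumptions chosen for illustration.

```python
class SimulatedAcquisition:
    """Toy model of a DMA-fed PC memory buffer (hypothetical, not an NI-DAQ API)."""

    def __init__(self, sample_rate_hz):
        self.sample_rate_hz = sample_rate_hz
        self.now_us = 0         # simulated clock, in microseconds
        self.samples_read = 0   # samples already copied out by read()

    def advance(self, microseconds):
        """Let simulated time pass (e.g. other work done in the loop)."""
        self.now_us += microseconds

    def backlog(self):
        """Samples acquired into PC memory but not yet read."""
        acquired = self.now_us * self.sample_rate_hz // 1_000_000
        return acquired - self.samples_read

    def read(self, count):
        """Blocking read of `count` samples; returns microseconds spent waiting."""
        waited_us = 0
        if count > self.backlog():
            # Block until the board has acquired enough samples (ceiling division).
            target = self.samples_read + count
            ready_us = -(-target * 1_000_000 // self.sample_rate_hz)
            waited_us = ready_us - self.now_us
            self.now_us = ready_us
        self.samples_read += count
        return waited_us


def run(fixed_count, iterations=5, work_us=10_000, rate_hz=1000):
    """Total wait time when reading `fixed_count` samples, or the backlog if None."""
    acq = SimulatedAcquisition(rate_hz)
    total_wait_us = 0
    for _ in range(iterations):
        acq.advance(work_us)  # loop body does 10 ms of other work
        count = fixed_count if fixed_count is not None else acq.backlog()
        total_wait_us += acq.read(count)
    return total_wait_us


print("fixed 1000-sample reads, total wait (us):", run(1000))
print("backlog-sized reads, total wait (us):", run(None))
```

With these assumed numbers, each fixed 1000-sample read finds only 10 samples waiting and blocks for the remaining 990 ms, while the backlog-sized read never blocks because it only asks for what is already in the buffer.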
Where NI has improved their driver is in the transfer of data from the onboard FIFO of the hardware to the PC memory. This is the real rate-determining step in data acquisition. However, I see your point if you are trying to reconfigure the board quickly or in a loop and it now takes longer. These delays can add up.
Bottom line: setting up and configuring your DAQ board will be quicker if you use register-level programming. However, controlling the transfer of data between the DAQ card and PC memory will generally be less efficient that way, unless you have optimized the transfer algorithms in your register-level code.
Ron
Applications Engineer
National Instruments