04-04-2018 09:42 AM - edited 04-04-2018 09:49 AM
Well, the bad news is that the DAQ code you've *got* isn't particularly close to the code you *need*. Among the issues:
#'s 1 & 2 are a problem for the integrity of the data you collect. Those need fixing before you can even believe your results. #'s 3 & 4 prevent you from collecting a continuous stream of measurements and reduce the efficiency of the run-time operation. Probably also pretty important, but I'd be prioritizing data integrity first.
-Kevin P
04-04-2018 04:01 PM
This seems quite complicated for a short-term solution. Do you think that using a PCI card and, more importantly, separating the calculations into another loop could help me gain a bit of time?
For now I just need the acquisition to be a bit faster, not optimal.
Do you have any advice on this "short term" solution?
Thank you very much!
04-04-2018 05:23 PM - edited 04-04-2018 05:24 PM
Beware -- today's "short term" solution becomes next year's maintenance problem.
Yes, changing to PCIe might help speed things up. It surely wouldn't hurt.
Yes, moving calcs and processing into an independent loop will *definitely* help.
HOWEVER, these are both probably irrelevant if you don't first address the data integrity issues inherent in the DAQ code. I'm looking it over much more closely now...
I have to say, it appears much better than I originally thought. In fact, I think the tasks and data probably *are* sync'ed, now that I've more carefully traced through the clock and trigger terminal specs and signal connections. There are a couple of little things I'd do differently, but for the most part what you've got looks like it can work effectively. The CO and AI tasks are triggered by the start trigger signal generated by the AO task when it starts. The CI tasks use the CO pulse as their implicit timing signal. And the tasks are sequenced to make sure AO starts last. That all makes more sense now.
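In rough text form (using the Python nidaqmx API as a stand-in for the LabVIEW diagram -- device, channel, and terminal names plus rates are placeholders, and I've left out the CI tasks, which just take their implicit timing from the CO pulse), that scheme looks something like this:

```python
# Rough nidaqmx (Python) analogue of the trigger-based sync scheme described above.
# All names and numbers are placeholders, CI tasks omitted.
import numpy as np
import nidaqmx
from nidaqmx.constants import AcquisitionType

N_SAMPS, RATE = 1000, 10000.0   # samples per cycle and update rate (placeholders)
waveform = np.zeros(N_SAMPS)    # precomputed AO samples (placeholder)

with nidaqmx.Task() as co, nidaqmx.Task() as ai, nidaqmx.Task() as ao:
    co.co_channels.add_co_pulse_chan_freq("Dev1/ctr0", freq=RATE)
    co.timing.cfg_implicit_timing(sample_mode=AcquisitionType.FINITE,
                                  samps_per_chan=N_SAMPS)

    ai.ai_channels.add_ai_voltage_chan("Dev1/ai0")
    ai.timing.cfg_samp_clk_timing(RATE, sample_mode=AcquisitionType.FINITE,
                                  samps_per_chan=N_SAMPS)

    ao.ao_channels.add_ao_voltage_chan("Dev1/ao0")
    ao.timing.cfg_samp_clk_timing(RATE, sample_mode=AcquisitionType.FINITE,
                                  samps_per_chan=N_SAMPS)

    # CO and AI are armed first and wait for AO's start trigger...
    co.triggers.start_trigger.cfg_dig_edge_start_trig("/Dev1/ao/StartTrigger")
    ai.triggers.start_trigger.cfg_dig_edge_start_trig("/Dev1/ao/StartTrigger")
    co.start()
    ai.start()

    # ...and AO is started last, after its buffer has been written.
    ao.write(waveform, auto_start=False)
    ao.start()

    data = ai.read(number_of_samples_per_channel=N_SAMPS, timeout=10.0)
```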
I also see that the "pulse width" tasks are configured in an unconventional way where you aren't really concerned with cumulative time.
I would personally use CO.Pulse.Term as the Sample Clock 'source' input for both AO and AI but would (probably) set AO to use the rising (leading) edge and set AI to use the falling (trailing) edge. This gives the system just a little bit of time to respond to the AO value before the corresponding AI measurement is taken.
Once you do this, you'd no longer need to configure any triggering. You'd just need to make sure to start the CO task last, *after* writing your AO samples to the AO task buffer. None of the other tasks can do any sampling until you start the CO pulse task.
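Sketched the same way (Python nidaqmx stand-in, placeholder names and rates), the shared-clock version needs no trigger configuration at all:

```python
# Counter output provides a shared sample clock: AO updates on the rising edge,
# AI samples on the falling edge, and nothing runs until the CO task starts.
import numpy as np
import nidaqmx
from nidaqmx.constants import AcquisitionType, Edge

N_SAMPS, RATE = 1000, 10000.0   # samples per cycle and clock rate (placeholders)
waveform = np.zeros(N_SAMPS)    # precomputed AO samples (placeholder)

with nidaqmx.Task() as co, nidaqmx.Task() as ao, nidaqmx.Task() as ai:
    co.co_channels.add_co_pulse_chan_freq("Dev1/ctr0", freq=RATE, duty_cycle=0.5)
    co.timing.cfg_implicit_timing(sample_mode=AcquisitionType.FINITE,
                                  samps_per_chan=N_SAMPS)

    # AO updates on the rising (leading) edge of the counter pulse...
    ao.ao_channels.add_ao_voltage_chan("Dev1/ao0")
    ao.timing.cfg_samp_clk_timing(RATE, source="/Dev1/Ctr0InternalOutput",
                                  active_edge=Edge.RISING,
                                  sample_mode=AcquisitionType.FINITE,
                                  samps_per_chan=N_SAMPS)

    # ...and AI samples on the falling (trailing) edge, half a period later,
    # giving the system a little time to respond to each AO value.
    ai.ai_channels.add_ai_voltage_chan("Dev1/ai0")
    ai.timing.cfg_samp_clk_timing(RATE, source="/Dev1/Ctr0InternalOutput",
                                  active_edge=Edge.FALLING,
                                  sample_mode=AcquisitionType.FINITE,
                                  samps_per_chan=N_SAMPS)

    # Write the AO buffer, start the clocked tasks, then start the counter LAST.
    ao.write(waveform, auto_start=False)
    ao.start()
    ai.start()
    co.start()

    data = ai.read(number_of_samples_per_channel=N_SAMPS, timeout=10.0)
```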
You don't really need any of the DAQmx Wait Until Done calls.
Do you need to change the AO waveform on-the-fly? If not, you can build up your AO waveforms *before* the loop instead of inside it. It'll also help a little bit to call DAQmx Control Task with a "commit" action on all the tasks before entering the loop -- it makes the stop/restart cycle go faster. (Search for info on the "DAQmx State Model" for more info.) I'm not sure how much difference it makes with a USB device though.
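As a rough illustration of the commit idea (same Python nidaqmx stand-in; assume the task was fully configured elsewhere):

```python
# Commit once before the loop so the driver doesn't re-verify / re-reserve /
# re-program the hardware on every start.
import nidaqmx
from nidaqmx.constants import TaskMode

def run_cycles(task: nidaqmx.Task, num_cycles: int, samps_per_cycle: int):
    """Start/read/stop repeatedly; restart cost is lower because stop()
    only drops the task back to the committed state."""
    task.control(TaskMode.TASK_COMMIT)
    for _ in range(num_cycles):
        task.start()
        data = task.read(number_of_samples_per_channel=samps_per_cycle)
        task.stop()
        yield data
```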
By far, the biggest speed improvement will come from moving the processing into an independent loop. The next item of concern would be the 1 MHz implicit sample rate for the 2 CI tasks. Not sure if your USB board will support that data transfer rate. You might need an X-series PCIe board. (An older M-series PCI board probably wouldn't work due to the very small counter FIFO.)
-Kevin P
04-04-2018 05:30 PM
@Kevin_Price wrote:
Do you need to change the AO waveform on-the-fly? If not, you can build up your AO waveforms *before* the loop instead of inside it. It'll also help a little bit to call DAQmx Control Task with a "commit" action on all the tasks before entering the loop -- it makes the stop/restart cycle go faster. (Search for info on the "DAQmx State Model" for more info.) I'm not sure how much difference it makes with a USB device though.
All good suggestions! I just wanted to point out that it looks like the waveforms are only generated when that blue terminal on the left equals zero. If it were not hidden behind the stacked sequence we'd be able to see that it's the iteration terminal 🙂
04-30-2018 04:21 AM
Yes, that's the point. The compute scan VI runs only when i=0, so basically we do the computation only during the first iteration. Thus, I don't really see why we should move this part out of the loop. Thank you gregoryj
04-30-2018 04:33 AM
First of all, thank you very much for your suggestions.
Regarding the changes to the LabVIEW program, the question is whether moving computescan.vi out of the loop will really speed up the process, since it runs only once, when i=0?!
Besides, I made the other changes you suggested, but I'm still using my NI USB-6363 board. Under these conditions I don't see any speed improvement.
I have ordered an NI PCIe-6363 (X-series) card, so I am going to try the same thing when it's delivered. In the meantime I would really like to keep working on the code, but I don't see what else I could do.
Thanks again
04-30-2018 02:35 PM
Can you post your latest code? And can you confirm that the "speed improvement" you need is mainly that after finishing one cycle of ~125-140 msec of AO generation + AI and CI measurement, you want to get back around the loop and start the next cycle with less delay?
I'm surprised you didn't see some significant improvement once you moved the non-DAQ processing code over into a separate processing loop. That seemed likely to be the biggest contributor to loop overhead. That *was* one of the suggested changes you did, right?
If you didn't try that yet, then you really haven't yet determined that USB was a significant factor. I suspect you'll find that USB vs PCIe is a much smaller factor than leaving all the processing in the DAQ loop vs moving it to a parallel loop.
3 other overall thoughts:
1. It's possible your pulse width tasks ought to be period measurement tasks so that you count PMT pulses all the time rather than most of the time (see the sketch after this list).
2. As long as you do this stuff by cycling through a set of finite tasks repeatedly, you can speed things up just a bit by using the DAQmx Task state model. You should then call DAQmx Control Task with the "commit" action just before the main loop. (Do this for all the tasks.)
3. The ultimate in fast cycling time would be to set things up for continuous operation instead of a repeated sequence of finite operations. Separating the DAQ and the processing into separate loops will be the first big step toward making that feasible.
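Regarding #1, here's a hedged sketch of what a gap-free "period measurement" counting task could look like (Python nidaqmx stand-in). The device and terminal names are placeholders, and the two channel property names are my guess at how nidaqmx exposes the DAQmx CI.Period.Term and CI.CtrTimebaseSrc attributes -- double-check them against your driver version:

```python
# Count PMT pulses with no dead time: measure the "period" of the CO sample
# clock in units of ticks of the PMT signal, so every PMT edge between
# consecutive clock edges is counted. (A pulse-width task only counts while
# the gate is high, leaving dead time during the low portion.)
import nidaqmx
from nidaqmx.constants import AcquisitionType, TimeUnits

N_SAMPS = 1000   # samples per cycle (placeholder)

ci = nidaqmx.Task()
chan = ci.ci_channels.add_ci_period_chan(
    "Dev1/ctr2",
    min_val=2, max_val=1_000_000,   # expected counts per clock period (placeholder)
    units=TimeUnits.TICKS)
chan.ci_period_term = "/Dev1/Ctr0InternalOutput"   # assumed property name: CO pulse = the "gate"
chan.ci_ctr_timebase_src = "/Dev1/PFI0"            # assumed property name: PMT pulses = the "ticks"
ci.timing.cfg_implicit_timing(sample_mode=AcquisitionType.FINITE,
                              samps_per_chan=N_SAMPS)
```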
-Kevin P
05-02-2018 03:47 PM
Hey Kevin,
Yes, correct! I want to generate AO waveforms and read AI and CI with a cycle duration of around 140 ms and minimal delay between cycles.
You will see in the attached pictures that the calculation VI is now before the main loop, though this has not sped up the overall process. I measured the time these calculations take: it is around 60 ms, and they take place only once. So it's no surprise that there wasn't any improvement in my program's performance (the time delay between cycles).
Currently, one cycle takes around 300 ms, whereas without any delay (ideal case) it should be 140 ms (waveform generation time + rest time). After removing one of the counters (I have 2 counters), one cycle takes around 250 ms. I have measured the execution time of each "segment" of my probe separately, and it spends the majority of its time in the main iteration loop, where it reads out and writes down vectors of intensity measurements. So I suppose that measuring "all the time" could help. The other thing would be to check how it works with the new PCIe board. Switching to continuous-samples mode might also be an interesting solution.
Tomorrow I will make the final modifications, such as calling DAQmx Control Task with the commit action on each task.
Could I contact you afterwards?
Thanks a lot for helping me out.
05-02-2018 05:02 PM
1. Oddly, I'd suggest you try getting rid of the DAQmx Is Task Done query on the one counter task. That turns out to be a surprisingly expensive function call. See this thread for more info. I think I'd try getting rid of the DAQmx Wait Until Done calls as well because they might be similarly time-consuming.
2. Right now your error wire forces all the DAQmx calls to happen in a rigid sequence. You might see some improvement by making as much of that parallel as possible. The only critical sequencing need I see is that AO Start must happen after the AI Start and the 3 Ctr Starts and then both AO and CO Stops must not be called until after the AI Read and the 2 Ctr Reads.
3. The code you posted in msg #9 had parameters that would have called for 700k samples at 1 MHz, which should have required 700 msec for just the DAQ part. How sure are you that your config parameters really lead to an expectation of 140 msec for the DAQ?
4. Sending large quantities of data to indicators & graphs inside your main loop can also be a slow-down factor. I'd reduce the main DAQ loop to have just Start, Read, and Stop functions from DAQmx. I'd bundle the data from the Reads into a typedef'ed cluster and Enqueue it. A separate loop would dequeue that cluster, unbundle it into its parts, and then do whatever processing, file writing, or gui updates you need.
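In rough text form (Python stand-in for a LabVIEW queue of a typedef'ed cluster; the two helper functions are placeholders for your DAQmx calls and your processing/display code), that producer/consumer split looks like this:

```python
# Producer/consumer split: the DAQ loop only acquires and enqueues,
# the processing loop does all the slow work in parallel.
import queue
import threading

data_q = queue.Queue()

def acquire_one_cycle():
    """Placeholder for one cycle of DAQmx Start / Read / Stop calls."""
    return [0.0], [0], [0]          # AI data, CI #1 data, CI #2 data

def process_and_display(ai_data, ci1_data, ci2_data):
    """Placeholder for processing, file writing, and GUI updates."""
    pass

def daq_loop(num_cycles):
    """Producer: does nothing but acquire and hand the data off."""
    for _ in range(num_cycles):
        data_q.put(acquire_one_cycle())   # enqueueing is cheap
    data_q.put(None)                      # sentinel: acquisition finished

def processing_loop():
    """Consumer: all the slow work happens here, off the DAQ loop's back."""
    while True:
        item = data_q.get()
        if item is None:
            break
        process_and_display(*item)

threading.Thread(target=daq_loop, args=(100,)).start()
processing_loop()
```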
-Kevin P
05-03-2018 11:33 AM
Hey Kevin,
Here is my latest code: the main program with 2 counters and 1 analog input, plus the same program with only one CI and with just AI.
After the modifications you suggested, the AI-only program runs at a 150-170 ms cycle, so almost what we wanted. Though when we save data and scan at the same time, it slows down to a 200-250 ms cycle.
I can't figure out what the problem is with the start trigger when running the program with only one counter. Could you please take a look?
I still haven't moved the graphs out of the main loop, but I will do that soon.
Thank you