Counter/Timer

PXI-6602 buffered event counting with pretrigger samples

I have a PXI system with a PXI-6602 card that I am using to do buffered event counting. I configured a counter to generate sample clock pulses for the buffered event counting tasks. Using DMA for each channel (for up to 3 channels), I was able to achieve a bandwidth of about 3 MS/s. However, I modified the program so that it can acquire pretrigger samples. After doing this, I only get 500 kS/s bandwidth.
 
Since the 6602 does not appear to support a reference trigger for buffered event counting, I did the following:
 
I configured the event counter task to acquire samples continuously (into what is presumably a circular buffer), starting 1 s after the task is started (the 1 s delay is to avoid having DMA transfers occurring while other tasks are still being started). In addition to the sample clock counter, I configured a counter to generate a 5 s pulse after all of the post-trigger samples have been acquired; this counter is used as a pause trigger for the event counting tasks. Yet another counter is set to create a pulse after being triggered by the pause-trigger counter; this counter produces a done event to signal that the acquisition is complete and that data is ready to be retrieved. This method uses 3 counters (sample clock, pause trigger, done event generator) in addition to the counters used for counting events. Various properties had to be set manually, such as the buffer size, and I also had to set the task to allow overwriting of samples.
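 
Stripped down, the event counting task ends up configured roughly like the following (NI-DAQmx C API; counter and terminal names such as "Dev1/ctr0" are placeholders for my actual routing, the sample clock / pause trigger / done event counters are configured separately, and error checking is omitted):

/* Sketch of the continuous, pause-triggered event counting task described
   above.  Counter and terminal names are placeholders; error checking is
   omitted for brevity. */
#include <NIDAQmx.h>

static TaskHandle configureEventTask(void)
{
    TaskHandle evtTask = 0;

    DAQmxCreateTask("", &evtTask);

    /* Buffered edge counting on the signal of interest. */
    DAQmxCreateCICountEdgesChan(evtTask, "Dev1/ctr0", "",
                                DAQmx_Val_Rising, 0, DAQmx_Val_CountUp);

    /* Sample on the output of the separate sample clock counter,
       continuously, into a circular buffer whose size is set manually. */
    DAQmxCfgSampClkTiming(evtTask, "/Dev1/Ctr4InternalOutput", 3e6,
                          DAQmx_Val_Rising, DAQmx_Val_ContSamps, 1000);
    DAQmxCfgInputBuffer(evtTask, 5000000);
    DAQmxSetReadOverWrite(evtTask, DAQmx_Val_OverwriteUnreadSamps);

    /* The 5 s pulse generated after the post-trigger samples gates
       (pauses) the event counting task. */
    DAQmxSetPauseTrigType(evtTask, DAQmx_Val_DigLvl);
    DAQmxSetDigLvlPauseTrigSrc(evtTask, "/Dev1/Ctr5InternalOutput");
    DAQmxSetDigLvlPauseTrigWhen(evtTask, DAQmx_Val_High);

    DAQmxStartTask(evtTask);
    return evtTask;
}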
 
While the above method was somewhat awkward, it seemed to be the simplest way to implement a reference trigger for the 6602. Other than the lowered bandwidth, it worked flawlessly. I don't see why the bandwidth should be less than the 3 MS/s that I achieved for the finite acquisition, especially since I was able to achieve that rate even for very large buffers (5 MS).
 
I would like to know:
 
1) Why is the bandwidth being lowered?
2) Can I modify my program to avoid the bandwidth reduction?
3) Is there a simpler way to implement a reference trigger for buffered event counting?
 
Thanks for any help you can provide.
 
Jason
 
 
OS: Windows XP
 
compiler: MinGW C++ compiler
 
driver: NI-DAQmx
 
 
Message 1 of 10

1. I referred back to the datasheet to confirm what I suspected.  Sure enough, the transfer rate is very much faster in finite acquisition mode than in continuous mode.  I don't know the exact reason why, but it seems inherent in the board / driver.

2. So no, there probably isn't anything you can do to regain that bandwidth while in continuous mode.

3. Just thinking out loud now.   I'm assuming that you'd like to get back to finite acquisition mode for the bandwidth, but need a way to fake a reference trigger.  Let's stick with the basic idea of 3 buffered event counters and 1 separate counter generating a sample clock.  I suspect 1 more counter should be enough, but it isn't yet clear to me what other signals / timing info you have to work with.  What is your "reference trigger?"  Where does it come from?

If you start your 3 buffered edge tasks before starting your sample clock, you can be sure the sample indices are in sync.  The only little glitch is that the very first value measured will be the # of edges between starting the event task and starting the sample clock.  The tasks are started with software "Start" calls, so the 3 edge counters start in quick succession but not simultaneously.  Personally, I'd just subtract index 0 from each of the arrays of edge counts and treat the result as "# edges since initial sample clock."

Next, I'd configure 1 more counter to do a single unbuffered "two edge separation" measurement.  I'd use the sample clock counter output as the timebase to count # of samples between edge 1 and edge 2.  I'd also use the sample clock as the initiating 1st edge and the reference trigger as the 2nd edge.  Give or take an off-by-1 you need to be careful with, the resulting count value tells you the index into the edge count arrays where the reference trigger occurred.  Now you can post-process as desired with whatever quantity of pre-trigger and post-trigger samples are available.
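
Since you're working in C, the idea would look roughly like the following -- I normally live in LabVIEW, so double-check the exact calls, and the counter, sample clock terminal, and reference trigger terminal below are all placeholders:

/* Sketch: single unbuffered two-edge-separation measurement that counts
   sample clock ticks from a sample clock edge to the reference trigger.
   Terminal names are placeholders; verify the exact calls for your setup. */
#include <NIDAQmx.h>

static uInt32 measureTriggerIndex(void)
{
    TaskHandle sepTask = 0;
    uInt32 trigIndex = 0;   /* index into the edge-count arrays, +/- 1 */

    DAQmxCreateTask("", &sepTask);
    DAQmxCreateCITwoEdgeSepChan(sepTask, "Dev1/ctr6", "",
                                2.0, 100000000.0,     /* min/max, in ticks */
                                DAQmx_Val_Ticks,
                                DAQmx_Val_Rising,     /* first edge  */
                                DAQmx_Val_Rising,     /* second edge */
                                NULL);

    /* Count in units of the sample clock by using it as the timebase. */
    DAQmxSetCICtrTimebaseSrc(sepTask, "", "/Dev1/Ctr4InternalOutput");

    /* First edge: the sample clock itself.  Second edge: the reference
       trigger (assumed here to arrive on PFI0). */
    DAQmxSetCITwoEdgeSepFirstTerm(sepTask, "", "/Dev1/Ctr4InternalOutput");
    DAQmxSetCITwoEdgeSepSecondTerm(sepTask, "", "/Dev1/PFI0");

    DAQmxStartTask(sepTask);
    /* ... the acquisition runs; the reference trigger eventually fires ... */
    DAQmxReadCounterScalarU32(sepTask, -1.0, &trigIndex, NULL);
    DAQmxClearTask(sepTask);
    return trigIndex;
}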

One last tip: I'd make the sample clock be a very short pulse rather than a square wave.  Then I'd be careful to make the two-edge separation measurement pay attention to both edges: one edge would be used for the initiating 1st edge, while the *other* edge would act as the timebase.  This avoids a hardware race condition.  You'll also want to consider which edge is most appropriate for the sampling clock on the edge-counting tasks -- another one of those little things that can create an off-by-1 to deal with.
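
For example, the sample clock counter could be set up something like this (again just a sketch with placeholder values); the two-edge separation task could then use the rising edge as its initiating edge and count falling edges as its timebase (the CI.CtrTimebase.ActiveEdge property) so the two never race:

/* Sketch: generate the sample clock as a continuous train of short pulses
   instead of a square wave (times are placeholders, ~1 MHz shown). */
#include <NIDAQmx.h>

static TaskHandle startShortPulseSampleClock(void)
{
    TaskHandle clkTask = 0;

    DAQmxCreateTask("", &clkTask);
    DAQmxCreateCOPulseChanTime(clkTask, "Dev1/ctr4", "",
                               DAQmx_Val_Seconds, DAQmx_Val_Low,
                               0.0,       /* initial delay */
                               0.9e-6,    /* low time      */
                               0.1e-6);   /* high time: the short pulse */
    DAQmxCfgImplicitTiming(clkTask, DAQmx_Val_ContSamps, 1000);
    DAQmxStartTask(clkTask);
    return clkTask;
}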

-Kevin P.

Message 2 of 10

Kevin,

 

Thanks for your reply. To answer your question, my reference trigger is the start trigger from an analog input card (PXI-6132); the AI task waits for a signal to get above a certain level before starting.

I knew about the bandwidth differences for finite and continuous acquisitions, but didn't think this mattered, since my acquisition task uses a circular buffer and is constantly overwriting old values with the new ones. Also, the 500 kS/s is about 10x higher than the stated continuous acquisition bandwidth, while the 3 MS/s bandwidth that I get for finite acquisitions is comparable to what is stated in the datasheet.

Incidentally, when using my "reference triggered" buffered event counting task, I always get an error saying that I overwrote samples before they could be read (which is, of course, true). However, this doesn't prevent me from reading the buffer or getting the correct data. I set overwrite mode to DAQmx_Val_OverwriteUnreadSamps, but still get the error. Could it be that interrupts are being generated when samples are overwritten, thus reducing the bandwidth?

 

Jason

 

 

Message 3 of 10
You made a good point I didn't think through.   The benchmarking for continuous acquisition probably *does* imply a case where the app is making sure to pull every sample out of the DAQ buffer.  So it makes sense that the max rate sustainable for the long haul is much slower than a short finite acquisition "burst."  Your case is different though.  I agree that it should just cruise along overwriting the circular buffer until you happen to request the little bit of data you care about.  Dunno why there's such a large discrepancy.
 
That overwrite error sounds familiar.  I had a circular buffer app where I occasionally wanted to query the most recent N samples for a user display.  I also set the mode to allow overwriting and I also got that error when trying to read.  I *think* however that because of the error, I *didn't* get any data.
 
I had to manually set up read properties to read "relative to most recent sample" with an offset of (-N) in order to read N samples without an error.  That was back in DAQmx 7.x, and I would have thought that was a glitch that got fixed by now.  I haven't happened to explore that kind of thing lately.  Also, I'm no help for syntax as I've only used LabVIEW and I see you're programming in C++.
 
Maybe someone from NI can comment on the speed of finite acq. vs circularly buffered acq that's allowed to overwrite and lose data?
 
-Kevin P.
 
 
Message 4 of 10
The performance difference between finite and continuous acquisitions is unrelated to the overwriting of unread samples.  The reason for the performance difference has to do with how DMA is implemented by the hardware and the fact that the 6602 in effect has a two-sample-deep FIFO.  When you are acquiring data and reach the end of the buffer, the 6602 must reset the data transfer back to the beginning of the buffer.  This operation takes a fixed amount of time, during which data is backlogged in the device's FIFO.  However, since the FIFO is so shallow, it doesn't take much for an overflow to occur while this process is ongoing.  For a finite acquisition, no reset needs to take place and the sustained acquisition speed is much higher.  There is a knowledge base article describing an approach that has been known to increase the rate of sustained continuous acquisitions for some people.  I've never tried it, so I can't say how much it might help.  The tradeoff to using this approach is that it will require more CPU resources than the default options.

In terms of the buffer overwriting, I think the concept that is being missed is that the driver maintains virtual buffer positions, not absolute positions. For instance, say the buffer size is 1,000 and you've read 100 samples from the buffer. The current read position is now 100. However, 5,500 samples have been acquired in total before you try to read again. While data is still successfully being written to the buffer, the read call will error because you are trying to read beginning at sample 101 and only samples 4,501 - 5,500 are still in the buffer. To circumvent the error, you need to change the Relative To or Offset read property as appropriate so the read position references a valid sample that can be read. Hopefully this information helps and clarifies some things.
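
In Jason's C API terms, repositioning the read to grab, say, the most recent n samples looks roughly like this (sketch only; it assumes a running buffered counter input task with overwriting enabled, and error handling is omitted):

/* Sketch: reposition the read so it references samples that are still in
   the buffer, then read the most recent n samples. */
#include <NIDAQmx.h>

static int32 readMostRecent(TaskHandle task, uInt32 *counts, int32 n)
{
    int32 nRead = 0;

    DAQmxSetReadRelativeTo(task, DAQmx_Val_MostRecentSamp);
    DAQmxSetReadOffset(task, -n);
    DAQmxReadCounterU32(task, n, 10.0, counts, (uInt32)n, &nRead, NULL);
    return nRead;
}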
Message 5 of 10

Thanks for clarifying, reddog.  It helps a lot to learn more about why things work the way they do, such as the throughput drop for continuous acquisition.

Also, good point on emphasizing the distinction between the actual overwrite of a circular buffer and the later Read call that attempts to retrieve data from the buffer.  I've apparently had a wrong notion of what the "samples were overwritten before they could be read" error meant.  I took it as implying that the overwrite was the cause of the error rather than the read itself.  Apparently I never tried querying the task status before attempting the read -- with my faulty understanding, I'd have expected the status query to reveal an overwrite error.  Had I tried it, I wouldn't have gotten the expected error and should have been able to discover that it was my Read call that created the problem.  Anyway, your explanation definitely helped me make the mental connection about why the read properties need to be adjusted in order to avoid the error.

Small suggestion: if ever the error text is being updated, it'd be helpful to inject some #'s into it, along the lines of, "failed attempt to read at sample #101.  Earliest sample in buffer is #4500."

Couple followup questions while on the topic:  when setting Read properties for "RelativeTo" and "Offset" with a DAQmx Read property node, are they both persistent and independent?   If I change just one of them, say "Offset", will my prior value for the "RelativeTo" property still hold (and vice versa)?  Do they need to be set after the task has been started, or can they be configured prior to task start?   Once I perform a Read with a non-default setting for RelativeTo and Offset, what happens to the internal read mark?  Does it always move to point one sample later than what was just read or does it only get adjusted when the default settings for RelativeTo and Offset are both intact?

Sorry for so many questions -- I'd experiment a bit, but I won't be able to get on a LV PC for several days and am liable to forget by then.

-Kevin P.

 

Message 6 of 10
Reddog,
 
Thanks for your reply. It really helped clarify things. However, I still have two questions:
 
1) I was already setting RelativeTo to MostRecentSamp and setting the offset to -samples (the negative of the number of samples to read). This gives me the correct data, but also gives the error I mentioned. Are these the correct settings?
 
2) The PXI-6602 uses a MITE chip. Doesn't that chip have its own FIFO? It's hard to imagine the DMA transfer could work at all with only a 2-sample FIFO, except at a very, very slow rate. I found that, if I don't use a reference trigger (and thus avoid the issue you mentioned), I get 3 MS/s with one counter and 1 MS/s per counter (= 3 MS/s overall) with 3 counters. Also, the acquisition is mostly unaffected by sporadic activity such as moving the mouse or opening a folder during the acquisition, unless the activity is very heavy or continues for about 10 seconds or more. It is hard to imagine this kind of performance with only a 2-sample FIFO.
 
Jason
 
 
Message 7 of 10

1. Hmmm.  When I set RelativeTo = MostRecentSample and Offset = -(# to Read), I got data *without* an error.  I had also set the property allowing unread samples to be overwritten.  I posed a question earlier in the thread about persistence -- perhaps that's a factor?

  In my app, I created a kind of DAQmx task driver with an interface supporting different types of Read modes.  At the time, we were not fully decided on whether we would need to perform continuous streaming, take occasional snapshots of the recent past, or request the next several future samples.  So the code inside would set the RelativeTo and Offset property on every single call because in principle they could change from call to call.  I'm not sure that I ever investigated whether the property settings would have remained persistent over the course of many Reads, though it sure seems like they *should*.

2. Can't speak to the hw directly, but the 2-sample FIFO has been known as a bottleneck for years.  Here's hoping that we soon see a new multi-channel counter board.  The 660x series is from pre-Y2K, which is getting old for a DAQ board design...

-Kevin P.

Message 8 of 10

Kevin,

 

Thanks for the reply. I was already setting the appropriate property to allow overwriting of samples, but I still get the error. I do still get the correct data, so I can live with the error. However, the fact that I get an error makes me think that I'm doing something wrong. It is interesting that you don't get the error, since your case is very similar to mine.

 

Regarding the 6602 FIFO, I ran a finite acquisition, but set the sampling mode to continuous. I set the buffer size to 1,000,000. If I acquire 1,000,002 samples or fewer, everything is fine. If I go above 1,000,002 samples, I don't get any data. This seems to suggest that the 6602 really does have only a 2-sample FIFO, as reddog pointed out. It would still be helpful, though, if someone could comment on whether or not the MITE chip has any additional FIFO memory. Also, is the FIFO 2 samples per channel, or just 2 samples?
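
The test looked roughly like this (sketch; counter and terminal names are placeholders, and error handling is omitted):

/* Sketch of the FIFO test: continuous sample mode, a 1,000,000-sample
   buffer, and a single read of slightly more samples than the buffer holds.
   Reading 1,000,002 samples or fewer succeeds; any more returns no data. */
#include <NIDAQmx.h>

static uInt32 data[1000003];

static void fifoTest(void)
{
    TaskHandle task = 0;
    int32 nRead = 0;

    DAQmxCreateTask("", &task);
    DAQmxCreateCICountEdgesChan(task, "Dev1/ctr0", "",
                                DAQmx_Val_Rising, 0, DAQmx_Val_CountUp);
    DAQmxCfgSampClkTiming(task, "/Dev1/Ctr4InternalOutput", 3e6,
                          DAQmx_Val_Rising, DAQmx_Val_ContSamps, 1000);
    DAQmxCfgInputBuffer(task, 1000000);     /* buffer size = 1,000,000 */
    DAQmxStartTask(task);

    DAQmxReadCounterU32(task, 1000002, 10.0, data, 1000003, &nRead, NULL);
    DAQmxClearTask(task);
}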

 

Jason

 

 

Message 9 of 10
Kevin,

First, good suggestion on the error text.  I'll pass that along.  I've always found the error a bit cryptic as well, and I think your suggestion would help clarify things.  We've also considered throwing a warning instead of an error and auto-seeking the read position to the next available unread sample, but I'm not sure that's desirable in all situations or any less confusing.

Second, the "Relative To" and "Offset" properties are sticky.  They don't reset to the default values after each read, and you can set them before starting the task and while the task is running.  The current read position is always one sample later than the one just read.  If you set the "Relative To" property to most recent sample and "Offset" to -1000 and the most recent sample is 2500, the current read position will be 2501 once the read completes.  If you then change "Relative To" to current read position and "Offset" to 0, you will begin the next read where you left off at 2501.  If you leave the "Offset" property unchanged at -1000 and only read 1000 samples with each read call, you will read the same 1000 samples each time.  Hopefully this clarifies the behavior a bit.

Jason,

To be honest, I'm not sure why you're still getting a buffer overwrite error with the settings you described.  The only guess I can make is that your negative offset is larger than the buffer size, but that seems unlikely.  Post your code and perhaps someone will be able to give a better suggestion.

Yes, the 6602 does use a MITE chip for DMA.  While the MITE does have its own FIFO, I don't think it's going to help much in this situation.  When DMA is ongoing with counter measurements, you can think of it as three processes occurring in hardware.  The first process is effectively latching the counter data and storing it in the onboard device FIFO for that channel.  The second process is transferring data from the device FIFO to the MITE FIFO, and the third process is transferring data from the MITE FIFO to the buffer in host memory.

When the data transfer needs to wrap to the beginning of the host buffer, the data transfer through the MITE pauses momentarily while the MITE hardware resets itself to the beginning of the buffer.  While this happens, data accumulates in the device FIFO until it overflows.  How long it takes to overflow is generally a function of the acquisition rate, how long it takes the MITE to reset back to the beginning of the buffer, and how much traffic there is on the PCI bus.  Given this, it's not surprising you can acquire one channel at 3 MS/s or three channels at 1 MS/s.
Message 10 of 10