Acquiring and Processing Large Data Sets

SuperSnake428 · ‎07-21-2011

I have a question about how to handle and process large data sets with LabVIEW. We have a sensor and a custom-designed IC with 16 channels. Each channel is a 1-bit bitstream running at 10 MHz. A typical measurement consists of collecting 10 M samples, running an FFT, and extracting the values of a handful of tones. These tones are plotted over time as a quasi-spectrogram (I say quasi because we are interested in only a handful of frequencies, not the entire spectrum). This measurement is made on all 16 channels simultaneously and is run 8 times in total to complete one whole data set (128 sensors). We then repeat this procedure, collecting a "data set" over and over for 20-30 minutes. Ideally, we'd like the time between data sets to be dominated by the acquisition (10 MHz clock rate, 10 M samples => 1 second acquisition time, 8 times for a total of 8 seconds of data collection). However, right now we are highly constrained by moving this much data around and processing it. We have had to reduce the FFT size to below 1 M samples and run at a slower clock frequency (5 MHz).

I can give much more detailed information if desired, but what would be a best-practice approach for this type of application?

Our current solution uses a PCI-6289 card to collect the digital bitstreams. The data comes in as an array of 16-bit words and needs to be split into 16 one-bit streams. Converting each 16-bit input word to 16 individual binary values is currently one bottleneck. We then run this through a highly pipelined FFT algorithm (this is running on an Intel i7 with 12 GB of memory). We chop off a large portion of the spectrum after the FFT (we only need the spectrum from 1 kHz to 10 kHz; the rest is shaped quantization noise) and then do the rest of the signal processing.
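For scale, the numbers above can be worked through in a few lines. This is only a back-of-the-envelope sketch in Python (not LabVIEW). It assumes the raw data is one 16-bit word per sample instant, and it assumes each channel is expanded to 8-byte double-precision values before the FFT; the post does not say which datatype is actually used, so the second figure is an upper-bound illustration rather than a measurement.

```python
# Figures taken from the post
SAMPLE_RATE_HZ = 10_000_000      # 10 MHz bit clock
SAMPLES_PER_ACQ = 10_000_000     # 10 M samples per acquisition
CHANNELS = 16
ACQS_PER_DATA_SET = 8            # 8 x 16 channels = 128 sensors
BYTES_PER_WORD = 2               # one 16-bit word holds one bit from each channel

# Assumption (not stated in the post): each channel becomes an array of doubles
BYTES_PER_DBL = 8

acq_time_s = SAMPLES_PER_ACQ / SAMPLE_RATE_HZ            # time for one acquisition
data_set_time_s = acq_time_s * ACQS_PER_DATA_SET         # time for one data set
raw_bytes_per_acq = SAMPLES_PER_ACQ * BYTES_PER_WORD     # packed input size
dbl_bytes_per_acq = SAMPLES_PER_ACQ * CHANNELS * BYTES_PER_DBL  # expanded size

# FFT bin spacing is sample_rate / N; count the bins from 1 kHz to 10 kHz inclusive
bin_width_hz = SAMPLE_RATE_HZ / SAMPLES_PER_ACQ
bins_of_interest = int((10_000 - 1_000) / bin_width_hz) + 1

print(f"acquisition: {acq_time_s} s, data set: {data_set_time_s} s")
print(f"raw: {raw_bytes_per_acq / 1e6} MB, as doubles: {dbl_bytes_per_acq / 1e9} GB")
print(f"bin width: {bin_width_hz} Hz, bins kept: {bins_of_interest}")
```

Under these assumptions the packed input is small (20 MB per acquisition), while the expanded per-channel arrays are about 64 times larger, which is one plausible reason why moving the data around dominates the run time.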

Henrik_Volkers · ‎07-22-2011

Sounds like a job made for an FPGA card 😉

Are you already using the In Place Element Structure to avoid data copies?

Greetings from Germany
Henrik

LV since v3.1

“ground” is a convenient fantasy

'˙˙˙˙uıɐƃɐ lɐıp puɐ °06 ǝuoɥd ɹnoʎ uɹnʇ ǝsɐǝld 'ʎɹɐuıƃɐɯı sı pǝlɐıp ǝʌɐɥ noʎ ɹǝqɯnu ǝɥʇ'

altenbach · ‎07-22-2011

SuperSnake428 wrote:
The data comes in as an array of 16-bit words and needs to be split into 16 one-bit streams. Converting each 16-bit input word to 16 individual binary values is currently one bottleneck.

Can you give an example of a typical input and the desired output? It seems like a 16-bit lookup table is all you need.

SuperSnake428 wrote:
I say quasi because we are interested in only a handful of frequencies, not the entire spectrum.

For a handful of known frequencies, it might be more efficient to get them directly using a DFT, without even calculating the other million frequencies in the first place. You could even store the reference waves in a lookup table. Have you tried that? Do you also need the phase, or only the magnitude?
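A minimal sketch of that idea, in Python rather than LabVIEW: each tone is obtained by correlating the samples with a precomputed cosine and sine reference (a single-bin DFT), so no other frequencies are ever computed. The function names `make_reference` and `tone_magnitude`, the sample rate, and the test signal are all hypothetical and chosen only for illustration.

```python
import math

def make_reference(n, sample_rate_hz, freq_hz):
    """Precompute cosine and sine reference waves for one tone (the lookup table)."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    cos_tab = [math.cos(w * i) for i in range(n)]
    sin_tab = [math.sin(w * i) for i in range(n)]
    return cos_tab, sin_tab

def tone_magnitude(samples, reference):
    """Amplitude of one tone via a single-bin DFT (correlation with the references)."""
    cos_tab, sin_tab = reference
    re = sum(x * c for x, c in zip(samples, cos_tab))
    im = sum(x * s for x, s in zip(samples, sin_tab))
    # The factor 2/N turns the bin value into a sine amplitude
    # (valid for tones away from DC and the Nyquist frequency).
    return 2.0 * math.hypot(re, im) / len(samples)

# Illustrative signal: 0.5 amplitude at 1 kHz plus 0.25 amplitude at 3 kHz
fs = 100_000.0
n = 1000
signal = [0.5 * math.sin(2 * math.pi * 1000 * i / fs)
          + 0.25 * math.cos(2 * math.pi * 3000 * i / fs) for i in range(n)]

tones_hz = [1000.0, 2000.0, 3000.0]
references = {f: make_reference(n, fs, f) for f in tones_hz}  # built once, reused
mags = {f: tone_magnitude(signal, references[f]) for f in tones_hz}
print(mags)
```

The cost is one pass over the data per tone, so this only pays off when the number of tones is small compared with log2(N) of a full FFT. The reference tables can be reused for every channel and every acquisition of the same length.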

LabVIEW Champion.

SuperSnake428 · ‎07-22-2011

Hi Henrik,

This was actually our proposed solution as well, and one that we are currently designing. We have a daughter card with an FPGA. The FPGA has 16 CIC filters and 16 FIR filters that decimate the data from 1 bit @ 10 MHz to 6 bit @ 156 kHz. With the dedicated hardware, this should be much more manageable on the signal processing and acquisition side. I was just hoping to get away with the flexibility of doing this all in LV.
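For readers unfamiliar with CIC decimators, here is a rough software model in Python. It is a sketch only: the decimation ratio of 64 is an assumption (10 MHz / 64 = 156.25 kHz, which is consistent with the quoted 156 kHz), the number of stages is a free parameter because the post does not give it, and the FIR stage and the output word-length scaling are left out. The name `cic_decimate` is hypothetical.

```python
def cic_decimate(bits, ratio, stages):
    """Model of an N-stage CIC decimator with a differential delay of 1.

    Integrators run at the input rate; comb (difference) sections run at the
    decimated output rate. Python integers do not overflow, whereas an FPGA
    implementation would rely on wrap-around arithmetic with a fixed width.
    """
    integrators = [0] * stages
    comb_prev = [0] * stages
    out = []
    for i, b in enumerate(bits):
        acc = b
        for s in range(stages):          # cascade of integrators
            integrators[s] += acc
            acc = integrators[s]
        if (i + 1) % ratio == 0:         # keep one sample in every `ratio`
            y = acc
            for s in range(stages):      # cascade of combs: y[n] - y[n-1]
                y, comb_prev[s] = y - comb_prev[s], y
            out.append(y)
    return out

# With one stage the filter is a block sum over `ratio` input bits
one_stage = cic_decimate([1, 0, 1, 1, 0, 0, 1, 0], ratio=4, stages=1)

# A constant input of ones settles to the DC gain, ratio ** stages
two_stage = cic_decimate([1] * 16, ratio=4, stages=2)

# Assumed ratio from the post's rates: 10 MHz input, about 156 kHz output
output_rate_hz = 10_000_000 / 64
print(one_stage, two_stage, output_rate_hz)
```

The appeal for this application is that the integrators need only adders, so the 10 MHz part of the work stays cheap and everything downstream runs at the reduced rate.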

Best,

Drew

SuperSnake428 · ‎07-22-2011

Hi Altenbach,

Sure, an example data set would be something like this:

We have a 16-bit word, where the MSB is Ch 16 and the LSB is Ch 1. We have an array of 10 M of these words. We then need to convert this to 16 arrays of 1 bit x 10 M. I'm not sure how a lookup table helps here. I'd be happy to share the small snippet of code that currently does this if my explanation isn't the clearest.
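The conversion described above can be sketched in Python (not LabVIEW) in two ways. The first uses a shift and mask per channel. The second is one possible reading of the lookup-table suggestion: it uses a 256-entry table applied to each byte rather than a single 65536-entry table, which is a substitution made here to keep the example small. Both write into preallocated output buffers, and storing each bit as one byte is an assumption, since the thread has not settled on an output datatype. The function names are hypothetical.

```python
def unpack_shift_mask(words, n_channels=16):
    """Split packed words into per-channel arrays; index 0 is Ch 1 (the LSB)."""
    n = len(words)
    channels = [bytearray(n) for _ in range(n_channels)]  # preallocated outputs
    for i, w in enumerate(words):
        for ch in range(n_channels):
            channels[ch][i] = (w >> ch) & 1
    return channels

# Lookup table: the 8 bits of every possible byte value, LSB first
BYTE_BITS = [tuple((v >> b) & 1 for b in range(8)) for v in range(256)]

def unpack_lookup(words):
    """Same result for 16-bit words, using two table lookups per word."""
    n = len(words)
    channels = [bytearray(n) for _ in range(16)]
    for i, w in enumerate(words):
        low = BYTE_BITS[w & 0xFF]           # Ch 1 to Ch 8
        high = BYTE_BITS[(w >> 8) & 0xFF]   # Ch 9 to Ch 16
        for ch in range(8):
            channels[ch][i] = low[ch]
            channels[ch + 8][i] = high[ch]
    return channels

words = [0x0001, 0x8000, 0xFFFF, 0x0000, 0x00A5]
by_shift = unpack_shift_mask(words)
by_table = unpack_lookup(words)
print(list(by_shift[0]), list(by_shift[15]))
```

In an array language the same operation is usually done one channel at a time over the whole array (mask the entire word array with a single bit), which avoids the per-sample inner loop.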

In regard to the DFT, that is something we have considered in the past. We currently use the Subset FFT block and only compute the values in the 1 kHz to 10 kHz range. This actually doesn't save as much as we'd hoped, though. We are only interested in the magnitude of the response. It is very difficult to keep track of the phase in a multichannel system like this.

Best,

Drew

dsb@NI · ‎07-22-2011

Data acquisition → Bit manipulation → FFT → Processing

As for the FFT processing, what you are requesting seems achievable. On my dual-core, 32-bit computer, which has a modest 2 GB of RAM, SVPO Power Spectrum (1 Ch DBL WDT).vi computes the full-bandwidth power spectrum of a 10 M sample waveform (of DBL precision values) in ~1750 ms. Using the SVFA Power Spectrum Subset.vi with the frequency range set from 0 to 10 kHz, the same 10 M sample waveform can be processed in ~650 ms. That is only for one channel. Typically, the FFT speed scales with processor speed, and it sounds like your computer also has the memory to process several channels in parallel. After data acquisition and bit manipulation, how much of the time budget is left for the FFT? Where are the performance bottlenecks in your code? Could you post a snippet of code to help me understand the bit manipulation better? I do not yet understand the steps in the signal processing.

Doug
Enthusiast for LabVIEW, DAQmx, and Sound and Vibration

SuperSnake428 · ‎07-22-2011

Hi Doug,

We have tried several permutations of the signal processing ranging from processing the data after each acquisition to acquiring all of the data and then processing it.

I have a much more extensive list, but here are some run times (all for 4 M samples running at 5 MHz):

Acquisition Only (no signal processing) - 3.2s

Full FFT - 16.37s

CIC filter inside of the acquire loop - 17.89s

CIC filter after acquiring all of the data - 14.5s

Subset FFT without CIC filter - 12.9s

In terms of budget, ideally we would like the signal processing to take no time! 😉 Realistically, if it were cut down to 1-2 seconds, that would be acceptable for the application. The data here is at 4 M samples because at 5 M samples several of the tests never complete and LabVIEW locks up. Ideally, this would also be scaled back up to 10 M samples running at 10 MHz.

I currently do not pre-allocate the memory or the arrays. I think that I looked into this in the past, but I can't remember why I didn't end up doing it.

altenbach · ‎07-22-2011

Also, if your computer has multiple cores, have a look at the NI Labs high performance libraries. They contain some parallelized FFT primitives. 😉

LabVIEW Champion.

SuperSnake428 · ‎07-22-2011

Ahh, I wasn't aware of this. We have written our own FFT algorithm for multicore machines in the past that performs pretty well. I'll definitely give this one a try though.

Thanks!

altenbach · ‎07-22-2011

SuperSnake428 wrote:
Sure, an example data set would be something like this:

We have a 16-bit word, where the MSB is Ch 16 and the LSB is Ch 1. We have an array of 10 M of these words. We then need to convert this to 16 arrays of 1 bit x 10 M. I'd be happy to share the small snippet of code that currently does this if my explanation isn't the clearest.

LabVIEW does not have a 1-bit datatype. What datatype are the output arrays? Are the elements U8, with values of either 0 or 1? Does this go directly to the FFT?

A small code snippet and some sample data would clearly help to clarify things.

LabVIEW Champion.
