How to write and process a large 600M-sample array?

Hello All,

 

I need help writing a large 1-D array of doubles (up to 600M elements) to binary files. I am trying to use the Producer/Consumer design pattern to do this. In the producer loop I acquire data from a single channel of the card, 100k samples per iteration. When I try to build an array, memory fills up at around 70M elements. Is it even possible to build an array this large?

I would like to write every 50M samples into a separate binary file, and I guess I have to do this in the consumer loop, but I can't figure out how.

 

The second problem is how to read this entire 600M-sample array back from the binary files and process it with an FFT.

 

I placed a Sine Wave block in place of the block responsible for acquiring data from the card, because I can't post the original program. I am using LabVIEW 2010 on Windows 7 32-bit.

 

 

I don't know whether my approach is applicable to such large data sets. Any help will be appreciated...

Regards

 

 

Message 1 of 11

Hi ortiga,

 

I explored your VI and there are a few things causing issues. First of all, you do not need to store the data in an array, since you are using a queue. The queue transfers the data and guarantees that it will all be read from the queue in the consumer loop in the correct order. So in the producer loop you just acquire data from the hardware and send it directly to the queue in chunks of a defined size (e.g. 100k samples).

The consumer loop then only dequeues the data and saves it to file, with some logic responsible for dividing the data into a series of files.

Attached you can find a brief example with the initial idea of how to send the data and save it to a file. Please note that saving 600M samples of double data will need about 4.5 GB on the hard drive (600M × 8 bytes). You also have to consider whether you are able to save the data to file as fast as you acquire it.
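Since the attached VI cannot be shown as text, here is a minimal Python sketch of the same idea (function names, chunk sizes, and the sine-wave stand-in for the hardware read are all illustrative, not part of the original example): a producer enqueues fixed-size chunks, and a consumer dequeues them in order and rolls over to a new binary file after a set number of chunks.

```python
import queue
import threading
import numpy as np

def producer(q, n_chunks, chunk_size):
    """Acquire data in fixed-size chunks (simulated here with a sine
    wave) and enqueue each chunk; None signals end of acquisition."""
    t = np.arange(chunk_size, dtype=np.float64)
    for i in range(n_chunks):
        q.put(np.sin(0.01 * (t + i * chunk_size)))  # stand-in for a HW read
    q.put(None)  # sentinel: acquisition finished

def consumer(q, path_prefix, chunks_per_file):
    """Dequeue chunks in order and start a new binary file every
    `chunks_per_file` chunks; returns the number of files written."""
    file_index, in_file = 0, 0
    f = open(f"{path_prefix}_000.bin", "wb")
    try:
        while True:
            chunk = q.get()
            if chunk is None:
                break
            if in_file == chunks_per_file:  # roll over to the next file
                f.close()
                file_index += 1
                in_file = 0
                f = open(f"{path_prefix}_{file_index:03d}.bin", "wb")
            chunk.tofile(f)
            in_file += 1
    finally:
        f.close()
    return file_index + 1
```

The queue decouples the acquisition rate from the disk rate, which is the whole point of the producer/consumer pattern: the producer never waits for the disk, and order is preserved.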

I hope this helps; please send me feedback or questions if you have any.

Best regards,

Martin

Message 2 of 11

Hi ortiga,

 

I forgot to write about opening the data and computing the FFT. You can try to open that big data set and call the FFT function on it directly, but you will need 64-bit Windows and at least 8 GB of RAM. Another solution is to decimate (or interpolate) the data before processing it. Processing such large amounts of data is usually done in DIAdem. Also, it is recommended to have a number of data points that is a power of two; the processing is then more efficient.
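As a rough Python sketch of the decimation idea (the function name and factor are illustrative): keep every n-th sample to shrink the working set, then run the FFT on the reduced data. With 600M samples, a factor of 64 would bring roughly 4.5 GB of doubles down to about 70 MB.

```python
import numpy as np

def decimate_and_fft(samples, factor):
    """Reduce the data size by keeping every `factor`-th sample, then
    compute the single-sided FFT magnitude. Note that plain decimation
    aliases; a low-pass filter before decimating would be safer."""
    reduced = np.asarray(samples, dtype=np.float64)[::factor]
    spectrum = np.fft.rfft(reduced)
    return np.abs(spectrum)
```

Decimating by `factor` also divides the effective sampling rate by `factor`, so the usable bandwidth of the result shrinks accordingly.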

Best regards,

Martin

Message 3 of 11

Hi Martin,

Thank you for the VI, it really helped me understand how queues work. I am trying to implement some logic responsible for dividing the data into a series of files, but I began to wonder whether it would be better to process the data at once in the consumer loop, without writing to and reading from disk. Could this solution be faster?

 

I want to acquire the data (100k samples per read) in Singleshot mode with a 50 MHz sampling rate for up to 1 minute. If my calculation is correct, I receive 60 × 50M = 3G samples. Can I process this large data set with an FFT in the consumer loop?

 

How do I make the FFT work correctly? The magnitude and phase waveforms disappear when the acquisition is finished.

Another VI is attached, which shows my approach. I also attach a picture of the Settings cluster of my acquisition card; maybe it helps you check whether my thinking is correct.

Thank you so much for any help,
Regards 

Message 4 of 11

Hi ortiga,

 

I think you need to take a few things into consideration. Do you really need to acquire continuously at that fast a rate? If you use a digitizer, which usually acquires 12-bit data, you will need about 100 MB/s of throughput (50 MHz × 2 bytes ≈ 95 MiB/s). Sustaining that over a long acquisition is very difficult. What hardware do you use? For PCI cards it is practically impossible to acquire at such rates, since the theoretical bandwidth is 133 MB/s. You will also need a disk solution (maybe an SSD or RAID) capable of saving the data that fast.
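The arithmetic behind these numbers can be written out as a couple of helper functions (a sketch; the function names are mine, and MiB/GiB here are what the post above loosely calls MB/GB):

```python
def required_rate_mib_s(sample_rate_hz, bytes_per_sample):
    """Sustained data rate the bus and disk must handle, in MiB/s."""
    return sample_rate_hz * bytes_per_sample / 2**20

def total_gib(sample_rate_hz, bytes_per_sample, seconds):
    """Total size of one acquisition, in GiB."""
    return sample_rate_hz * bytes_per_sample * seconds / 2**30

# 50 MHz at 2 bytes/sample is ~95 MiB/s sustained -- uncomfortably close
# to the ~133 MB/s theoretical PCI bandwidth, before any overhead.
rate = required_rate_mib_s(50e6, 2)
minute = total_gib(50e6, 2, 60)
```

The point of the comparison: even before disk speed enters the picture, the PCI bus itself leaves almost no headroom at this sampling rate.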

 

I guess that decimating (or acquiring the data at a lower rate) will be enough to process the data with an FFT. It is really difficult to process such big data, especially on a 32-bit system (you will run out of memory quickly).

 

In your code, I corrected a few things. First, the disappearing waveforms were caused by the timeout: adding another case structure handles the situation when a timeout occurs, so processing is done only when there is no timeout. The averaging looks OK, but I also added the First Call? primitive, which restarts the averaging when you rerun the VI (otherwise it would use data from the previous run of the VI).
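A text-based analogue of those two fixes might look like this in Python (a sketch, not the VI itself): the `try/except` on the timed dequeue plays the role of the case structure wired to the Dequeue Element "timed out?" output, and the `state is None` check plays the role of First Call? resetting the running average.

```python
import queue

def consumer_step(q, state, timeout_s=0.1):
    """One consumer-loop iteration: dequeue with a timeout and update a
    running average (avg, count) only when real data arrived."""
    try:
        value = q.get(timeout=timeout_s)
    except queue.Empty:
        return state  # timeout: skip processing, keep previous state
    if state is None:
        return (value, 1)  # "first call": restart the average
    avg, n = state
    return (avg + (value - avg) / (n + 1), n + 1)
```

Without the timeout branch, every dequeue timeout would push a default (empty) value into the processing chain, which is exactly what made the magnitude and phase displays go blank.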

 

Another problem is the data type. You obtain a queue with a 32-bit data type, but then you try to pass 64-bit data into it. I should warn you that in this case a coercion dot appears on Enqueue Element, which means the data is copied (adding more memory consumption). You have to stick strictly to one data type (based on the type returned by the hardware you use) to avoid unnecessary memory problems. In the attached program this mistake is corrected with proper typecasting; you should double-check the data type against your hardware and use it consistently.
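The same discipline can be illustrated in Python with NumPy (a sketch; `ACQ_DTYPE` and the function name are mine): fail loudly on a dtype mismatch instead of letting a silent conversion copy every chunk, which is the text equivalent of the coercion dot on Enqueue Element.

```python
import numpy as np

ACQ_DTYPE = np.float32  # must match what the hardware driver returns

def enqueue_chunk(q, chunk):
    """Enqueue a chunk, refusing silent dtype conversion: a mismatched
    dtype would force a full copy of every chunk at the queue boundary."""
    chunk = np.asarray(chunk)  # no copy when the dtype already matches
    if chunk.dtype != ACQ_DTYPE:
        raise TypeError(f"expected {ACQ_DTYPE}, got {chunk.dtype}")
    q.put(chunk)
```

Picking the dtype once, from the hardware driver's documented return type, and enforcing it at the queue boundary keeps the memory footprint at one copy of the data instead of two.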

 

Please tell me which hardware you use and briefly describe what you are trying to measure; with this knowledge I will be able to help you better.

 

Best regards,

Martin

Message 5 of 11

Hi Martin,

 

I think there is no problem with the data type, because the type returned by the card is a 32-bit real (6-digit precision). I had connected the wrong data type to the Obtain Queue block.

 

I am using a Spectrum M2i.2030 PCI/PCI-X card and a PC with 4 GB RAM running Windows 7 32-bit.

 

I need to rework the part of the program (figure 1) that reads the data from the card (Sampling Rate and Mem are set in the card's Settings cluster) and then processes it with an FFT in a single while loop. The 'READ FLOAT' block is responsible for reading the time signals with the parameters defined in the Settings cluster (figure 2). They are then processed by the FFT, and on the output we have clusters or waveforms containing the part of the spectrum where resonance occurs (FFT Mag, phase). This part of the spectrum is specified by the input parameters (Start Frequency, End Frequency), which are used to scale the waveforms.
This program is executed as a subVI in a for loop in a larger program, e.g. 40 times. I want to do the FFT much faster, because in the future I will need to run this subVI continuously for a specified time. The new version of the program using queues executes in about the same time as the original version. Could you help me make it execute faster?
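The per-iteration processing described above can be sketched in Python (names and signature are assumptions, not the actual VI): compute the FFT of one time signal and keep only the magnitude and phase inside the [Start Frequency, End Frequency] band.

```python
import numpy as np

def band_spectrum(samples, sample_rate, f_start, f_stop):
    """FFT magnitude and phase restricted to [f_start, f_stop] Hz,
    mirroring the Start/End Frequency inputs that scale the output
    waveforms in the original program."""
    spectrum = np.fft.rfft(np.asarray(samples, dtype=np.float64))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    sel = (freqs >= f_start) & (freqs <= f_stop)
    return freqs[sel], np.abs(spectrum[sel]), np.angle(spectrum[sel])
```

Slicing the band after one full-length FFT is usually cheaper than it looks, since the FFT itself dominates; the slicing just avoids storing and displaying bins nobody needs.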

When I started reworking this program I was advised to write all the time signals to a binary file, then read and process them, but I don't think that is a good idea. Please tell me if I am wrong.

I don't know whether I have explained what I have in mind, but I hope I will make it clear gradually. I attach screenshots of my original and new versions of the program (figure 3) and also the Settings cluster.

 

Thank you much Martin,


Regards

Message 6 of 11

Hi ortiga,

 

I still do not have information from you about what you are trying to measure (in your last post you described how you are going to measure it).

 

If you are able to measure the data and save it to a file, that can be a good idea. Continuous acquisition and processing of data this big is usually a big challenge; it may simply not be possible on a 32-bit operating system, since you have only 2 GB of RAM available for your application. The question is whether you are able to store the acquired data. You have to consider the sampling rate and the number of measured samples; then you will know the amount of data you will get and can decide accordingly. But it is true that if you need to save data to a file, the fastest option is binary files.

 

To be honest, I do not think that you need to process that big an amount of data; it would probably be enough to measure at a lower sampling rate, get less data, and process that. So please send me a description of what you want to measure and all information regarding the desired sampling rate and number of samples (acquisition time), so we can discuss it.

 

Best regards,

Martin

Message 7 of 11

Hi Martin,

 

I am trying to measure the response of a cantilever to excitation by sweep signals generated by a function generator. Each sweep signal is specified by Start Frequency, End Frequency, Amplitude, and Sweep Time, which defines how long the signal lasts (usually in the range of 1-5 ms). The function generator sends these short sweep signals and excites the cantilever continuously. The response to the excitation is proportional to the voltage from a piezoelectric transducer, which is exactly what I am trying to measure. So we have voltage values in the time domain, acquired on a single channel of the card. In the old program the sampling rate is set to 50 MHz and the number of samples to acquire at once (Singleshot mode) is 200k, in order to read a single 4 ms sweep signal from the card per execution of the 'READ FLOAT' block, i.e. per iteration of the while loop. For instance, when I want to acquire for 20 s I need 5000 of these 4 ms sweep signals (1G samples), so in the old program the while loop has to iterate 5000 times, and that takes 183 seconds (acquisition and FFT processing); the old_program_frontpanel picture shows the front panel after this measurement. I have to rework this program to make it faster.

 

The question is which solution is better: to acquire the data and process it at once, or to acquire it, write it to binary files, and then read it back and process it?

I hope this makes it clearer. I attach pictures of the old program.

Thank you Martin,

 

Regards

Message 8 of 11

Hi ortiga,

 

I think it would be better to save the data to a file and process it offline. But still, you are acquiring lots of data and you should consider the hardware's capabilities. The hardware is the big bottleneck here: running an FFT on 200k samples for 5000 data frames is really difficult to do quickly. According to my computations and a little benchmarking, you will need about 15 ms for each FFT of one 200k-sample chunk, which results in at least 75 seconds for the FFTs alone. In my opinion, you need to lower the number of samples or make the measurement shorter to reduce the duration.
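The benchmarking step is easy to reproduce yourself; here is a Python sketch of the idea (the function name is mine, and the absolute numbers will of course vary with CPU and FFT implementation, so no fixed timing is claimed):

```python
import time
import numpy as np

def benchmark_fft(n_samples, n_frames):
    """Rough per-frame FFT timing, to estimate total processing time
    for n_frames frames of n_samples each."""
    frame = np.random.default_rng(0).standard_normal(n_samples)
    start = time.perf_counter()
    for _ in range(n_frames):
        np.fft.rfft(frame)
    per_frame = (time.perf_counter() - start) / n_frames
    return per_frame, per_frame * n_frames
```

Running this with `n_samples=200_000` and `n_frames=5000` on the target machine tells you directly whether the FFT alone already exceeds your time budget, before any acquisition or disk I/O is added. The same back-of-the-envelope style covers the disk: 1G samples × 4 bytes over 20 s is about 200 MB/s sustained.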

 

Better hardware could help, e.g. you could acquire the data, save it, and process it later. But it will probably need an SSD, because the rate will be about 200 MB/s (1G samples at 4 bytes per sample is 4 GB in total, over a 20 s acquisition: 4 GB / 20 s).

 

Best regards,

Martin

Message 9 of 11

Hi Martin,

I can only use the current hardware, so any improvement of the old program is desired. An acquisition time of 1 minute is the maximum, so measurements will usually be shorter. I will also try to reduce the number of samples.

Could you help me rework the old program to acquire and process the data faster? Should I use queue operations to write the data in chunks? What is the optimal size of the chunk I should write to each file? Could you advise me how to write the data, e.g., every 200k samples? Should I use the Array Size block to control the number of acquired samples and then write to file?

 

I tried to process the data in the consumer loop, but when I wanted to process 1G samples, memory filled up; yet with the old program it is possible to process this much data in just a single while loop. Maybe I am not using queues properly?

 

Thank you for any help, Martin,
Regards

Message 10 of 11