06-08-2009 02:43 PM
Hi,
I have a very large array that I need to create and later write to a TDMS file. The array has 45 million entries, or 4.5x10^7 data points, stored as doubles. The array is created by using a square pulse waveform generator and user-defined specifications of the delay, wait time, voltages, etc.
I'm not sure how to optimize the code so it doesn't take forever. It currently takes at least 40 minutes (and is still running) to create and write this array. I know there must be a better way: the array is large and consumes a lot of memory, but it's not absurdly large. The computer I'm running this on is running Windows Vista 32-bit with 4GB of RAM and an Intel Core 2 CPU @ 1.8GHz.
I've read the "Managing Large Data Sets in LabVIEW" article (http://zone.ni.com/devzone/cda/tut/p/id/3625), but I'm unsure how to apply the principles here. I believe the problem lies in making too many copies of the array, as creating and writing 1x10^6 values takes < 10 seconds, but writing 4x10^6 values, which should theoretically take < 40 seconds, takes minutes.
Is there a way to work with a reference of an array instead of a copy of an array?
Attached is my current VI, Generate_Square_Pulse_With_TDMS_Stream.VI, and its two dependencies, although I doubt they are bottlenecking the program.
Any advice will be very much appreciated.
Thanks
06-08-2009 06:22 PM - edited 06-08-2009 06:22 PM
06-08-2009 08:45 PM
One problem is that you are using Insert Into Array. Build Array is almost always a better and more logical choice than Insert Into Array, but even Build Array is not a good idea here. How many times does that while loop run?
Your master array grows on every iteration. LabVIEW needs a contiguous memory space to store an array. If the array grows larger than the space available where it currently sits, LabVIEW is forced to copy the entire array to a new location, and that takes time. Do it enough times and you may even run out of space, because there is no longer a contiguous block of memory large enough to hold it.
Your best bet is to calculate what the final size of the array will be. Initialize an array of that size at the beginning of the program, then use Replace Array Subset to fill in the values. Since you allocated the entire array at once, LabVIEW finds a large enough block up front and simply replaces elements without moving the array. If you have LV 8.6 (which it looks like you do), using the In Place Element structure will help guarantee the array stays in a single memory block.
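[Editor's note: LabVIEW is graphical, so here is a rough Python stand-in, purely illustrative, for the two allocation patterns described above. The function names are made up for the sketch; Python lists amortize append cost, which LabVIEW arrays do not, so treat this as an analogy for the copying behavior, not a benchmark.]

```python
def build_by_growing(n):
    # Analogous to Insert Into Array in a loop: in LabVIEW the buffer must
    # be reallocated and the whole array copied whenever it outgrows the
    # contiguous block it currently occupies.
    out = []
    for i in range(n):
        out.append(float(i))
    return out

def build_preallocated(n):
    # Analogous to Initialize Array once, then Replace Array Subset:
    # one contiguous allocation, elements written in place, no reallocation.
    out = [0.0] * n          # allocate the full block up front
    for i in range(n):
        out[i] = float(i)    # replace in place
    return out
```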
06-09-2009 10:45 AM
Thanks Ravens Fan, using Replace Array Subset and initializing the array beforehand sped up the process immensely. I can now generate an array of 45,000,000 doubles in about one second.
However, when I try to write all of that out to TDMS at the end LV runs out of memory and crashes. Is it possible to write out the data in blocks and make sure memory is freed up before writing out the next block? I can use a simple loop to write out the blocks, but I'm unsure how to verify that memory has been cleared before proceeding. Furthermore, is there a way to ensure that memory and all resources are freed up at the end of the waveform generation VI?
Attached is my new VI, and a refined TDMS write VI (I just disabled the file viewer at the end). Sorry it's a tad messy at the moment, but most of that mess comes from the arithmetic that determines which indices to pass to Replace Array Subset. I currently have the TDMS write disabled.
Just to clarify the above, I understand how to write out the data in blocks; my question is: how do I ensure that memory is freed up between subsequent writes, and how do I ensure that memory is freed up after execution of the VI?
@Jeff: I'm generating the waveform here, not reading it. I guess I'm not generating a "waveform" but rather a set of doubles. However, converting that into an actual waveform can come later.
Thanks for the replies!
06-09-2009 03:37 PM
Hello,
I have taken a look at your file and it seems that I can write 4.5 million samples no problem. The problem starts to arise when writing above 32 million samples. Up until this point, the write is very smooth. My computer specs are 1.86GHz dual core with 2GB of RAM. In order to get around this issue, I increased my virtual memory as is outlined here. Does increasing your virtual memory have any effect on your system?

Memory allocation and deallocation are handled automatically by LabVIEW. The general rule is that any reference opened must be closed, and you are doing exactly that when you open the TDMS file, write to it, and then close it. If you write the data in chunks, LabVIEW should handle allocation and deallocation automatically between loop iterations. There is a VI (Request Deallocation) that can be used to ask LabVIEW to release memory, but we do not recommend it; you will also sacrifice speed by using it.
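[Editor's note: the chunked-write pattern discussed here can be sketched in Python, with a plain binary file standing in for the TDMS file. `write_in_blocks` and the block size are hypothetical names and values for illustration; the point is that only one block's worth of doubles is ever resident, and the buffer can be reused between iterations.]

```python
import array

def write_in_blocks(path, total, block_size):
    # Generate and append one block of doubles at a time instead of
    # building the full array in memory first.
    written = 0
    with open(path, "wb") as f:
        while written < total:
            n = min(block_size, total - written)
            # Build just this block (here: a simple ramp as placeholder data).
            block = array.array("d", (float(written + i) for i in range(n)))
            block.tofile(f)      # append the block; buffer is reclaimed next pass
            written += n
    return written
```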
-Zach
06-10-2009 11:43 AM
Hi Zach H,
I increased my virtual memory allocation to 3GB as per the instructions in the link you provided, and this indeed solved my problem. I tested by writing an array of 45 million doubles to a TDMS file, as this should be the largest array that needs to be written, and it seemed to work fine. The waveform generation is fast and the file writing is much slower, which is expected, but I believe the TDMS write subVI is creating another copy of the data.
When I analyze my TDMS writing VI with the "Show Buffer Allocations" tool, it shows that a copy of the double array is created when I input the data ("Data in" in the provided Write_TDMS_File.VI). Seeing as the array is incredibly large I wouldn't be surprised if this is creating most of the delay when writing the file.
However, when I wire the subVI directly into the Generate... VI, the "Profile Performance and Memory" tool doesn't really show any gains. Is there some reason for this? I would expect the act of creating another copy to take some time, but apparently not.
Thanks again for the replies.
06-10-2009 12:57 PM
Hello,
As you stated before, the waveform generation is quite fast. Likewise, creating a copy will be as fast or faster, since you are only copying the values and not performing any other arithmetic in between. I'm not sure whether a copy of your data would still be made if you didn't use the subVI; that might be something to test. It may be showing that buffer because when the subVI runs, it is almost like running another program, so it allocates space for the same array within its own memory space. I'm not surprised that writing the data takes a while: if you look at the size of the TDMS file you generate, it is rather large for a file containing only generated numbers.
-Zach
06-10-2009 04:17 PM
Yea, the majority of the time is definitely spent writing to the file. The files come out to around 360MB.
Alright I think everything is solved here, thanks to everyone for the help!
06-10-2009 06:40 PM
Hi,
Sorry to bring this up again, but I'm wondering if anyone has any general ideas on how to handle converting the double array into the waveform data type. If I try to convert it before writing it out to TDMS it will of course soak up a ton of memory, and the TDMS file won't even get written. Is there some way to deallocate the memory from the initial array after it's converted to a waveform but before the write happens? I understand I can use the Request Deallocation VI mentioned earlier and it might help, but I'm wondering if there's a better solution to the problem.
If it helps, the purpose of all of this is so another program will eventually read in the TDMS files and then feed the generated waveforms to some PXI boards through the NI DAQmx drivers. I could convert the data to waveforms after the data is read in from the TDMS files, but when I'm reading the data and streaming it to the hardware I can't afford any real latency; it will already be hard enough to read in blocks of data every 750ms or so from ~100 of these files and output it before the next block has to be read in.
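[Editor's note: one way to avoid converting the whole 45M-sample array at once is to attach the waveform metadata per block, after each block is read back. A hedged Python sketch follows: `Waveform` and `read_waveform_block` are invented names, a plain binary file stands in for the TDMS file, and the t0/dt/Y layout merely mirrors LabVIEW's waveform type; wrapping the freshly read block adds metadata without copying the samples.]

```python
import array
from dataclasses import dataclass

@dataclass
class Waveform:
    # Stand-in for LabVIEW's waveform type: start time, sample
    # interval, and the sample data itself.
    t0: float
    dt: float
    y: array.array

def read_waveform_block(f, n, block_index, dt):
    # Read n doubles from the file's current position and wrap them
    # with timing metadata; the array is referenced, not copied.
    y = array.array("d")
    y.fromfile(f, n)
    return Waveform(t0=block_index * n * dt, dt=dt, y=y)
```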
Thanks again.
06-11-2009 12:28 AM
Djaunl wrote:
how do I ensure that memory is freed up after execution of the VI?
You can use the Request Deallocation function. See the attached pic for more details.
But as Zach mentioned earlier, you should avoid using this function to free up memory intermittently between your several TDMS writes, because it will prevent LabVIEW from reusing the already-allocated memory.