09-05-2008 05:41 PM
Hi all:
I have a problem processing 30 GB~50 GB of data.
What we want to implement is to collect 30 GB~50 GB of data and save it to a txt file (is there a better file type?).
So far I have only tried to process 50 MB of data, and it already seems very slow.
So does anybody know whether it is possible to use LabVIEW to record and process such a large amount of data, or is there another good way to handle this problem?
Thanks a lot.
Regards
09-05-2008 05:51 PM
You should be able to do this if you are smart about it, but you don't want to open a 50 GB text file in LabVIEW. You should look into saving the data as binary files; OpenG also has some file-saving functions that are made for large files. If I were you, I would probably save the data in chunks as binary files, with a timestamp in the naming scheme for the files. Then you open up successive files and read the data in...
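To make the chunked-binary idea concrete, here is a minimal sketch in Python (standing in for LabVIEW G, which can't be shown as text). The chunk size, the float64 record format, and the file-naming scheme are all illustrative assumptions:

```python
import struct
import time

CHUNK_SAMPLES = 1_000_000  # samples per file; an assumed, tunable value

def write_chunk(samples, index):
    """Write one chunk of float64 samples to its own timestamped binary file."""
    # e.g. data_20080905-174100_0042.bin; timestamp plus counter keeps files ordered
    fname = time.strftime(f"data_%Y%m%d-%H%M%S_{index:04d}.bin")
    with open(fname, "wb") as f:
        f.write(struct.pack(f"<{len(samples)}d", *samples))
    return fname

# A reader can then walk the files in name order and process one chunk
# at a time, so the whole 30-50 GB data set is never in memory at once.
```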
Good Luck
09-06-2008 04:07 AM
hanwei said" I just tried to process 50M byte data, it seems very slow"
Coding style is very important in such cases.
For example no data copies, do not display data that are not necessary....
09-06-2008 09:10 AM
09-06-2008 09:46 AM
Yes, "these things must be done delicately" (Wicked Witch of the West, Wizard of Oz, when trying to get Dorthy out of her ruby slippers)
1) Text files should be avoided since they require multiple bytes to represent a single binary value. "That is bad" (Mr. Mackey, South Park)
2) Divide and conquer. Break the data set into multiple smaller parts (see the sketch after this list). A 32-bit OS only exposes about 2 GB of memory (half of the 4 GB address space available with 32 bits of addressing is reserved for the OS)
3) Newer versions of LV with 64-bit support will allow walking through larger files, but you still have to work with pieces and re-use your buffers as you go.
4) Work "in-place" to avoid excessive memory requirements. "Memory is like an attic; to put more stuff in it you eventually have to throw something away." (paraphrase of Sir Arthur Conan Doyle)
5) As you develop, watch your CPU and memory usage. When you get to a challenge, compose a specific Q and post back to this forum and let the gang help out. "There is wisdom in a multitude of counsellors." (Proverbs)
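As a rough illustration of points 2) through 4), in Python rather than G (the little-endian float64 record format and the 64 KB piece size are assumptions for the sketch), walking a big file while re-using one preallocated buffer might look like this:

```python
import struct

CHUNK_BYTES = 65_536  # read piece size; 64 KB is an assumed, tunable value

def mean_of_big_file(path):
    """Compute the mean of a huge file of little-endian float64 samples,
    touching only one small, re-used buffer at a time ("in-place" style)."""
    buf = bytearray(CHUNK_BYTES)   # preallocated once, re-used on every pass
    view = memoryview(buf)
    total, count = 0.0, 0
    with open(path, "rb") as f:
        while True:
            n = f.readinto(view)   # fills the same buffer; no new allocation
            if n == 0:
                break
            n -= n % 8             # keep only whole 8-byte float64 records
            for x in struct.unpack_from(f"<{n // 8}d", view, 0):
                total += x
                count += 1
    return total / count if count else 0.0
```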
Ben
PS The Wizard of Oz, South Park, Conan Doyle, and Proverbs all in the same response. Not bad, even if I say so myself.
09-06-2008 10:50 AM
09-08-2008 08:24 AM
We have told you to use binary files, but haven't given you much guidance. In LabVIEW, National Instruments gives you four options:
09-08-2008 01:49 PM
Thanks for your replies.
The processing I need to do is just average-value calculation, building arrays, waveform display, and recording (to a txt file).
Based on your and the other guys' suggestions, the solutions I want to try next are:
1. try to decrease the sampling time (to get down to 1~2 GB of data).
2. try to use binary files to record the data.
3. order a more powerful computer with lots of RAM.
Are there any other good ideas?
09-08-2008 02:08 PM
I went into some detail a few years ago on a similar question:
Re: Is anyone working with large datasets (>200M) in LabVIEW?
Particularly if you are just doing some simple statistics, having an understanding of this will help you. That said, there are built-in functions on the Point By Point palette that make this easier.
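For instance, LabVIEW's Point By Point palette includes VIs such as Mean PtByPt that maintain running statistics without ever holding the whole data set in memory. As a loose analogy only (a Python sketch, not the actual VI), a sliding-window mean looks like this:

```python
from collections import deque

class MeanPtByPt:
    """Streaming mean over a sliding window, loosely analogous to LabVIEW's
    Mean PtByPt VI (this Python class is just an illustration)."""

    def __init__(self, length=1000):
        self.window = deque(maxlen=length)  # like the VI's sample-length input
        self.total = 0.0

    def update(self, x):
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]    # oldest point is about to fall out
        self.window.append(x)
        self.total += x
        return self.total / len(self.window)
```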
Also, you mentioned waveform display. I suggest taking a look at this thread:
Need to improve speed when graphing large arrays
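The short version of that thread: a graph can only show about as many points as it has pixels, so decimate before plotting. Here is a hedged Python sketch (the target of roughly 1000 display points is an assumption) of min/max decimation, which keeps spikes visible where naive every-Nth-point decimation would drop them:

```python
def minmax_decimate(data, target_points=1000):
    """Reduce a large array for display by keeping the min and max of each
    bucket, so spikes stay visible in the plot (illustrative sketch only)."""
    if len(data) <= target_points:
        return list(data)
    bucket = max(1, len(data) * 2 // target_points)  # 2 outputs per bucket
    out = []
    for i in range(0, len(data), bucket):
        piece = data[i:i + bucket]
        out.append(min(piece))
        out.append(max(piece))
    return out
```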
Good luck!
Chris
09-09-2008 08:08 AM
Given your processing needs (averaging), it should not take lots of RAM or a fast computer to do what you want to do. The bottleneck will most likely be reading and writing to disk, so what you really want is a very fast disk or a RAID system. You should be able to achieve disk-limited speeds (anywhere from 10 MBytes/sec to 100+ MBytes/sec depending on your hardware).

Assuming you are using some flavor of Windows, as your "light reading" said, the speed-optimized chunk size to read off the disk is about 65,000 bytes. This is not a lot of data and will not require much RAM for processing.

I would recommend a multi-loop approach so your processor(s) can work most efficiently. Put the data read in one loop. Use a queue to pass this data to a second loop for processing. Use another queue to pass the processed data to a third loop for writing. Keep your read/write chunks in the 65,000 byte range (this may require buffering in a local shift register in the read/write loops - use preallocated arrays if you do this to avoid memory operations on every loop iteration). The LabVIEW execution system will optimize threading for the three loops - you don't have to.
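In LabVIEW this is three parallel while loops connected by queues. As a rough cross-language sketch (Python threads and queue.Queue standing in for the loops and queues; the file names, the byte-wise averaging, and the queue depths are assumptions), the structure is:

```python
import queue
import threading

CHUNK_BYTES = 65_000              # per the post: speed-optimized read size
raw_q = queue.Queue(maxsize=8)    # read loop -> process loop
out_q = queue.Queue(maxsize=8)    # process loop -> write loop

def read_loop(path):
    """Loop 1: read fixed-size chunks from disk and queue them."""
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_BYTES):
            raw_q.put(chunk)
    raw_q.put(None)               # sentinel: no more data

def process_loop():
    """Loop 2: do the (cheap) processing; here, an average per chunk."""
    while (chunk := raw_q.get()) is not None:
        avg = sum(chunk) / len(chunk)
        out_q.put(f"{avg}\n".encode())
    out_q.put(None)

def write_loop(path):
    """Loop 3: write processed results back to disk."""
    with open(path, "wb") as f:
        while (item := out_q.get()) is not None:
            f.write(item)

threads = [
    threading.Thread(target=read_loop, args=("bigdata.bin",)),    # assumed name
    threading.Thread(target=process_loop),
    threading.Thread(target=write_loop, args=("averages.txt",)),  # assumed name
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The bounded queues also give you flow control for free: if the processing loop falls behind, the read loop blocks instead of filling memory, which is the same back-pressure you would get from a size-limited LabVIEW queue.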
Good luck. If anything didn't make sense, let us know. I realize this is a lot of information to digest.