09-05-2008 05:41 PM
Hi all:
I have a problem processing 30 GB~50 GB of data.
What we want to implement is to collect 30 GB~50 GB of data and save it to a txt file (is there a better file type?).
So far I have only tried to process 50 MB of data, and it already seems very slow.
So does anybody know whether it is possible to use LabVIEW to record and process such a large amount of data, or is there another good way to handle this problem?
Thanks a lot.
Regards
09-05-2008 05:51 PM
You should be able to do this if you are smart about it, but you don't want to open a 50 GB text file in LabVIEW. You should look into saving the data as binary files; OpenG also has some file-saving functions that are made for large files. If I were you, I would probably save the data in chunks as binary files, with a timestamp in the naming scheme for the files. Then you open up successive files and read the data in...
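To make the chunked-binary idea concrete, here is a minimal sketch in Python (standing in for LabVIEW G, which can't be shown as text). The chunk size, the float64 record format, and the file-naming scheme are all illustrative assumptions:

```python
import struct
import time

CHUNK_SAMPLES = 1_000_000  # samples per file; an assumed, tunable value

def write_chunk(samples, index):
    """Write one chunk of float64 samples to its own timestamped binary file."""
    # e.g. data_20080905-174100_0042.bin; timestamp plus counter keeps files ordered
    fname = time.strftime(f"data_%Y%m%d-%H%M%S_{index:04d}.bin")
    with open(fname, "wb") as f:
        f.write(struct.pack(f"<{len(samples)}d", *samples))
    return fname

# A reader can then walk the files in name order and process one chunk
# at a time, so the whole 30-50 GB data set is never in memory at once.
```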
Good Luck
09-06-2008 04:07 AM
hanwei said" I just tried to process 50M byte data, it seems very slow"
Coding style is very important in such cases.
For example no data copies, do not display data that are not necessary....
09-06-2008 09:10 AM
09-06-2008 09:46 AM
Yes, "these things must be done delicately" (Wicked Witch of the West, Wizard of Oz, when trying to get Dorthy out of her ruby slippers)
1) Text files should be avoided since they require multiple bytes to represent a single binary value. "That is bad" (Mr. Mackey, South Park)
2) Divide and conquer. Break the data set into multiple smaller parts (see the sketch after this list). A 32-bit OS only exposes about 2 GB of memory (half of the 4 GB address space available with 32 bits of addressing is reserved for the OS)
3) Newer versions of LV with 64-bit support will allow walking through larger files, but you still have to work with pieces and re-use your buffers as you go.
4) Work "in-place" to avoid excessive memory requirements. "Memory is like an attic; to put more stuff in it you eventually have to throw something away." (paraphrase of Sir Arthur Conan Doyle)
5) As you develop, watch your CPU and memory usage. When you get to a challenge, compose a specific Q and post back to this forum and let the gang help out. "There is wisdom in a multitude of counsellors." (Proverbs)
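As a rough illustration of points 2) through 4), in Python rather than G (the little-endian float64 record format and the 64 KB piece size are assumptions for the sketch), walking a big file while re-using one preallocated buffer might look like this:

```python
import struct

CHUNK_BYTES = 65_536  # read piece size; 64 KB is an assumed, tunable value

def mean_of_big_file(path):
    """Compute the mean of a huge file of little-endian float64 samples,
    touching only one small, re-used buffer at a time ("in-place" style)."""
    buf = bytearray(CHUNK_BYTES)   # preallocated once, re-used on every pass
    view = memoryview(buf)
    total, count = 0.0, 0
    with open(path, "rb") as f:
        while True:
            n = f.readinto(view)   # fills the same buffer; no new allocation
            if n == 0:
                break
            n -= n % 8             # keep only whole 8-byte float64 records
            for x in struct.unpack_from(f"<{n // 8}d", view, 0):
                total += x
                count += 1
    return total / count if count else 0.0
```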
Ben
PS The Wizard of Oz, South Park, Conan Doyle, and Proverbs all in the same response. Not bad, even if I say so myself.
09-06-2008 10:50 AM
09-08-2008 08:24 AM
We have told you to use binary files, but haven't given you much guidance. In LabVIEW, National Instruments gives you four options:
09-08-2008 01:49 PM
Thanks for your replies.
The processing I need to do is just average-value calculation, building arrays, waveform display, and recording (to a txt file).
Based on your and the other guys' suggestions, the solutions I want to try next are:
1. try to decrease the sampling time (to get down to 1~2 GB of data).
2. try to use binary files to record the data.
3. order a more powerful computer with lots of RAM.
Are there any other good ideas?
09-08-2008 02:08 PM
I went into some detail a few years ago on a similar question:
Re: Is anyone working with large datasets (>200M) in LabVIEW?
Particularly if you are just doing some simple statistics, having an understanding of this will help you. That said, there are built-in functions on the Point By Point palette that make this easier.
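For instance, LabVIEW's Point By Point palette includes VIs such as Mean PtByPt that maintain running statistics without ever holding the whole data set in memory. As a loose analogy only (a Python sketch, not the actual VI), a sliding-window mean looks like this:

```python
from collections import deque

class MeanPtByPt:
    """Streaming mean over a sliding window, loosely analogous to LabVIEW's
    Mean PtByPt VI (this Python class is just an illustration)."""

    def __init__(self, length=1000):
        self.window = deque(maxlen=length)  # like the VI's sample-length input
        self.total = 0.0

    def update(self, x):
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]    # oldest point is about to fall out
        self.window.append(x)
        self.total += x
        return self.total / len(self.window)
```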
Also, you mentioned waveform display. I suggest taking a look at this thread:
Need to improve speed when graphing large arrays
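The short version of that thread: a graph can only show about as many points as it has pixels, so decimate before plotting. Here is a hedged Python sketch (the target of roughly 1000 display points is an assumption) of min/max decimation, which keeps spikes visible where naive every-Nth-point decimation would drop them:

```python
def minmax_decimate(data, target_points=1000):
    """Reduce a large array for display by keeping the min and max of each
    bucket, so spikes stay visible in the plot (illustrative sketch only)."""
    if len(data) <= target_points:
        return list(data)
    bucket = max(1, len(data) * 2 // target_points)  # 2 outputs per bucket
    out = []
    for i in range(0, len(data), bucket):
        piece = data[i:i + bucket]
        out.append(min(piece))
        out.append(max(piece))
    return out
```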
Good luck!
Chris
09-09-2008 08:08 AM
Given your processing needs (averaging), it should not take lots of RAM or a fast computer to do what you want to do. The bottleneck will most likely be reading and writing to disk, so what you really want is a very fast disk or a RAID system. You should be able to achieve disk-limited speeds (anywhere from 10 MBytes/sec to 100+ MBytes/sec depending on your hardware).

Assuming you are using some flavor of Windows, as your "light reading" said, the speed-optimized chunk size to read off the disk is about 65,000 bytes. This is not a lot of data and will not require much RAM for processing.

I would recommend a multi-loop approach so your processor(s) can work most efficiently. Put the data read in one loop. Use a queue to pass this data to a second loop for processing. Use another queue to pass the processed data to a third loop for writing. Keep your read/write chunks in the 65,000 byte range (this may require buffering in a local shift register in the read/write loops - use preallocated arrays if you do this to avoid memory operations on every loop iteration). The LabVIEW execution system will optimize threading for the three loops - you don't have to.
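In LabVIEW this is three parallel while loops connected by queues. As a rough cross-language sketch (Python threads and queue.Queue standing in for the loops and queues; the file names, the byte-wise averaging, and the queue depths are assumptions), the structure is:

```python
import queue
import threading

CHUNK_BYTES = 65_000              # per the post: speed-optimized read size
raw_q = queue.Queue(maxsize=8)    # read loop -> process loop
out_q = queue.Queue(maxsize=8)    # process loop -> write loop

def read_loop(path):
    """Loop 1: read fixed-size chunks from disk and queue them."""
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_BYTES):
            raw_q.put(chunk)
    raw_q.put(None)               # sentinel: no more data

def process_loop():
    """Loop 2: do the (cheap) processing; here, an average per chunk."""
    while (chunk := raw_q.get()) is not None:
        avg = sum(chunk) / len(chunk)
        out_q.put(f"{avg}\n".encode())
    out_q.put(None)

def write_loop(path):
    """Loop 3: write processed results back to disk."""
    with open(path, "wb") as f:
        while (item := out_q.get()) is not None:
            f.write(item)

threads = [
    threading.Thread(target=read_loop, args=("bigdata.bin",)),    # assumed name
    threading.Thread(target=process_loop),
    threading.Thread(target=write_loop, args=("averages.txt",)),  # assumed name
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The bounded queues also give you flow control for free: if the processing loop falls behind, the read loop blocks instead of filling memory, which is the same back-pressure you would get from a size-limited LabVIEW queue.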
Good luck. If anything didn't make sense, let us know. I realize this is a lot of information to digest.