What is the most time efficient way to scan massive amounts of text data with LabVIEW?

Redstrat · ‎10-05-2009

I am currently running an application that scans data in a text files for outliers. After each file is scanned, statistics data stored in a database (if there are outliers); so at least the memory in the computer will not be eaten up. In order to scan lines of data without killing the computer, I put a 1 millisecond delay in the scanning loop. I have massive amounts of data in thousands of files to scan. Taking one milisecond per line of data is taking too much time. At this rate, it will take over a WEEK to scan all the data! Is there anything I can do to minimize the time per line scan? If anybody knows, I need a solution. If anybody thinks or knows there is NO solution, I need to hear that feedback too!

Joe_H · ‎10-05-2009

why dont you post what you already have tried and lets see if we can help improve. not knowing how you are doing things really hinders how much we can help.

Joe.
"NOTHING IS EVER EASY"

Mark_Yedinak · ‎10-05-2009

If you know teh format of your data it should be fairly easy to read your files in large chunks, say 20K at a time, pass that off to a processing task that reads through the data looking for the outliers and do whatever is necessary when you find one. I would avoid reading and processing your data one line at a time with a 1ms dealy between. Read it in chunks and process it as quickly as possible. Once you have process a chunk dispose of that data.

Mark Yedinak
Certified LabVIEW Architect
LabVIEW Champion

"Does anyone know where the love of God goes when the waves turn the minutes to hours?"
Wreck of the Edmund Fitzgerald - Gordon Lightfoot

Jarrod_S. · ‎10-05-2009

Also, a quick thing you can try is to put a 0 millisecond delay in the loop instead of 1. This does actually do something. It will yield the processor to other operations that are requesting to run, which keeps your system from locking up. However, it won't force the system to wait any specific amount of time, meaning you could speed up your processing.

I do agree, however, that processing the data in chunks is probably the best option.

Jarrod S.
National Instruments

Redstrat · ‎10-05-2009

Mark:

That gives me good food for thought. I'll try that and report back.

Message Edited by Redstrat on 10-05-2009 03:27 PM

Redstrat · ‎10-05-2009

Jarod:

0 milliseconds....Cool! I'll try that too!

Mark_Yedinak · ‎10-05-2009

I would use queue to pass the data to the processing task. You could put some intelligence in your file reading task to hold off reading a new file until the processing task has completely processing the data. However I suspect you should be able to process things fairly quickly. The suggestion to include a Wait 0 is a good one. You should always avoid writing repetitive tasks with no ability for the NI scheduler to perform a context switch. Though the various queue VIs allow the system to context switch if required.

Mark Yedinak
Certified LabVIEW Architect
LabVIEW Champion

"Does anyone know where the love of God goes when the waves turn the minutes to hours?"
Wreck of the Edmund Fitzgerald - Gordon Lightfoot

LabVIEW

What is the most time efficient way to scan massive amounts of text data with LabVIEW?

What is the most time efficient way to scan massive amounts of text data with LabVIEW?

Re: What is the most time efficient way to scan massive amounts of text data with LabVIEW?

Re: What is the most time efficient way to scan massive amounts of text data with LabVIEW?

Re: What is the most time efficient way to scan massive amounts of text data with LabVIEW?

Re: What is the most time efficient way to scan massive amounts of text data with LabVIEW?

Re: What is the most time efficient way to scan massive amounts of text data with LabVIEW?

Re: What is the most time efficient way to scan massive amounts of text data with LabVIEW?