LabVIEW

laaaaarge files

Hi Group,

is there any tip for handling large data files?

I have to do calculations on spreadsheet-formatted files which are
up to 500 MB (16+ columns and lots of rows ...).
I currently read them in chunks of 60,000 data values per column, using
a for/while structure and feeding "mark after read" into the "start
of read offset" input, and it takes AGES to read the whole file ...

Is there any "trick" I can do to speed things up?

Christoph
0 Kudos
Message 1 of 3
(3,206 Views)
Judging by your post, it appears you are using either the "Read
characters from File" or the "Read from Spreadsheet file"
function. I did a rough test using the "Read chars from File"
function on 3 MB of a file on my company's crummy Compaq Celeron
and it took about 10 secs. At that rate, it would be about 27
minutes for 500 MB. I assume you're getting about the same
ballpark performance, if by "spreadsheet format" you mean
delimited text.

How are you using the data in your calculations? Does each row
have to be processed individually? Also, do you personally have
control over the data file format? If you do, depending on what
your data looks like, you'll likely reduce your file size and
processing time a great deal by using a binary storage format.

* Sent from RemarQ http://www.remarq.com The Internet's Discussion Network *
The fastest and easiest way to search and participate in Usenet - Free!
Message 2 of 3
> is there any tip for handling large data files?
>
> I have to do calculations on spreadsheet-formatted files which are
> up to 500 MB (16+ columns and lots of rows ...).
> I currently read them in chunks of 60,000 data values per column, using
> a for/while structure and feeding "mark after read" into the "start
> of read offset" input, and it takes AGES to read the whole file ...
>
> Is there any "trick" I can do to speed things up?
>

No trick, but from your post, I can't tell how you are reading the
information from disk. The Advanced file I/O functions aren't really
that advanced to use, and they give much more control over how much
of the file you read and how often the file is opened/closed. The
higher level functions like Read Characters From File are easier to
use, but you get less control, especially over how often the file
is opened and closed.

So, I'd suggest using File Open and reading the data in with the
File Read function. You can configure the open/close behavior to work
in a number of different ways, and this will let you leave the
file open, get better caching, and read only what you need.

As others pointed out, both reading and writing binary files
are quite a bit faster because the files tend to be smaller, and there is
less formatting of the data. If you can control both ends, then you
can change to a more efficient format for processing.

Greg McKaskle
Message 3 of 3