LabVIEW

Ways to speed up file reads of mixed data type tables?

I am working on a new chunk-based file format for logging time series (for example, when logging small and varying groups of data frequently to the same file, where we need fast search and read times), and I have run into a performance bottleneck: reading rows of clusters is 10x slower than reading rows of doubles. I expected it to be a bit slower, but not that much.

 

Here is a simplified test example:

 

[Attached image: cluster vs pure array.png — benchmark comparing cluster-based reads with pure DBL array reads]

 

The time stamps need to be a double (at least), so if I am to support other types for the rest of the channels in a group, I have to mix the data types. I can do that in different ways (separating the writes/reads of each data type, for example), but the fastest option seems to be a cluster consisting of a time stamp followed by an array of the other channel data (I am not aiming to support multiple data types within the same channel group). Unfortunately, even with that method the reads are slowed down by a factor of 10.
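To make the layout concrete, here is a minimal Python sketch of the record structure described above (the actual code is LabVIEW, so this is purely illustrative; the `<d3f` record layout, the three-channel group, and the helper names are assumptions): one DBL timestamp followed by SGL channel values per record, written contiguously and read back in a single pass over the raw bytes.

```python
import io
import struct

# Hypothetical record layout: one DBL timestamp + three SGL channel values.
REC = struct.Struct("<d3f")  # 8 + 3*4 = 20 bytes per record

def write_chunk(buf, records):
    # Append each (timestamp, values) pair as one fixed-size record.
    for t, vals in records:
        buf.write(REC.pack(t, *vals))

def read_chunk(data):
    # One pass over the raw bytes; fixed stride means the reader can
    # also seek straight to record i at offset i * REC.size.
    return [(r[0], r[1:]) for r in REC.iter_unpack(data)]

buf = io.BytesIO()
write_chunk(buf, [(0.0, (1.0, 2.0, 3.0)), (1.0, (4.0, 5.0, 6.0))])
rows = read_chunk(buf.getvalue())
```

The fixed record stride is what keeps searches fast; the LabVIEW slowdown presumably comes from unflattening each record into a cluster rather than from the file I/O itself.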

 

I'm aiming at a read speed of about 100-500 ms per month of 1-second data, for example; that's 2,592,000 samples per channel searched and extracted within that time. (With the type of logging we are doing, a TDMS file is about 1000x (!) slower due to fragmentation.) The goal is simple to achieve with pure DBLs, but if I use the cluster approach to save the channels as SGLs instead (to save some disk space), the time rises above 1 second.
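A quick back-of-envelope check on those targets (the figures are taken from the post; the 100 ms case is used for the rate calculation):

```python
# One month of 1 Hz data, per channel.
samples = 30 * 24 * 60 * 60        # 2,592,000 samples
bytes_dbl = samples * 8            # DBL channel: ~20.7 MB
bytes_sgl = samples * 4            # SGL channel: ~10.4 MB
mb_per_s = bytes_dbl / 1e6 / 0.1   # sustained rate needed to read the
                                   # DBL channel in 100 ms (~207 MB/s)
```

So the 100 ms target for DBLs already demands roughly 200 MB/s of effective throughput per channel, which is why any per-record decoding overhead on top of the raw read shows up so clearly.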

Now, I do have a way around this already: I can save the time as SGL if I make it relative to a start time and keep the maximum time span per chunk small enough for the resolution to stay OK. That brings the above-mentioned read time down to 100 ms again. But perhaps there are other ways to attack it?
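The resolution trade-off behind the relative-SGL trick can be estimated as follows (a sketch; the helper name and the 1 ms example are assumptions, not from the post). An SGL (IEEE-754 single) has a 24-bit significand, so a relative time of t seconds can only resolve steps of roughly t * 2^-24:

```python
# SGL (IEEE-754 single) carries a 24-bit significand, so a relative
# time of t seconds resolves steps down to roughly t * 2**-24.
def max_chunk_span(resolution_s):
    """Largest relative time (s) at which an SGL timestamp still
    distinguishes steps of resolution_s (hypothetical helper)."""
    return resolution_s * 2 ** 24

span_for_ms = max_chunk_span(1e-3)  # ~16,777 s, about 4.7 hours
span_for_s = max_chunk_span(1.0)    # ~16.7 million s for whole seconds
```

So for 1 ms timestamp resolution a chunk can span a few hours before SGL precision degrades, which fits the "keep the time span per chunk small" approach.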

 

(The reason I'm not using TDMS is, as mentioned, its problem with fragmentation; and using a database is tricky (not many options, and those that exist are costly), as we also need to support (store locally on) a variety of real-time targets (where the read times get even worse, yes...).)

Message 1 of 3

Dear Mads, 

I recommend you have a look at the following LabVIEW library.

http://www.ni.com/example/30348/en/

Otherwise, writing binary files is the right way to go here. If you want to speed things up, you are on the right track in making the writes and reads as precise as possible. Every extra unnecessary bit adds time.

Best regards

/Lars

Message 2 of 3

Thanks for the tip, I had not seen those before. We have to support non-Windows targets as well though, so I have to stick with the cross-platform file functions.

 

I have implemented a solution that uses a relative time stamp when storing SGLs instead of DBLs. It works nicely, with very fast read times. I guess we will have to use the slower cluster approach when it comes to data types with fewer bits available.

Message 3 of 3