08-11-2011 06:35 AM
Hi Engineers,
I am looking for techniques or algorithms to compress sine-wave waveform data.
(I have already made some changes, such as converting from DBL to SGL format and to 16-bit integers.)
I can't effort the sample loss.
Thanks and Regards
Himanshu Goyal
08-11-2011 06:51 AM
Himanshu,
is this for an acquisition?
The best compression for a waveform is the mathematical function that "builds the waveform". So if you have a sine wave with 100 samples per period, the best way to compress it is f(x) = a * sin(x). Your 100 samples (most probably 8 bytes per sample, being double) are then reduced to the formula.
This is in fact compression in the frequency domain, so a "simple FFT".
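As a rough illustration of storing a formula instead of samples, here is a minimal Python sketch (Python is used only for illustration; the thread itself is about LabVIEW). It assumes the block holds exactly one period of a clean sine and projects it onto sin and cos, which is a single DFT bin. The names `fit_sine_one_period` and `rebuild` are made up for this example.

```python
import math

def fit_sine_one_period(samples):
    """Estimate amplitude and phase from exactly one period of samples."""
    n = len(samples)
    # Projections onto sin and cos at one cycle per block (one DFT bin).
    s = sum(x * math.sin(2 * math.pi * k / n) for k, x in enumerate(samples))
    c = sum(x * math.cos(2 * math.pi * k / n) for k, x in enumerate(samples))
    amplitude = 2.0 * math.hypot(s, c) / n
    phase = math.atan2(c, s)  # model: x[k] ~ amplitude * sin(2*pi*k/n + phase)
    return amplitude, phase

def rebuild(amplitude, phase, n):
    """Regenerate n samples of one period from the two stored parameters."""
    return [amplitude * math.sin(2 * math.pi * k / n + phase) for k in range(n)]

N = 100
original = [2.5 * math.sin(2 * math.pi * k / N + 0.3) for k in range(N)]
amplitude, phase = fit_sine_one_period(original)
restored = rebuild(amplitude, phase, N)
max_error = max(abs(a - b) for a, b in zip(original, restored))
print(amplitude, phase, max_error)
```

Here 100 doubles (800 bytes) shrink to two numbers, but only because the test signal is an ideal sine. Any noise or distortion in a real acquisition would not be reproduced by `rebuild`.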
Nevertheless, this contradicts your sentence: I can't effort the sample loss ("I can't effort loss of samples"?). This kind of compression always means a loss of data.
That being said, I am wondering why "lossless" is important to you. Is it because you need information about the noise?
hope this helps,
Norbert
08-11-2011 07:03 AM
Hi Norbert,
Thanks for your quick response.
I am acquiring the signals from the field, and the acquisition rate is 10 kS/s for 30 channels. Now I am looking to log the data.
If I log the complete data without any conversion, the file size is 6.xx GB for 1 hour. So I am looking for techniques that can help to compress and decompress the complete data.
Thanks and regards
Himanshu Goyal
08-11-2011 07:16 AM
Himanshu,
simple mathematics for binary files:
30 channels at 10 kS/s, presumably double => 30 * 10,000 * 8 bytes/s = 2,400,000 bytes/s (roughly 2 MB/s).
Running the application for 1 hour should result in roughly 2 MB/s * 60 * 60 = 7.2 GB.
So the file you are getting is about as condensed as raw storage of this data type gets without losing information.
Discarding information will reduce the space needed, but you have to consider which information you want to discard.
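The estimate above rounds 2.4 MB/s down to 2 MB/s. A small Python sketch of the same arithmetic without rounding, for the three sample sizes mentioned in this thread (the function name `bytes_per_hour` is made up for illustration):

```python
def bytes_per_hour(channels, samples_per_second, bytes_per_sample):
    """Raw data volume for one hour of continuous logging."""
    return channels * samples_per_second * bytes_per_sample * 3600

# 30 channels at 10 kS/s, stored as DBL (8 bytes), SGL (4 bytes) or I16 (2 bytes)
for name, size in (("DBL", 8), ("SGL", 4), ("I16", 2)):
    total = bytes_per_hour(30, 10_000, size)
    print(f"{name}: {total / 1e9:.2f} GB per hour")
```

With decimal gigabytes this gives 8.64 GB per hour for DBL, 4.32 GB for SGL and 2.16 GB for I16. File headers and any metadata are not included.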
Possible ways:
1) Convert all data to Single: You will lose precision, and values may be cut off if they are very large or very small. On the other hand, you cut the space in half (3.6GB/h).
2) Averaging: Calculate the average of several values. This is OK with good oversampling (>1000) and small numbers (<50), especially when the signal has a lot of noise. The space needed is cut down by the number of values you average over. Please note that you cannot use a moving average (which is in fact a simple filtering method), since it does not reduce the number of values.
3) Calculate a form-fit function for packages of the signal and store the parameters of that function: This gives the best compression, but loses nearly all information about the waveform and introduces uncertainties through the fitted function (increasing errors). In addition, there might be steps between packages, since the fitted functions will not join into a continuous function.
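To make the difference in option 2 concrete, here is a minimal Python sketch of block averaging next to a moving average. Both function names are made up for this example, and the input is just a ramp of numbers rather than real measurement data.

```python
def block_average(samples, block_size):
    """Replace each full block of block_size samples with its mean."""
    usable = len(samples) - len(samples) % block_size  # drop an incomplete tail
    return [
        sum(samples[i:i + block_size]) / block_size
        for i in range(0, usable, block_size)
    ]

def moving_average(samples, window):
    """Sliding mean: one output per window position, so almost no reduction."""
    return [
        sum(samples[i:i + window]) / window
        for i in range(len(samples) - window + 1)
    ]

data = [float(k) for k in range(1000)]
reduced = block_average(data, 10)
smoothed = moving_average(data, 10)
print(len(data), len(reduced), len(smoothed))  # 1000 100 991
```

Block averaging shrinks the data by the block size, while the moving average only smooths it. Note that averaging across a large part of a sine period would flatten the waveform itself.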
There are certainly more methods, but those are the most basic and common ones that I can think of.
hope this helps,
Norbert
08-11-2011 07:41 AM
Dear Norbert,
Thanks for your suggestion and support.
1. As I already mentioned in my first post, the data is already converted to SGL format (you cut down the space to 1/2 (3.6GB/h)). This works fine.
2. The data is a sine wave, so I think averaging will not help me.
3. I did not understand this one.
One more thing I want to discuss with you: I convert the data to SGL format, flatten it to string, and store this string data in a binary file.
In this binary file logging program I built logic so that whenever the file size reaches 2 GB, the program zips the file. After zipping, a 2 GB file is around 300-400 MB. The issue is that I have to generate reports based on the stored data, so I have to unzip the ZIP file first, and only then can I generate the reports. This complete process takes some minutes, which is not good from an application point of view. I am looking for a solution to avoid this.
So if you have any ideas or suggestions about this issue, please share them.
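One possible way around the unzip delay, sketched in Python under stated assumptions: compress the data in small independent blocks instead of zipping a whole 2 GB file, so a report only has to decompress the blocks it needs. This is not what the program above does. `zlib` stands in for the ZIP step, the blocks are kept in a list rather than in a file, and `compress_blocks` and `read_block` are hypothetical names.

```python
import struct
import zlib

def compress_blocks(values, block_len):
    """Pack values as little-endian float32 and compress each block separately."""
    blocks = []
    for start in range(0, len(values), block_len):
        chunk = values[start:start + block_len]
        raw = struct.pack(f"<{len(chunk)}f", *chunk)
        blocks.append(zlib.compress(raw))
    return blocks

def read_block(blocks, index):
    """Decompress a single block without touching the others."""
    raw = zlib.decompress(blocks[index])
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

values = [float(k % 50) for k in range(1000)]
blocks = compress_blocks(values, 250)
third = read_block(blocks, 2)        # samples 500..749 only
print(len(blocks), len(third))
```

In a real file, each compressed block would also need its length or offset stored so that a reader can seek to it. How well the blocks compress depends on the data.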
Thanks and Regards
Himanshu Goyal
08-11-2011 07:53 AM
@Himanshu Goyal wrote:
[...]One more thing I want to discuss with you: I convert the data to SGL format, flatten it to string, and store this string data in a binary file.[...]
Himanshu,
do not use flatten to string. This is
a) no longer a plain binary file (well, you'd save the binary U8 array representing the string, but this will possibly be more than 4 bytes per number)
b) additional manipulation of the data, possibly discarding precision.
Store the SGL values directly to file using "write to binary file". This is the most efficient way to store data (without compression).
The reader, of course, has to know the layout (datatype, 1D vs 2D array, ..) of the binary file in order to read it.
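For readers outside LabVIEW, the idea of a raw file of single-precision values can be sketched in Python. This is only an illustration of the layout, not of what "write to binary file" produces: the little-endian byte order, the file name, and the helpers `write_sgl` and `read_sgl` are all assumptions of this example.

```python
import os
import struct
import tempfile

def write_sgl(path, values):
    """Write values as consecutive 4-byte floats with no header."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}f", *values))

def read_sgl(path):
    """Read the file back; the reader must already know it holds float32."""
    with open(path, "rb") as f:
        raw = f.read()
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

values = [0.0, 0.5, -1.25, 3.0]
path = os.path.join(tempfile.mkdtemp(), "samples.bin")
write_sgl(path, values)
size = os.path.getsize(path)
restored = read_sgl(path)
print(size, restored)  # 16 bytes for 4 values
```

Nothing in the file says what it contains. Datatype, byte order and array shape have to be agreed on between writer and reader.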
Because your question is fairly common in test and measurement, you might be interested in the TDMS file format.
It is not as efficient as plain binary (but quite close), and it adds a ton of useful features.
hope this helps,
Norbert
08-12-2011 07:26 AM
Hopefully TDMS will solve your issue. If you need further size reduction, it appears you want in-line, lossless compression of your data, which TDMS does not currently support. You have several options:
08-12-2011 08:52 AM
Since you have 30 channels, I doubt your A/D converter has more than 16 bits of resolution. You can cut your data amount in half by using a 16-bit integer type for storing the data. You will not lose any data resolution by doing this, since your resolution is limited by the A/D converter. You can convert to a floating-point type after reading. The extra programming required will be minimal.
08-12-2011 10:10 AM
You might be able to implement a technique similar to what most of my scopes do, where the digitized samples are stored as I16 integers and the digitizer scale and offset factors are stored somewhere in the file header.
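A minimal Python sketch of that layout, assuming a made-up header of two float64 values (scale in volts per count, then offset in volts) followed by the I16 samples. Real scope formats differ; `HEADER`, `encode` and `decode` are names invented for this example, as is the ±10 V range.

```python
import struct

HEADER = "<dd"  # hypothetical header: scale (V per count), offset (V)

def encode(volts, scale, offset):
    """Quantize voltages to I16 counts and prepend the scale/offset header."""
    codes = [round((v - offset) / scale) for v in volts]
    codes = [max(-32768, min(32767, c)) for c in codes]  # clip to I16 range
    return struct.pack(HEADER, scale, offset) + struct.pack(f"<{len(codes)}h", *codes)

def decode(blob):
    """Read scale and offset from the header, then rescale the counts."""
    scale, offset = struct.unpack_from(HEADER, blob, 0)
    start = struct.calcsize(HEADER)
    n = (len(blob) - start) // 2
    codes = struct.unpack_from(f"<{n}h", blob, start)
    return [c * scale + offset for c in codes]

scale = 20.0 / 65536          # assumed +/-10 V range on a 16-bit digitizer
volts = [-10.0, 0.0, 2.5, 9.9]
blob = encode(volts, scale, 0.0)
restored = decode(blob)
print(len(blob), restored)
```

Each sample costs 2 bytes plus a fixed 16-byte header. The restored values differ from the originals by at most half a count, which is the digitizer's own resolution.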
The ADC turns your analog signal into a binary number based on the ADC’s resolution or number of bits. Tutorial http://zone.ni.com/devzone/cda/tut/p/id/4806 explains it far better than I can.
For example, an 8-bit ADC divides an analog signal into 256 steps; for a signal with a 10-V range, each step would be about 39 mV. Rather than storing the digitized signal as DBL or SGL, divide by the resolution of the digitizer and save as U8/I8 (also save the resolution so you can get the original signal back).
(The actual conversion will depend on your data and digitizer, e.g. a -5-V to +5-V signal could be stored as I8 with a 39-mV scale and 0-V offset, or as U8 with a 39-mV scale and -5-V offset; 0-V to 10-V could be I8 with a 39-mV scale and 5-V offset, or U8 with a 39-mV scale and 0-V offset.)
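The step size and the I8/U8 variants can be checked with a few lines of Python. This is only a sketch of the arithmetic; `step_size`, `to_code` and `to_volts` are illustrative names, and the exact step of 10 V / 256 is used instead of the rounded 39 mV.

```python
def step_size(voltage_range, bits):
    """Voltage per count for an ideal ADC."""
    return voltage_range / 2 ** bits

def to_code(volts, scale, offset):
    """Voltage to integer count."""
    return round((volts - offset) / scale)

def to_volts(code, scale, offset):
    """Integer count back to voltage."""
    return code * scale + offset

scale = step_size(10.0, 8)            # 0.0390625 V, about 39 mV

# -5 V on a -5..+5 V signal: I8 with offset 0 V, or U8 with offset -5 V
print(to_code(-5.0, scale, 0.0))      # -128
print(to_code(-5.0, scale, -5.0))     # 0

# 0 V on a 0..10 V signal: I8 with offset 5 V, or U8 with offset 0 V
print(to_code(0.0, scale, 5.0))       # -128
print(to_code(0.0, scale, 0.0))       # 0
```

The top of each range maps one count past the largest code (127 for I8, 255 for U8), so a real conversion would clip full-scale readings to that last code.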