08-02-2019 12:22 PM
It would be useful to know what kind of "statistics" is actually performed on the data. For many of them (mean, variance, etc.), the result can be obtained by maintaining only one (or very few) scalars per channel. For the mean, all you need to do is accumulate the sum, then divide by N.
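To illustrate the point, here is a minimal Python sketch (not the poster's LabVIEW code) of a running mean that keeps only two scalars per channel, the accumulated sum and the scan count, instead of the full history:

```python
class RunningMean:
    """Running per-channel mean using one accumulated sum per channel."""

    def __init__(self, n_channels):
        self.sums = [0.0] * n_channels  # one accumulated Sum(x) per channel
        self.count = 0                  # N: number of scans seen so far

    def add_scan(self, scan):
        """Accumulate one scan (one value per channel)."""
        for ch, x in enumerate(scan):
            self.sums[ch] += x
        self.count += 1

    def means(self):
        """Mean per channel: Sum(x) / N."""
        return [s / self.count for s in self.sums]


rm = RunningMean(2)
rm.add_scan([1.0, 10.0])
rm.add_scan([3.0, 20.0])
print(rm.means())  # -> [2.0, 15.0]
```

In LabVIEW terms, the equivalent is a small array of sums in a shift register rather than a growing 2D array of raw data.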
08-02-2019 01:44 PM
Thanks for the ideas, I will try them on Monday when I'm back.
As for the statistics required, I am still waiting for confirmation on this, but I think it is just the standard deviation across each channel; that result is then compared across all data samples, along with the mean and the mean of means. I already have code that performs these calculations well on a 16 x 64 dataset, so I was going to tweak that code to work with 32 x 64.
08-02-2019 01:50 PM
A fixed-size lossy queue?
Ben
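For readers unfamiliar with the term: a fixed-size lossy queue keeps only the most recent N elements, silently discarding the oldest as new ones arrive. A hedged Python sketch (Python's `collections.deque` with `maxlen` behaves exactly this way; in LabVIEW the analogue is a lossy queue or a rotated fixed-size array):

```python
from collections import deque

# Fixed-size "lossy" history: once full, each new scan pushes out the
# oldest one. The maxlen of 3 here is just for illustration.
history = deque(maxlen=3)

for scan in ([1], [2], [3], [4], [5]):
    history.append(scan)

print(list(history))  # -> [[3], [4], [5]]  (the two oldest scans were dropped)
```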
08-02-2019 02:01 PM
@Dave76 wrote: "As for the statistics required, I am still waiting for confirmation on this, but I think it is just the standard deviation across each channel; that result is then compared across all data samples, along with the mean and the mean of means. I already have code that performs these calculations well on a 16 x 64 dataset, so I was going to tweak that code to work with 32 x 64."
To get you started, here's an old example of how to do a mean and standard deviation by accumulating only a few scalar sums.
(For the mean and standard deviation, all you need is three values per channel: N, Sum(x), and Sum(x²). Your N is fixed if you only compute the statistics on a full set. In terms of resources, this takes only a very small percentage of what keeping all points around would. In your case, with several channels, you'll need only a very small 2D array.)
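The "few scalar sums" idea can be sketched in Python as follows (the channel and scan counts in the usage example are illustrative, not from the thread). Per channel, only N, Sum(x), and Sum(x²) are kept, and the mean and sample standard deviation fall out at the end:

```python
import math

def channel_stats(scans):
    """scans: list of scans, each a list with one value per channel.

    Returns (means, sample_sds), one entry per channel, computed from
    only N, Sum(x), and Sum(x^2) -- no raw data is retained.
    """
    n_ch = len(scans[0])
    n = 0
    sum_x = [0.0] * n_ch   # Sum(x) per channel
    sum_x2 = [0.0] * n_ch  # Sum(x^2) per channel: the tiny "2D array" of sums
    for scan in scans:
        n += 1
        for ch, x in enumerate(scan):
            sum_x[ch] += x
            sum_x2[ch] += x * x
    means = [s / n for s in sum_x]
    # sample SD: sqrt((Sum(x^2) - Sum(x)^2 / N) / (N - 1))
    sds = [math.sqrt((s2 - s * s / n) / (n - 1))
           for s, s2 in zip(sum_x, sum_x2)]
    return means, sds


means, sds = channel_stats([[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])
print(means)  # -> [2.0, 5.0]
print(sds)    # -> [1.0, 0.0]
```

In a LabVIEW loop the two sum arrays would live in shift registers, updated once per scan.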
08-02-2019 02:49 PM
Here you can see that keeping only a few sums (x, x²) allows calculating the same result with 64x less data in the shift register (in the case of 128 scans). A: keeping all data; B: keeping a few sums.
(This uses the "sample standard deviation". Modify as needed if you want the "population SD" instead.)
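For clarity, the only difference between the two conventions is the divisor: sample SD divides the sum of squared deviations by N − 1, population SD by N. A small sketch:

```python
import math

def stddev(xs, sample=True):
    """Standard deviation of xs; sample=True uses the N-1 divisor,
    sample=False uses the population (N) divisor."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    return math.sqrt(ss / (n - 1 if sample else n))


print(stddev([1.0, 2.0, 3.0]))                # -> 1.0 (sample SD)
print(stddev([1.0, 2.0, 3.0], sample=False))  # population SD, slightly smaller
```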
08-06-2019 04:00 PM
08-07-2019 02:17 AM
Thanks,
I will bear this in mind should I have issues with my existing code. I am still waiting for a fully documented spec that tells me which calculations are required, so I am reluctant to implement anything until I know for sure.
08-07-2019 10:34 AM
No problem. I am just putting it out there for the general readership. For larger histories (e.g. 1M+) the memory and resource savings would be gigantic. Similar algorithms exist for other statistical terms, of course. 🙂