LabVIEW


Looking for faster strategies - 1D smoothing of 2D array

Solved!

I've seen it but never used it. Since this part only accounts for a few % of the total time, the gain will be marginal.

 

There is even the inverse function.... 🙂

 

I'll test it out...

Message 11 of 19

Hi GerdW,

 

I had never even noticed that primitive.  After 21 years of using LabVIEW there are new things to learn all the time - and goodness knows how long ago that appeared. Thanks for pointing it out.

 

So I have put the Ln(x+1) and Exp(x)-1 primitives in, and strangely they are slightly slower than using the separate increment/decrement and ln/exp functions.  I've tried benchmarking them on different sizes of data and they seem to be about 10% slower.  The test ran 100 loops, timing each method on the same 10M values, and I calculated the mean and standard deviation for each route from the 100 measurements.
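For anyone who wants to reproduce this kind of comparison outside LabVIEW, a minimal benchmarking sketch might look like the following (Python/NumPy, with log1p/expm1 standing in for the Ln(x+1)/Exp(x)-1 primitives; the loop count and array size mirror the test described above, but nothing about the LabVIEW timings carries over):

```python
import numpy as np
import time

N_VALUES = 10_000_000   # same 10M values as in the test above
N_LOOPS = 100           # 100 timed repetitions per method

x = np.random.rand(N_VALUES)

def bench(fn):
    """Return mean and standard deviation of the per-loop run time of fn."""
    times = []
    for _ in range(N_LOOPS):
        t0 = time.perf_counter()
        fn(x)
        times.append(time.perf_counter() - t0)
    return np.mean(times), np.std(times)

mean_a, std_a = bench(lambda v: np.log(v + 1.0))   # separate increment + ln route
mean_b, std_b = bench(np.log1p)                    # fused Ln(x+1)-style route

print(f"log(x+1): {mean_a*1e3:.1f} ms +/- {std_a*1e3:.1f} ms")
print(f"log1p(x): {mean_b*1e3:.1f} ms +/- {std_b*1e3:.1f} ms")
```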

 

Thanks for the new knowledge though,

 

David

 

 

Message 12 of 19

Hi FireFist-Redhawk,

 

Here's a LV 17 version with the improvements suggested by Altenbach incorporated.

 

Thanks for looking at this too,

 

David

Message 13 of 19

Um, um... it appears to still be in 2019 😟

Redhawk
Test Engineer at Moog Inc.

Saying "Thanks that fixed it" or "Thanks that answers my question" and not giving a Kudo or Marked Solution, is like telling your waiter they did a great job and not leaving a tip. Please, tip your waiters.

Message 14 of 19

Apologies - I used the save for previous version option.  Here is another attempt.

Message 15 of 19

@dpak wrote:

So, I have put the Ln(x+1) and Exp(x)-1 in and strangely they are slightly slower than using the separate increment/decrement and ln/exp functions.  I've tried benchmarking them on different sizes of data and they seem to be about 10% slower.


I noticed the same slowdown. It is possible that these functions are relatively old, while e.g. the +1 and -1 primitives can take advantage of SSE instructions and thus operate on multiple array elements at once. Just guessing.

Message 16 of 19

For comparison, I still think that you do too much data shuffling (it's the overall time that counts!).

 

  • There is no need to allocate the sum array; we might as well calculate it in one place.
  • You can use any nonnegative value for width. Not sure why you restrict it to integer powers of two. (e.g. a width=1 would do a rectangular kernel with 3 elements)
  • You can eliminate one of the +1
  • We don't need to do the NaN check until the inner loop output
  • etc.

Here's a cleaned-up version, now wrapped into a subVI. Much less scattered code, and no big penalty (a rough text sketch of the same idea follows the screenshot below).

(I am sure improvements are still possible; see the #add bookmarks):

 

 

[Screenshot: altenbach_0-1598552444020.png]
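For readers working outside LabVIEW, the approach in the bullets above can be sketched in text form roughly as follows (Python/NumPy used purely as an illustration; the function name smooth_rows, the edge handling, and anything else not stated above are assumptions, not a transcription of the diagram):

```python
import numpy as np

def smooth_rows(data, half_width):
    """Boxcar-smooth each row of a 2D array with a (2*half_width + 1) window.

    Sketch only: the window is simply clipped at the row edges, which may
    differ from how the subVI above handles its borders.
    """
    if half_width < 0:
        raise ValueError("half_width must be nonnegative")
    n_rows, n_cols = data.shape
    out = np.empty_like(data, dtype=float)
    for r in range(n_rows):
        row = data[r]
        for c in range(n_cols):
            lo = max(0, c - half_width)
            hi = min(n_cols, c + half_width + 1)
            window_mean = row[lo:hi].sum() / (hi - lo)   # sum taken in place, no separate sum array
            out[r, c] = 0.0 if np.isnan(window_mean) else window_mean  # NaN check only at the output
    return out
```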

 

A view of the caller:

 

[Screenshot: altenbach_1-1598552830184.png]

 

 

Message 17 of 19

Hi Altenbach,

 

It's running really nicely now.  Thank you for your help!

 

Regarding some of your questions:

  • You can use any nonnegative value for width. Not sure why you restrict it to integer powers of two. (e.g. a width=1 would do a rectangular kernel with 3 elements)
    • Yes, you are right, and I actually do.  The original algorithm starts off with a coarse blur (depending on the size of the array, up to a 64-point half width) and then the half width for the smoothing halves every iteration (a rough sketch of this schedule appears below).  This is to get rid of spiky distortions early on - not all aspects of the optimization are used in each iteration; some are not used until some of the distortions are removed.  I imagine that powers of 2 were chosen because they are easy to implement for this task (and because it worked 🙂 ).  I am trying others though.
  • We don't need to do the NaN check until the inner loop output
    • I think, for this application, I do need to do it up front - at least for now.  There are other parts to the optimization, and some of it happens in a transformed domain; at some point all dimensions are processed in the log too.  Some of these transformations can occasionally generate NaN - but in all such cases the value should actually be 0.  If I don't remove these before undertaking the smooth, a single NaN in the input array will make a whole window width return NaN (and then 0) in the smoothed result, which is not the required output.
    • Of course, as I work through the algorithm, improving it, I am hunting down the causes of these NaNs.  None shall escape!  To begin with, I was just pleased to get a version working.  Now that I've done that, I'll be able to eliminate the sources of the NaNs and, with them, the need for up-front NaN checking in this subVI.
    • But even at this stage, the LabVIEW version is returning run times that are beginning to look like they might get competitive with the (I believe) C original.

The key improvement you have suggested is the new (to me) way of calculating the moving point average.  I had tried other methods, but not this one, and the point-by-point NI-supplied VI was always the fastest.  Now I know a much better way - thank you.  I think there is a case for having a dedicated moving point average tool in the toolbox, given the significant speed differential between the different methods.
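To make the coarse-to-fine schedule described above concrete, here is a minimal sketch (Python, reusing the hypothetical smooth_rows from the earlier sketch; the 64-point starting half width and the halving per iteration follow the description above, and everything else is an assumption):

```python
import numpy as np

def iterative_smooth(data, start_half_width=64):
    """Coarse-to-fine smoothing: the half width halves on every iteration.

    Assumptions: smooth_rows is the boxcar smoother sketched earlier, and
    NaNs are replaced with 0 up front because a single NaN would otherwise
    poison an entire window of the smoothed result.
    """
    work = np.where(np.isnan(data), 0.0, data)   # up-front NaN removal
    half_width = start_half_width
    while half_width >= 1:
        work = smooth_rows(work, half_width)
        half_width //= 2                         # 64, 32, 16, ..., 1
    return work
```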

 

Kind regards,

 

David

Message 18 of 19

@dpak wrote:

The key improvement you have suggested is the new (to me) way of calculating the moving point average.


I was actually surprised that it was so fast. I tried other well-known methods (e.g. convolution-based) and they were a few times slower. Once the width gets much wider, certain adjustments could be made, because many additions are carried out multiple times in the current code when taking the sum. One could keep the running sum in a scalar shift register, then add the newest and subtract the oldest element with each iteration, requiring only two operations instead of [2w+1].
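The running-sum idea might look roughly like this (Python sketch with hypothetical names; edge handling is reduced to a valid-region output just to keep the example short):

```python
import numpy as np

def moving_average_running_sum(row, half_width):
    """Sliding-window mean via a running sum: constant work per output sample.

    Each step adds the newest element entering the window and subtracts the
    oldest one leaving it - two operations instead of 2*half_width + 1
    additions per window position.
    """
    w = 2 * half_width + 1
    out = np.empty(len(row) - w + 1)
    running = row[:w].sum()            # sum of the first full window
    out[0] = running / w
    for i in range(1, len(out)):
        running += row[i + w - 1]      # newest sample entering the window
        running -= row[i - 1]          # oldest sample leaving the window
        out[i] = running / w
    return out
```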

 


@dpak wrote:
  • We don't need to do the NaN check until the inner loop output

Yes, it is probably a good idea to eliminate NaNs early on. In the past, we had scenarios where certain operations were dramatically slowed when NaNs were present in a 32-bit application (64-bit applications did not have that problem!). I am not sure if this issue still exists in modern versions, but have a look here. (MATLAB had the same problem.)
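If anyone wants to check whether NaNs still cause a slowdown on a particular machine and bitness, a quick test along these lines would do (Python sketch only; the issue discussed above concerned 32-bit vs 64-bit LabVIEW builds, so this just illustrates the testing idea, not the LabVIEW behaviour):

```python
import numpy as np
import time

def time_sum(arr, reps=50):
    """Average time to sum the array over a number of repetitions."""
    t0 = time.perf_counter()
    for _ in range(reps):
        arr.sum()
    return (time.perf_counter() - t0) / reps

clean = np.random.rand(10_000_000)
dirty = clean.copy()
dirty[::1000] = np.nan                 # sprinkle NaNs through the data

print(f"without NaN: {time_sum(clean) * 1e3:.2f} ms")
print(f"with NaN:    {time_sum(dirty) * 1e3:.2f} ms")
```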

 

Message 19 of 19