Annoyingly inconsistent execution times for averaging operation

Andrey_Dmitriev · 05-28-2025

@altenbach wrote:

You can parallelize the heavy lifting and stuff the turkey later.

On my rig: Your code 133ms, Code below: 25ms (same result!)
...

On a side note, it might help to write a median that operates directly in U16. Very long ago, we had the median coding challenge (~LabVIEW 7.0) and "quickselect" is not that hard to implement. Maybe I still have some old code somewhere... 😄

Thank you so much, Christian — really simple! I tried to achieve this with a single loop, but splitting it into two loops is definitely an improvement. That solution hadn’t occurred to me. Sometimes we don’t see the forest for the trees.

I still remember that challenge (I implemented a histogram-based search for single-precision floats but lost the code; maybe you still have it). But in this case, the fastest way to get the median (I only need it for 3…8 elements) is with a sorting network. The good thing is that we don’t need to perform a full sort here. For example, fully sorting 4 elements takes 5 compare-and-swap operations, but for the median the last swap can be omitted. This also parallelizes well: the first two compare-and-swaps can run in parallel, and so can the following two.
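To make the 4-element case concrete, here is a rough Python sketch of the idea (not the LabVIEW code, which is not shown in this thread). The function name `median4` is made up for illustration, and it assumes the median of an even count is the mean of the two middle values. After the first four compare-and-swaps the minimum and maximum are already in place, so the two middle values are known without ordering them.

```python
def median4(a, b, c, d):
    """Median of four values using the first 4 comparators of a 5-comparator network.

    Assumption (not from the thread): the median of an even number of
    elements is taken as the mean of the two middle values.
    """
    # Layer 1: these two compare-and-swaps touch disjoint pairs,
    # so they are independent of each other.
    if a > b:
        a, b = b, a
    if c > d:
        c, d = d, c
    # Layer 2: again two independent compare-and-swaps.
    if a > c:
        a, c = c, a  # a now holds the overall minimum
    if b > d:
        b, d = d, b  # d now holds the overall maximum
    # b and c are the two middle values in unknown order. A full sort would
    # need one more comparator on (b, c); their mean does not depend on it.
    return (b + c) / 2
```

If a single middle element is wanted instead of the mean, the final step becomes one `min(b, c)` or `max(b, c)`, which is still a comparison but no swap.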

I have a long weekend ahead and will see how far I can go on my end; an additional boost by a factor of 3x–4x should be possible.

altenbach · 05-28-2025

I still found a gigantic folder from the median challenge, but have not found the final code. Mostly useless debugging code with tons of extra indicators and probably tons of bugs. (Unfortunately, all original file dates are lost and I cannot tell what's newest...) 😮

I was not sure about the size of the arrays you need the median of, and assumed large arrays. Yes, for small arrays things get faster, no matter what. Median (quickselect) is always faster than sorting (quicksort) because only one side of the pivot needs further processing. In many ways this is similar to your sorting network approach, but I have not studied it in detail. Still, you need to be careful doing parallel comparisons because the outer loop is already parallelized. Hopefully, the compiler knows what to do. 😄
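For readers who have not seen quickselect, here is a minimal Python sketch of the "only one side of the pivot" point. It is not the challenge code (which has not turned up), and the names `quickselect` and `lower_median` are made up for illustration. It uses a random pivot and a three-way partition, and it returns the lower of the two middle elements for even lengths; both are assumptions of this sketch.

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (0-based) without fully sorting."""
    items = list(values)  # work on a copy so the caller's data is untouched
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    lo, hi = 0, len(items) - 1
    while True:
        pivot = items[random.randint(lo, hi)]
        # Three-way partition of items[lo..hi] around the pivot value.
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[i], items[gt] = items[gt], items[i]
                gt -= 1
            else:
                i += 1
        # Now items[lo:lt] < pivot, items[lt:gt + 1] == pivot,
        # and items[gt + 1:hi + 1] > pivot.
        if k < lt:
            hi = lt - 1  # continue in the left part only
        elif k > gt:
            lo = gt + 1  # continue in the right part only
        else:
            return pivot  # k falls inside the block equal to the pivot


def lower_median(values):
    """Lower median: the middle element, or the lower of the two middle ones."""
    return quickselect(values, (len(values) - 1) // 2)
```

Quicksort would recurse into both partitions after each split; here the loop narrows `lo..hi` to the single side that contains index `k`.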

Have a nice long weekend. We just had one here...

LabVIEW Champion.

altenbach · 05-28-2025

Just doing U16 sort&index is about 10x faster than the original.

