02-20-2019 06:31 AM - edited 02-20-2019 06:34 AM
@crossrulz wrote:
wiebe@CARYA wrote: B being faster than C? Weird!
They are within the noise. I would expect them to actually compile to the same thing due to optimizations that the compiler does such as moving that first multiply outside of the loop.
Within the noise? I disagree.
I did all of those measurements repeatedly. They did not differ by more than 10% (more like 1%) between measurements.
And those measurements were already the result of 100 iterations.
Never mind, I thought you meant all measurements were within the noise. Still, those results were pretty reproducible, so I do think B and C differ.
I guess we could have a look at the DFIR or assembler code that it compiles to... But I also have to get some work done every now and then.
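For anyone who wants to try the same kind of comparison outside LabVIEW, here is a minimal Python sketch. It is not the VIs from this thread: the function names, constants, and run counts are all made up for illustration. It times a loop that recomputes an invariant multiply on every iteration against one where the multiply is hoisted out, repeats the measurement, and reports the relative spread so you can judge whether a difference is above the noise.

```python
import statistics
import time


def multiply_inside(values, a, b):
    # Hypothetical variant: a * b is recomputed on every iteration.
    out = []
    for v in values:
        out.append(a * b * v)
    return out


def multiply_hoisted(values, a, b):
    # Hypothetical variant: the loop-invariant product is computed once.
    scale = a * b
    out = []
    for v in values:
        out.append(scale * v)
    return out


def time_runs(func, args, runs=20):
    # Return one wall-clock timing (in seconds) per run.
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return timings


def relative_spread(timings):
    # (max - min) relative to the median: a crude measure of run-to-run noise.
    return (max(timings) - min(timings)) / statistics.median(timings)


if __name__ == "__main__":
    data = [float(i) for i in range(100_000)]
    for func in (multiply_inside, multiply_hoisted):
        t = time_runs(func, (data, 1.5, 2.5))
        print(f"{func.__name__}: median {statistics.median(t) * 1e3:.3f} ms, "
              f"spread {relative_spread(t):.1%}")
```

If the gap between the two medians is smaller than the spread of either one, the difference is probably within the noise; if it is consistently larger across repeated sessions, it is likely real.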
02-20-2019 07:49 AM
Also, did you set the VI to Subroutine priority and close diagrams before running? Getting the correct measurement can be trickier than you think at first. Also, there's a Mean.vi that calculates the mean/average. 😉
/Y
02-20-2019 08:35 AM - edited 02-20-2019 08:37 AM
@Yamaeda wrote:
Also, did you set the VI to Subroutine priority and close diagrams before running?
No. Lots of room for experimenting with options. But most of my code doesn't run as subroutine (and I find that usually it doesn't improve the overall speed at all), so for me a bench using it wouldn't be representative. As for the front panel, as long as nothing is updated in the loop, I doubt it makes much difference. My guess is it all stays relative.
I have found in the past that, even after making a change, saving, and closing the VI, benchmarks showed N to be twice as fast as M, only to find after a restart of LabVIEW that M was twice as fast as N...
@Yamaeda wrote:
Also, there's a Mean.vi that calculates the mean/average. 😉
I know, but this is more intuitive for me. I actually used it, before replacing it with the median. Then did it manually on that result. Go figure...
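On mean versus median for benchmark timings, a quick Python sketch shows why the median can be the more robust summary when an occasional run gets disturbed (by the OS scheduler, for instance). The numbers below are invented for illustration and are not from the test bench.

```python
import statistics

# Hypothetical timings in milliseconds; one run was hit by an outlier.
timings_ms = [10.1, 10.0, 10.2, 9.9, 10.0, 35.0, 10.1]

mean_ms = statistics.mean(timings_ms)      # pulled upward by the 35.0 ms run
median_ms = statistics.median(timings_ms)  # stays with the typical runs

print(f"mean   = {mean_ms:.2f} ms")
print(f"median = {median_ms:.2f} ms")
```

A single slow run shifts the mean by several milliseconds, while the median stays with the bulk of the measurements.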
02-20-2019 11:24 AM
@Yamaeda wrote:
That it's slower when you wire the scalars on top isn't that surprising. There are some threads on LV memory management, and if you wire an array on top it usually works in place, while a scalar on top forces it to create a new array.
I seriously doubt that wire order makes a difference. The compiler knows how to order things optimally.
(I think decades ago, it made a difference, but in my experience it does not matter any more. Your theory could easily be tested by looking at buffer allocation dots).
02-20-2019 12:46 PM
wiebe@CARYA wrote: Here's my test bench. And it's tricky. Sometimes any change makes the time double, and a random change makes it go back.
Seems most of your results are quantized to ~1x, ~2x, ~3x of your fastest, maybe hinting at SSE differences. Hard to test.
In any case, a really (really!) smart compiler could get at the gist of your algorithm and substitute something equivalent that is yet another 50x faster 😄 We can dream! 😄
02-21-2019 01:48 AM
@altenbach wrote:
wiebe@CARYA wrote: Here's my test bench. And it's tricky. Sometimes any change makes the time double, and a random change makes it go back.
Seems most of your results are quantized to ~1x, ~2x, ~3x of your fastest, maybe hinting at SSE differences. Hard to test.
In any case, a really (really!) smart compiler could get at the gist of your algorithm and substitute something equivalent that is yet another 50x faster 😄 We can dream! 😄
Yes, algorithmic optimizations traditionally yield gains that are orders of magnitude larger than simply optimizing instructions. Guess we have to wait for AI to take over...
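To make the point about algorithmic versus instruction-level optimization concrete, here is a small Python sketch. The algorithm from the test bench isn't reproduced here, so this uses a stand-in problem (summing the integers below n), and both function names are hypothetical. No amount of tuning the loop body changes the fact that the first version does n additions, while the second replaces the whole loop with one formula.

```python
def sum_loop(n):
    # O(n): adds every integer from 0 to n - 1 one at a time.
    total = 0
    for i in range(n):
        total += i
    return total


def sum_closed_form(n):
    # O(1): the same result from the closed-form expression n * (n - 1) / 2.
    return n * (n - 1) // 2


if __name__ == "__main__":
    n = 1_000_000
    # Both approaches agree; only the amount of work differs.
    print(sum_loop(n) == sum_closed_form(n))
```

This is the kind of substitution a "really smart" compiler would have to discover on its own, which is why such gains usually come from the programmer rather than the optimizer.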