08-23-2013 08:47 AM
@crossrulz wrote:
OK, one final benchmark in an attempt to put this slow-DLL-call business to rest. Here is how I figured NI is computing the average: Sum(X/size(X)).
Exactly. NI is doing the right (but more expensive!) thing to avoid overflow for large numbers. Doing N divisions instead of 1 simply costs more. 😉
If we compare apples with apples, the DLL call is significantly faster.
(Personally, I would turn the constants into controls, place the indicators after the sequence structure and disable debugging, but it does not really make a difference here.)
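To make the trade-off concrete, here is a minimal text-language sketch (Python, since a LabVIEW diagram cannot be shown inline) of the two orderings discussed above. The function names are made up for illustration, and this is not claimed to be how Mean.vi is implemented.

```python
import math
import sys

def mean_divide_once(xs):
    """Sum(X)/size(X): one division, but the running sum can overflow."""
    total = 0.0
    for x in xs:
        total += x
    return total / len(xs)

def mean_divide_each(xs):
    """Sum(X/size(X)): N divisions, but every term is scaled down first."""
    n = len(xs)
    total = 0.0
    for x in xs:
        total += x / n
    return total

# Four values, each half of the largest finite double.
big = [sys.float_info.max / 2] * 4

print(mean_divide_once(big))   # the intermediate sum exceeds the double range
print(mean_divide_each(big))   # stays finite, close to sys.float_info.max / 2
```

With these inputs the first version overflows to Inf while the second stays finite, which is the extra safety the N divisions pay for.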
08-23-2013 09:17 AM
Out of curiosity I wanted to implement an "almost" correct version in LV by utilising the size of the array and a scale by power of 2. I was still several times slower than the NI VI.
I then realised there was NO primitive to simply scale a floating-point number (FXP yes; SGL, DBL etc. no). "Scale By Power Of 2" does some extra work (again, probably due to the IEEE floating-point standard).
Does LV have no way of manipulating the mantissa directly in this way?
Shane.
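For comparison, text languages expose this kind of exponent manipulation through the C library's frexp/ldexp pair, which Python wraps as math.frexp and math.ldexp. The sketch below is only an illustration of the idea from the post above (dividing by the array size via a power-of-2 scale); the function name mean_pow2 is hypothetical, and it assumes the array length is an exact power of two.

```python
import math

def mean_pow2(xs):
    """Mean of an array whose length is a power of two.

    Each element is scaled by 2**-k with ldexp, which only adjusts the
    exponent (exact unless the result underflows), then summed.
    """
    n = len(xs)
    k = n.bit_length() - 1
    if n == 0 or (1 << k) != n:
        raise ValueError("length must be a power of two")
    total = 0.0
    for x in xs:
        total += math.ldexp(x, -k)   # x * 2**-k
    return total

# frexp splits a double into mantissa and exponent; ldexp puts it back.
mantissa, exponent = math.frexp(6.0)
print(mantissa, exponent)              # 0.75 3
print(math.ldexp(mantissa, exponent))  # 6.0
print(mean_pow2([1.0, 2.0, 3.0, 4.0])) # 2.5
```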
08-23-2013 09:29 AM
Also if you show more decimal digits, you'll see that the two answers are not identical.
Shane
08-23-2013 09:41 AM
altenbach wrote:
@crossrulz wrote:
OK, one final benchmark in an attempt to put this slow-DLL-call business to rest. Here is how I figured NI is computing the average: Sum(X/size(X)).
Exactly. NI is doing the right (but more expensive!) thing to avoid overflow for large numbers. Doing N divisions instead of 1 simply costs more. 😉
If we compare apples with apples, the DLL call is significantly faster.
(Personally, I would turn the constants into controls, place the indicators after the sequence structure and disable debugging, but it does not really make a difference here.)
Actually, things seem to be even more complicated. Sum(X/size(X)) would fail when averaging values very close to the lower limit of the double-precision range, since each quotient underflows to 0.
But although the simple Sum(X/size(X)) returns 0, Mean.vi still returns a nonzero value.
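The underflow case can be reproduced in a few lines of Python, used here as a stand-in for the LabVIEW diagram. The input values are chosen for illustration: 5e-324 is the smallest positive (subnormal) double.

```python
# Four copies of the smallest positive double.
tiny = [5e-324] * 4
n = len(tiny)

# Dividing first: each quotient is a quarter of the smallest double
# and rounds to zero, so the whole sum is zero.
sum_of_quotients = sum(x / n for x in tiny)

# Summing first: the sum is still representable, and so is the result.
quotient_of_sum = sum(tiny) / n

print(sum_of_quotients)   # 0.0
print(quotient_of_sum)    # 5e-324
```

So the ordering that protects against overflow is exactly the one that loses the result here.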
08-23-2013 11:12 AM
@pincpanter wrote:
Actually, things seem to be even more complicated. Sum(X/size(X)) would fail when averaging values very close to the lower limit of the double-precision range, since each quotient underflows to 0.
But although the simple Sum(X/size(X)) returns 0, Mean.vi still returns a nonzero value.
Nice catch. My benchmarking confirmed this. So NI is somehow taking care of all the weird situations for us. Does anybody have any ideas for an algorithm that can take care of both the small numbers and the really high numbers without over/underflowing to INF/0? Not like it really matters. I'm sure that code has been tweaked and optimized as much as can be.
So the moral of the story is to just use the built-in Mean.vi unless you are certain you will not be dealing with very large (or very small) numbers AND you need the mean to be computed very fast for whatever reason.
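One well-known candidate for the question above is the incremental (running) mean, which never forms the full sum and never divides a raw element by N. The Python sketch below is only an illustration under that assumption; nothing in this thread indicates that Mean.vi works this way, and the name running_mean is made up.

```python
import sys

def running_mean(xs):
    """Incremental mean: mean_k = mean_(k-1) + (x_k - mean_(k-1)) / k.

    The accumulator always stays within the range of the data, so it
    cannot overflow the way a plain sum does, and tiny values are not
    divided by N before being accumulated.
    Caveat: x - mean can still overflow if the data mixes huge values
    of opposite sign.
    """
    mean = 0.0
    for k, x in enumerate(xs, start=1):
        mean += (x - mean) / k
    return mean

print(running_mean([sys.float_info.max] * 3))  # stays finite
print(running_mean([5e-324] * 4))              # stays nonzero
print(running_mean([1.0, 2.0, 3.0, 4.0]))      # 2.5
```

It still costs one division per element, so it would not be expected to beat the single-division version on speed.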
08-24-2013 06:10 AM
Here is the actual time needed to call a DLL:
approximately 43 µs (Q6600, XP SP3, 2.6 GHz, 4 GB RAM)
08-24-2013 10:36 AM
@ouadji wrote:
Here is the actual time needed to call a DLL:
approximately 43 µs (Q6600, XP SP3, 2.6 GHz, 4 GB RAM)
As always, please attach your entire benchmarking code, not just the dll. We cannot see how the CLFN is configured.
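A per-call figure like this depends heavily on how it is measured, which is why the full benchmark matters. As a rough text-language analogue (Python, not LabVIEW), the sketch below times many calls and keeps the best run; mean_once is a hypothetical stand-in for the function under test, not the DLL from this thread.

```python
import timeit

def time_per_call(func, number=10000, repeat=5):
    """Return the best observed time per call, in seconds.

    Each run times `number` back-to-back calls; the minimum over
    `repeat` runs filters out scheduler and cache noise.
    """
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number

def mean_once(xs=(1.0, 2.0, 3.0, 4.0)):
    # Stand-in workload for the call being benchmarked.
    return sum(xs) / len(xs)

seconds = time_per_call(mean_once)
print(f"{seconds * 1e6:.3f} us per call")
```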
08-24-2013 11:23 AM
I don't understand why you cannot see how the CLFN is configured (?)
It's a snippet ... but I'll do it now, no problem.
The file is attached.
08-24-2013 11:40 AM
@altenbach wrote:
Exactly. NI is doing the right (but more expensive!) thing to avoid overflow for large numbers. Doing N divisions instead of 1 simply costs more. 😉
If we compare apples with apples, the DLL call is significantly faster.
This leads to the question of how parallelized the DLL is.
/Y