LabVIEW


Can I increase the execution speed of this VI?

Solved!

This is a simple question - is there any way to increase the execution speed of the VI below?  I am parallelizing the outer loop, but given the sequential nature of the inner loop, I am not sure there is anything else I can do.  The problem is that I am attempting to crunch a lot of data - I am acquiring about 1.25 MS/s and accumulating it over 10 s (Data In) to perform the calculations below over several integration periods (iTime; currently 0.001, 0.01, 0.1, 1, and 3 s) - and it is causing an overrun in some other loops.  Any help is appreciated.

 

Allan Variance.png
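For readers who can't see the image: the computation described (consecutive chunk means at each integration time, then the Allan variance of successive means) can be sketched in text form. This is a Python/NumPy analogy using the sample rate and iTime values from the post; the exact wiring of the VI is an assumption, and the standard two-sample Allan variance definition is used.

```python
import numpy as np

def allan_variance(data, sample_rate, itimes):
    """Allan variance from consecutive chunk means, one result per
    integration time (a rough sketch of what the nested loops compute)."""
    results = {}
    for tau in itimes:                        # outer loop: one pass per iTime
        chunk = int(round(tau * sample_rate))
        n = len(data) // chunk                # whole averaging windows only
        means = data[:n * chunk].reshape(n, chunk).mean(axis=1)
        diffs = np.diff(means)                # successive-mean differences
        results[tau] = 0.5 * np.mean(diffs ** 2)
    return results

rate = 1.25e6                                 # 1.25 MS/s, as in the post
itimes = [0.001, 0.01, 0.1, 1.0, 3.0]         # iTime values from the post
data = np.random.default_rng(0).random(int(rate * 10))  # 10 s of samples
av = allan_variance(data, rate, itimes)
```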

Message 1 of 16

Two places I would start:

 

Use array subset instead of Delete from Array.

Write your own Mean VI that uses SGL precision.
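In array terms, the difference between the two approaches looks roughly like this (a Python/NumPy analogy of the two diagrams, not LabVIEW code; `np.delete` stands in for Delete From Array because, like the LabVIEW node, it allocates a new array every call):

```python
import numpy as np

def means_by_deleting(data, chunk):
    """Shift-register style: Delete From Array splits off the front and
    returns a copied remainder every iteration -- O(n^2) work in total."""
    out = []
    rest = data
    while len(rest) >= chunk:
        out.append(rest[:chunk].mean())
        rest = np.delete(rest, np.s_[:chunk])   # new, smaller array each time
    return out

def means_by_subset(data, chunk):
    """Array Subset style: index into the original array.  The extra
    i * chunk multiply is trivial next to the copies it avoids."""
    n = len(data) // chunk
    return [data[i * chunk:(i + 1) * chunk].mean() for i in range(n)]
```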

Message 2 of 16

Hmmm....doesn't using Array Subset incur a calculation that Delete from Array doesn't?  I will have at least a multiplication node (i * chunk size) to add to the inner loop that was not there before to maintain the proper place.  On the other hand, that gets rid of the shift register.  Is there a significant performance difference between Array Subset and Delete From Array?

 

It looks like you are correct about the single-precision math - it can be considerably faster.  I will look into this.

 

Thanks, Darin.

 

Matt

Message 3 of 16
Solution
Accepted by topic author cirrusio

@mtat76 wrote:

Hmmm....doesn't using Array Subset incur a calculation that Delete from Array doesn't?  [...]  Is there a significant difference in terms of performance between Array Subset and Delete...?


You should test in LV12; in previous versions, Delete From Array is a dog, even when deleting from the end of the array.  I ran a simple test: create an array of 50,000 random numbers and take the average of consecutive chunks of 400 elements using both methods.  I even added a Reverse Array to the Subset test to match the behavior of Delete From Array.

 

Results:

 

Delete:  2.7 msec

Subset: 200 usec

 

Your mileage may vary.
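The same experiment can be re-created outside LabVIEW. This is a NumPy analogy of the test described above (50,000 randoms, chunks of 400, both methods); the absolute times will differ from the LV12 figures, but the delete-based version does strictly more copying:

```python
import time
import numpy as np

data = np.random.default_rng(0).random(50_000)   # 50k randoms, as in the test
CHUNK = 400

t0 = time.perf_counter()
rest, out_delete = data, []
while len(rest) >= CHUNK:                        # Delete From Array analogue
    out_delete.append(rest[:CHUNK].mean())
    rest = np.delete(rest, np.s_[:CHUNK])        # copies the remainder
t_delete = time.perf_counter() - t0

t0 = time.perf_counter()
n = len(data) // CHUNK                           # Array Subset analogue
out_subset = [data[i * CHUNK:(i + 1) * CHUNK].mean() for i in range(n)]
t_subset = time.perf_counter() - t0

print(f"delete: {t_delete * 1e3:.2f} ms, subset: {t_subset * 1e3:.2f} ms")
```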

Message 4 of 16

@mtat76 wrote:

[...]  Is there a significant difference in terms of performance between Array Subset and Delete...?


Yes, there is. Using Array Subset, LabVIEW can create a "sub-array" that's just a pointer to the start location within the original array, along with the length of the subset. The data in the array itself doesn't need to be moved or copied, and you'll be able to run the entire loop without ever making a copy of Data In. Right now, with the shift register and Delete from Array, every iteration of the outer loop needs to make a new copy of Data In to reinitialize the inner shift register.

As an alternative to converting to single-precision math, you might also get a speed increase from converting Data In to double-precision once, before it enters the outer loop.
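NumPy slices behave the same way as the sub-arrays described here, which makes the point easy to demonstrate: a slice is a view onto the original buffer, not a copy, while a dtype conversion done once up front replaces many per-iteration conversions. (A small illustrative sketch; `data_in` is a hypothetical stand-in for the Data In array.)

```python
import numpy as np

data_in = np.arange(12, dtype=np.float32)  # stand-in for Data In (SGL)

sub = data_in[4:8]             # "sub-array": start pointer + length, no copy
assert sub.base is data_in     # it shares the original buffer

data_dbl = data_in.astype(np.float64)  # convert once, before the outer loop,
                                       # instead of converting inside it
```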

Message 5 of 16

Wow!  Thanks, Darin.  I don't really understand why, but the single-precision mean runs about 70x faster than the double-precision Mean VI provided by LabVIEW.  That is remarkable!  Here is the VI that I used to test:

 

Test mean.png

 

Fairly straightforward.  I am going to reexamine some of my other routines - we were getting tight on computational power anyway, and this is a big hit if you are doing it over and over again but don't need the precision.

 

Cheers, Matt

Message 6 of 16

You may want to redo your test, making sure that the two calculations don't happen in parallel. As you have it now, the speed comparison is not necessarily very accurate.

Message 7 of 16

Here's a much better version of your speed test. On my machine, the DBL calculation takes almost exactly twice as long as the SGL one. Interestingly, if you remove the conversion to SGL, the native LabVIEW version (array sum divided by array size) takes about the same amount of time as the Mean VI (which calls a DLL).

benchmark mean.png

(EDIT: sorry, in my initial post I wrote "fast" where I should have written "long" in the second sentence. I've now corrected that.)

Message 8 of 16

You had me worried there for a second, Nathan.  I am still getting 60-70x on an RT PXI chassis (quad-core 8110).

 

Test mean.png

 

Here is the output of the Ratio graph in the code above.  The DBL calculation runs at about 600 ms while the SGL calculation runs at about 8 ms.  What are you getting?

 

Ratio Time.png

Message 9 of 16

Please try the code I posted (it's a snippet: you can simply drag it to your desktop and from there onto a block diagram; no need to rewrite or change anything).

I get about 12 ms for DBL and 6 ms for SGL.

 

There are several very important things about the way the benchmark is arranged in my VI (thanks, Altenbach, for the tips on this board over the years!). Nothing happens in parallel with the sequence structure, so while it is executing, it is the ONLY thing executing. That includes updating front panel controls, which can be very slow. All the inputs enter the sequence structure in the first frame, and all outputs exit from the last frame.

I'd need to find the reference, which I don't have time to do right now, but I believe a frame in a sequence structure can execute as soon as it has all its data available (and the preceding frame has executed), even if following frames cannot yet execute; and items outside the sequence structure can execute as soon as data is available from a frame, even if the rest of the sequence hasn't yet executed (in your VI, the time calculations below the sequence structure).
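The same benchmarking discipline applies in any language: prepare the inputs before the timed region, put only the work under test inside it, and report afterwards. A minimal Python sketch of that pattern (taking the best of a few repeats is my own addition, a common practice, not something from the posted VI):

```python
import time
import numpy as np

def bench(fn, *args, repeats=5):
    """Time fn alone: inputs are ready before the timed region starts,
    nothing else runs inside it, and reporting happens afterwards --
    the textual equivalent of the flat-sequence arrangement above."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)                     # ONLY the work under test
        best = min(best, time.perf_counter() - t0)
    return best

data64 = np.random.default_rng(1).random(1_000_000)  # prepare inputs first
data32 = data64.astype(np.float32)

t_dbl = bench(np.mean, data64)
t_sgl = bench(np.mean, data32)
print(f"DBL: {t_dbl * 1e3:.2f} ms  SGL: {t_sgl * 1e3:.2f} ms")  # report last
```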

Message 10 of 16