

Underperforming LabVIEW compiler when it comes to buffer allocations

Have you ever noticed that even though the LabVIEW dataflow paradigm enforces code to be executed in a specific order, you can get a huge improvement in processing speed (depending on your cluster or array sizes) if you enforce order with a sequence structure?  In many cases you can eliminate the buffer allocations that are (unnecessarily) made by the compiler if you don't rely on dataflow and instead use sequence structures.
 
See the attached images.  'Without.png' shows the code as one would normally wire it; notice the buffer allocation indicated by the red circle.  This cluster of mine is very big, with large arrays in it, so this allocation is very costly.  'With.png' is exactly the same code, just with order enforced using a sequence.  Notice that the compiler makes no buffer allocation here.  My cycle time on this diagram dropped from 53 ms to 5 ms (!), just by adding the sequence.
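LabVIEW code is graphical, so as a rough text analogy only, here is what the two diagrams amount to in C (the names and types are hypothetical; this just illustrates the copy the compiler inserts, not what it literally generates):

    #include <stdlib.h>
    #include <string.h>

    #define N 1000000
    typedef struct { double data[N]; } BigCluster;  /* stand-in for my big cluster */

    /* 'Without.png': the compiler cannot prove the old value is dead,
     * so it allocates a second buffer and copies the whole cluster
     * (the dot in the red circle). */
    BigCluster *update_with_copy(const BigCluster *in) {
        BigCluster *out = malloc(sizeof *out);
        if (!out) return NULL;
        memcpy(out, in, sizeof *out);   /* the costly allocation + copy */
        out->data[0] = 1.0;
        return out;
    }

    /* 'With.png': enforced order proves the input is the only live
     * reference, so the update happens in place -- no allocation. */
    void update_in_place(BigCluster *io) {
        io->data[0] = 1.0;
    }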
 
I was hoping that 8.2.1 would be better at this than 7.1, but it doesn't seem to be.
 
Any ideas or am I missing something?
 
Message 1 of 15

Why don't you attach your VI and let us play with it? 🙂

Message 2 of 15
btw: what's happening in the other case of the case structure?
 
LabVIEW buffer allocations usually tend to err on the safe side.
 
Still, I think you have way too much code. If I understand you right, all you really need is to loop over the "controls" array with a small loop. (You might even autoindex and blindly set all to FALSE.) Everything else belongs outside the loop, with a single "bundle by name" at the end to feed the controls array back into the cluster.
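In rough C terms (hypothetical names, since the real code is G), the shape I mean is:

    #include <string.h>

    /* Hypothetical stand-ins for the cluster and its "controls" array. */
    typedef struct { int changed; int state; } Control;
    typedef struct { Control controls[64]; /* ...other fields... */ } Panel;

    void reset_all(Panel *p) {
        Control ctl[64];
        memcpy(ctl, p->controls, sizeof ctl);   /* unbundle once            */
        for (int i = 0; i < 64; i++)
            ctl[i].changed = 0;                 /* blindly set all to FALSE */
        memcpy(p->controls, ctl, sizeof ctl);   /* one "bundle by name"     */
    }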
Message 3 of 15
There is obviously some other code, not shown here, that executes when the flag is set, so what you suggest unfortunately won't work for me.  I select specific elements of the array and check the flag; if it is set, I reset it, perform some operation on the rest of the element, place it back into the array, and then bundle it back.
 
In any case, I've attached a simplified VI that shows the effect on performance.  Select the method with the 'method' button.  The only difference between the two methods is the use of dataflow (incorrectly resulting in a buffer allocation) or an explicit sequence.  On my PC (a 3.8 GHz Pentium running XP with 4 GB of RAM), the execution time is 1540 ms for the dataflow method and 1 ms for the sequence!  Maybe there is something to be said for the compiler not always 'erring on the safe side'.
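For anyone reading along without LabVIEW at hand, a hypothetical C harness in the same spirit as the attached VI (not the VI itself, and not what the compiler emits) would compare the two methods like this:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    #define N (1 << 20)   /* ~1M doubles, standing in for the big cluster */
    #define ITERS 100

    int main(void) {
        double *buf = calloc(N, sizeof *buf);
        double *tmp = calloc(N, sizeof *tmp);
        if (!buf || !tmp) return 1;

        clock_t t0 = clock();
        for (int i = 0; i < ITERS; i++) {        /* "dataflow" method:   */
            memcpy(tmp, buf, N * sizeof *buf);   /* every update goes    */
            tmp[i % N] = 1.0;                    /* through a full copy  */
            memcpy(buf, tmp, N * sizeof *buf);
        }
        clock_t t1 = clock();
        for (int i = 0; i < ITERS; i++)          /* "sequence" method:   */
            buf[i % N] = 1.0;                    /* same update in place */
        clock_t t2 = clock();

        printf("copying: %.0f ms, in place: %.0f ms\n",
               (t1 - t0) * 1000.0 / CLOCKS_PER_SEC,
               (t2 - t1) * 1000.0 / CLOCKS_PER_SEC);
        free(buf);
        free(tmp);
        return 0;
    }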
 
Let me know what you think; maybe I'm missing something.  I'm going home now, so I'll pick up your comments tomorrow morning.
Message 4 of 15


@AnthonV wrote:

Maybe there is something to be said for the compiler not always 'erring on the safe side'.

Not really. It's much better for such an error to slow down your application than for it to produce wrong results, and I can find all kinds of examples. Inplaceness is complicated, and it's possible that what you're seeing here is a missed corner in the algorithm.
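As a plain C illustration (hypothetical, not LabVIEW) of why the safe side matters: eliding the copy below is only legal if the two buffers never alias, and getting that proof wrong corrupts data instead of merely wasting time.

    #include <stddef.h>

    /* out[i] = in[i-1]; correct only when in and out are different buffers. */
    void shift_right(const double *in, double *out, size_t n) {
        out[0] = 0.0;
        for (size_t i = 1; i < n; i++)
            out[i] = in[i - 1];   /* if out aliases in, in[i-1] was already
                                     overwritten: every element ends up 0.0 */
    }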

For a couple of threads demonstrating the complexity (and explaining some more) you can have a look here and here.


___________________
Try to take over the world!
Message 5 of 15
I would think this is a real bug.
 
I simplified the code a bit. Using the code in the image (the other case is wired straight through), the absence of the one-frame sequence structure causes a 1000+ fold slowdown (no other change in the code!).
 
(I also tested with all TRUE for "Changed" in the generated data.)
 
 
Can you be a bit more specific about what else you need to change in the cluster?
 
 

Message Edited by altenbach on 06-21-2007 01:18 PM

Message 6 of 15
Ditto on the bug designation. LV has a long history of inplaceness working funny inside loops, and this looks like another instance...

Mike...

Certified Professional Instructor
Certified LabVIEW Architect
LabVIEW Champion

"... after all, He's not a tame lion..."

For help with grief and grieving.
Message 7 of 15


@AnthonV wrote:
There is obviously some other code, not shown here, that executes when the flag is set, so what you suggest unfortunately won't work for me.  I select specific elements of the array and check the flag; if it is set, I reset it, perform some operation on the rest of the element, place it back into the array, and then bundle it back.

See, you do all operations within a single array element! There is absolutely no need to constantly unbundle and rebundle the array from the parent cluster. Just unbundle the "controls" array, do whatever you need to do inside the loop, and at the very end bundle it back to the parent cluster.

Here's a quick, very simple example of what I had in mind. It checks whether "changed" is TRUE and, if so, resets "changed" to FALSE and also sets state = 3. The possibilities are unlimited; do whatever you need to do to each array element. Once you have edited the entire array and the FOR loop has finished, bundle it back into the parent. No outer loop! No shift registers!
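In hypothetical C terms (the real example is a G diagram), it boils down to:

    /* Hypothetical struct mirroring the cluster elements in the example. */
    typedef struct { int changed; int state; } Control;

    void process_all(Control *ctl, int n) {
        for (int i = 0; i < n; i++) {
            if (ctl[i].changed) {     /* check the flag...       */
                ctl[i].changed = 0;   /* ...reset it to FALSE... */
                ctl[i].state   = 3;   /* ...and set state = 3    */
            }
        }
        /* FOR loop finished: now bundle ctl back into the parent once. */
    }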

If you only have selected elements, autoindex on the list of indices and basically do the same with "index array" and "replace array element".


 

Message Edited by altenbach on 06-21-2007 06:28 PM

Message 8 of 15


@altenbach wrote:
If you only have selected elements, autoindex on the list of indices and basically do the same with "index array" and "replace array element".

Here's what I had in mind.
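Roughly this, in the same hypothetical C terms as before (the attachment shows the G version):

    typedef struct { int changed; int state; } Control;

    /* idx is the list of selected indices being autoindexed. */
    void process_selected(Control *ctl, const int *idx, int m) {
        for (int k = 0; k < m; k++) {
            Control c = ctl[idx[k]];       /* "index array"           */
            if (c.changed) { c.changed = 0; c.state = 3; }
            ctl[idx[k]] = c;               /* "replace array element" */
        }
    }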


 

Message Edited by altenbach on 06-21-2007 06:42 PM

Message 9 of 15

Thanks for the detailed feedback.  Unfortunately, it seems I will have to keep bundling inside the loop, since unbundling the array and autoindexing also creates copies at every iteration, with the following results: bundling inside the loop, 1 ms; bundling outside the loop, 1200 ms (see the attached picture).  I agree that what you suggest is the most elegant and aesthetically pleasing solution, but in an embedded application (or any other, for that matter) the performance cost is unacceptable.

Any programmer who needs time-critical code will have to run 'Show Buffer Allocations' regularly and elaborate their code with sequences and less-than-aesthetically-pleasing constructs to eliminate unneeded buffer allocations.

I have often had LabVIEW programmers tell me that LabVIEW should not be used for performance-critical tasks but rather for ease of use.  I disagree: with a little effort and some elaboration you can get incredible speed-ups in your code, as long as you eliminate non-obvious buffer allocations.  It would be wonderful if the compiler could do this more intelligently; in fact, I expect that from the compiler, and I hope it will do better in future versions if LabVIEW wants to compete with mainstream languages.

 

Message 10 of 15