03-30-2007 10:16 PM
03-31-2007 12:54 PM - edited 03-31-2007 12:54 PM
Belzar,
When the cluster is used to control the case structure, the numeric IN THE CLUSTER is used to select the case. Include "Scalars" in the buffer allocation options to see that there is no buffer created for the numeric in case 1.
Now add an increment inside any of the cases and you will see that LV copies the value out of the cluster so it can be incremented. Since the cluster operation is now clearly a read on a wire branch, and it can be scheduled to happen before the case, the cluster's SR storage can subsequently be re-used.
Now for more mind twisting...
If the SR is initialized from a control on the icon connector and returned via an indicator, the caller's buffer can be re-used.
Ben
Message Edited by Ben on 03-31-2007 12:54 PM
Message Edited by Ben on 03-31-2007 12:56 PM
03-31-2007 04:49 PM
04-01-2007 12:04 PM - edited 04-01-2007 12:04 PM
Daniel wrote:
"So sometimes it might be better to force a copy of a scalar in order to prevent copying a whole cluster..."
Exactly!
When to force the copy?
Under normal circumstances, only when you have to.
When do you have to?
When resources are limited, either CPU or memory. Both of these situations often arise in RT applications.
The buffer allocations and other tools are very handy for "seeing" through the glass darkly.
As I wrote in this thread.
http://forums.ni.com/ni/board/message?board.id=170&message.id=231476#M231476
Learning to master LV is like trying to learn to surf. Subtle gestures can make a big difference in performance. Branched wires sometimes take extra effort to help LV understand what you want.
The buffer allocation tool lets LV show you how it is interpreting your gestures.
If you continue to play with variations of code driven from the Unbundle By Name, you will see that any operation telling LV that you do not want to work in the SR will force the scalar to be copied out before the case structure, allowing the case structure's optimization code to use the SR to do its work.
The optimization of various diagram constructs varies, but they generally attempt to work in place. Since a read of the scalar from the cluster in the SR can be performed without the need to copy it, the Unbundle By Name can claim that it wants to use the contents of the SR. In parallel (LV apparently does not recognize that the scalar is no longer required when the case runs), the case structure would like to work in the SR, but the SR is already claimed by the Unbundle.
When we force LV to work outside the SR by performing an operation that modifies the input, the SR is no longer used by the Unbundle. Since the SR is unclaimed, the case structure can do all of its work in the SR. In fact, in my example 1A all of the work is done in the control's buffer.
Now one could wonder:
WHY does the Unbundle By Name take precedence over the case structure for use of the SR?
That answer is beyond me.
and
Couldn't LV look through the unbundle and see that the SR contents are no longer required when the Case executes?
To this I would venture a guess that it has to do with the technicality that all of the inputs to a structure must be satisfied before that structure executes. With the unbundle construct in example 1, the case structure required two inputs: both the scalar (inside the SR) AND the cluster input.
Disclaimer:
Most of the above is just my opinion. I have no documentation that backs up any of the above. If anyone wants to twist anything I have said to account for more situations, be my guest!
Ben
Message Edited by Ben on 04-01-2007 12:05 PM
04-02-2007 02:03 AM
04-02-2007 08:10 AM
04-02-2007 12:14 PM
Thanks for the feedback everyone. I like your idea of an array indexed by an enum.
In my debugging I've made several other discoveries about efficient RT coding with respect to allocation:
1) Avoid "typedef" conversions. These have caused mysterious 5 usec delays to appear in the code as it goes off for memory. The real mystery is that "Show Buffer Allocations" does not show these allocations. These are simple typedefs, too, such as an int to an enum. In my quest to make the conversions obvious/intentional in the code I inadvertently added execution time. Coercion dots don't have the same overhead. This doesn't make sense and reeks of a bug. If I turn on "Show Wait for Mem Spans" in the Execution Trace Tool I see these waits, but there is no hint of them in Show Buffer Allocations.
2) There are some buffer allocations that you cannot convince LV not to make. I have an array that is modified by Replace Array Subset and then passed to an RT FIFO Write call (inside a loop using a cluster SR). No matter how I code it, I get a buffer allocation either at the Replace Array Subset or at the RT FIFO Write call. Frustrating, to say the least, because it is unnecessary.
3) Beware of passing anything into a case structure. Especially avoid passing in something like an array pulled out of a cluster. Pass the cluster in instead and pull the array out inside the case. This avoids many buffer allocations and copies.
4) Pray