Dr. Damien's Development - LabVIEW 2009 Data Value Reference

 

I’m trying to optimize an application that uses several megabytes of data in arrays of clusters and strings, and I was hoping to use the Data Value Reference functions to speed things up. They have certainly helped where an array is passed unchanged through several subVIs, but I would really like to index a big array by reading elements directly from the original buffer belonging to a subVI. The problem with the Data Value Reference Read Element border node is that it makes a new buffer allocation, where I would really like some way to index an element from the original buffer.

[VI snippet image 1]

 

The code below runs about 40 times faster (LabVIEW 2009) than the example above, presumably because the buffer is not being allocated on every iteration.

[VI snippet image 2]
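For anyone who wants to feel the effect outside of LabVIEW, here is a rough Python sketch of the same comparison (purely an analogy of copy-per-iteration versus indexing in place; the array size and loop count are made up, and nothing here uses actual LabVIEW DVR semantics):

import time

data = list(range(1_000_000))   # stand-in for the large array held by the reference

# Pattern 1: take a fresh copy of the whole buffer on every iteration, then index it.
start = time.perf_counter()
total = 0
for i in range(1000):
    local = list(data)          # per-iteration buffer allocation and copy
    total += local[i]
print("copy per iteration:", time.perf_counter() - start, "s")

# Pattern 2: index the original buffer directly, with no per-iteration copy.
start = time.perf_counter()
total = 0
for i in range(1000):
    total += data[i]            # read straight out of the existing buffer
print("index in place:    ", time.perf_counter() - start, "s")

The second loop is dramatically faster for the same reason the second snippet above is: the big buffer is allocated once and only indexed afterwards.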

Is there a way to index an array directly from the data value reference?

Message 11 of 18

I ran the code through the Desktop Execution Trace Toolkit looking for memory operations, and it doesn't look like the DVR dereference is causing a memory allocation. If I set a breakpoint at the end of each loop, I see a 1 MB allocation at the beginning of the loop and 1 MB freed somewhere in each case. The action engine case has an additional alloc/free pair.

 

I understand that the buffer allocation dots aren't always definitive; sometimes they only indicate a possible allocation.

 

My LabVIEW 2009 SP1 times are approximately:

 

 

Reference = 164ms (baseline)
Action Engine = 394ms
Single-Element Queue = 742ms
Data Value Reference = 446ms
Data Value Reference outside loop = 199ms
Data Value Reference outside loop, normal index = 174ms
DVR SubVI = 629ms
Basic = 181ms

 

Times stay roughly the same for 1 kB and 100 MB.

Times are each about 50 ms lower in LabVIEW 2010.

 

The DVR SubVI case merely handles the DVR read/write in a subVI inside the For Loop, like the normal DVR case, just packaged. The Desktop Execution Trace Toolkit shows that it closes the same handle that the DVR opens in the top-level VI. I'm willing to bet that the ~250 ms difference I see between reading the DVR a million times and reading it once is the overhead of its mutex behavior.

 

The most interesting thing is that my relative times are quite different from DFGray's: the AE and DVR are roughly the same, with the SEQ lagging a bit.

-Barrett
CLD
Message 12 of 18

If you wrap the DVR access around the loop instead of doing it in the loop, you will probably see a much faster response. The reason is that every time it is called, the DVR acquires a global mutex (a semaphore, if you will) to prevent anything else from accessing the data. This is a performance cost, and your example shows it.
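As a rough illustration of why hoisting the lock matters, here is a small Python sketch in which an explicit lock stands in for the DVR's internal mutex (the iteration count and data are arbitrary; this is only an analogy, not how LabVIEW implements it):

import threading
import time

lock = threading.Lock()
data = list(range(1_000_000))

# Acquire the lock on every access (like reading the DVR inside the loop).
start = time.perf_counter()
total = 0
for x in data:
    with lock:                  # mutex acquired and released once per iteration
        total += x
print("lock per access :", time.perf_counter() - start, "s")

# Acquire the lock once around the whole loop (DVR wrapped around the loop).
start = time.perf_counter()
total = 0
with lock:                      # mutex acquired and released once in total
    for x in data:
        total += x
print("lock around loop:", time.perf_counter() - start, "s")

Even with an uncontended lock, the per-access version pays the acquire/release cost a million times, which is the same shape of overhead the benchmark numbers above show.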

 

The buffer viewer is conservative. The buffer it shows coming out of the DVR dereference terminal is statically allocated and is an empty data structure used only under error conditions (e.g. when the DVR input is invalid). The gold standard is still your OS memory monitor (Task Manager on Windows): dynamically watch LabVIEW's memory use as you single-step through the code. Doing this confirms that the buffer dot does not correspond to a real copy of the data.

Message 13 of 18

 


blawson wrote:
Reference = 164ms (baseline)
Action Engine = 394ms
Single-Element Queue = 742ms
Data Value Reference = 446ms
Data Value Reference outside loop = 199ms
Data Value Reference outside loop, normal index = 174ms
DVR SubVI = 629ms
Basic = 181ms

 ...

The most interesting thing is that my relative times are quite different from DFGray's: the AE and DVR are roughly the same, with the SEQ lagging a bit.



 

I saw similar results, but the action engine consistently outperforms the DVR. Then, if I wrap the DVR and SEQ accesses into subVIs and in-line them all, the AE is another 2x faster.

 

Test VI DataValueReferenceDemo-1.vi in LabVIEW 2010, as downloaded:

 

Reference = 120ms

Action Engine = 228ms

Single-Element Queue = 469ms

Data Value Reference = 318ms

Data Value Reference outside loop = 106ms

Data Value Reference outside loop, normal index = 106ms

Basic = 106ms

 

 

Starred cases have subVIs running with the new in-lining feature:

 

Reference = 114ms

* Action Engine = 95ms

* Single-Element Queue = 443ms

* Data Value Reference = 293ms

Data Value Reference outside loop = 102ms

Data Value Reference outside loop, normal index = 101ms

Basic = 99ms

 

 

Not sure why the AE is even faster than "basic", but it is!

 

 


@Zing wrote:

 

Is there a way to index an array directly from the data value reference?


I don't think so, but it sure would be nice.  DVRs do seem to be a great way to protect your data, but they are not the end-all solution for huge datasets, especially arrays of clusters.

 

Message 14 of 18

^ Remember, though, that the AE adds a memory buffer. This is a show-stopper if you're pushing around a lot of data.

 

I didn't really play around with execution flavors and inlining. Does an inlined FG not add a memory buffer? Is inlining the FG in this example at all relevant to real-world performance?

 

I also noticed that occasionally the AE would take about 4x as long, repeatably, until I restarted LabVIEW. I believe the main thing that affects the speed of the AE is how fast your system can make the buffer copy. If you've maxed out your free RAM, it will take more time than if you have plenty of room, and this will also vary significantly from system to system. I don't think the other methods depend as much on RAM performance, which could explain why you, I, and Damien all see different relative times.

-Barrett
CLD
Message 15 of 18

 


@blawson wrote:

^ Remember, though, that the AE adds a memory buffer. This is a show-stopper if you're pushing around a lot of data.

 

I didn't really play around with execution flavors and inlining. Does an inlined FG not add a memory buffer? Is inlining the FG in this example at all relevant to real-world performance?

 

I also noticed that occasionally the AE would take about 4x as long, repeatably, until I restarted LabVIEW. I believe the main thing that affects the speed of the AE is how fast your system can make the buffer copy. If you've maxed out your free RAM, it will take more time than if you have plenty of room, and this will also vary significantly from system to system. I don't think the other methods depend as much on RAM performance, which could explain why you, I, and Damien all see different relative times.


 

A memory buffer does not equal a data copy. I don't think my version could have the AE executing so quickly if it were making a data copy.

 

If the diagram code is only reading the data, which is common for large datasets, then the LabVIEW compiler can often determine that the data will not be modified, and a copy will not be made. However, you may still see a buffer allocation on the diagram. That's based on my limited understanding, since the documentation is pretty sparse about buffer allocations versus data copies. (You may want to read the vaguely related topic http://lavag.org/topic/7307-another-reason-why-copy-dots-is-a-bad-name-for-buffer-allocations).
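The buffer-versus-copy distinction is easy to demonstrate in other environments, too. As a loose analogy (Python/NumPy, nothing to do with how LabVIEW implements it), a new array object does not necessarily mean the underlying data was duplicated:

import numpy as np

big = np.zeros(1_000_000)        # about 8 MB of underlying data

view = big[10:20]                # new array object that shares big's data
copy = big[10:20].copy()         # new array object AND a new data block

print(view.base is big)          # True  -> no data copy was made
print(copy.base is big)          # False -> the data really was duplicated

view[0] = 42.0                   # writing through the view proves the data is shared
print(big[10])                   # 42.0

That is the same spirit as the point above: a buffer shown on the diagram is not proof that the megabytes of data were copied at run time.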

 

I would agree that in-lining shouldn't have any effect on memory buffers, unless by in-lining the code the compiler is able to see that a copy is unnecessary, whereas it might not be able to tell if the array goes into a subVI. That all depends on how much the compiler can optimize across subVI calls.

Message 16 of 18

I think you hit the nail on the head there.

 

From watching the memory performance on my machine, I'm quite certain that the AE creates a data copy (but only one). Then again, it could be the way I coded it, or the fact that I was using 2009 (and not 2010's compiler).

-Barrett
CLD
Message 17 of 18

 


@jdunham wrote:

 


@blawson wrote:

^ Remember, though, that the AE adds a memory buffer. This is a show-stopper if you're pushing around a lot of data.

 

I didn't really play around with execution flavors and inlining. Does an inlined FG not add a memory buffer? Is inlining the FG in this example at all relevant to real-world performance?

 

I also noticed that occasionally the AE would take about 4x as long, repeatably, until I restarted LabVIEW. I believe the main thing that affects the speed of the AE is how fast your system can make the buffer copy. If you've maxed out your free RAM, it will take more time than if you have plenty of room, and this will also vary significantly from system to system. I don't think the other methods depend as much on RAM performance, which could explain why you, I, and Damien all see different relative times.


 

A memory buffer does not equal a data copy. I don't think my version could have the AE executing so quickly if it were making a data copy.

 

If the diagram code is only reading the data, which is common for large datasets, then the LabVIEW compiler can often determine that the data will not be modified, and a copy will not be made. However, you may still see a buffer allocation on the diagram. That's based on my limited understanding, since the documentation is pretty sparse about buffer allocations versus data copies. (You may want to read the vaguely related topic http://lavag.org/topic/7307-another-reason-why-copy-dots-is-a-bad-name-for-buffer-allocations).

 

I would agree that in-lining shouldn't have any effect on memory buffers, unless by in-lining the code the compiler is able to see that a copy is unnecessary, whereas it might not be able to tell if the array goes into a subVI. That all depends on how much the compiler can optimize across subVI calls.


To my understanding, in-lining doesn't affect the total memory buffered for an object; however, it does affect how much memory is buffered at any given time.

 

So if (for example) an array has 500 elements (each of which is an I8), it would only have to buffer enough memory for a single I8 rather than 500 × I8 for an entire copy of that array. This could be very useful when manipulating large amounts of data.
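A small Python sketch of the contrast described here (just an illustration of whole-array buffering versus element-at-a-time access, mirroring the hypothetical 500 x I8 array; it is not a claim about what LabVIEW's in-liner actually generates):

# A 500-element array standing in for the 500 x I8 example above.
data = list(range(500))

# Whole-array style: the consumer works on a complete copy of the array.
def sum_with_copy(arr):
    local = list(arr)       # buffers all 500 elements at once
    return sum(local)

# Element-at-a-time style: only one element needs to be held at any moment.
def sum_streaming(arr):
    total = 0
    for x in arr:           # a single element is "in flight" per iteration
        total += x
    return total

print(sum_with_copy(data), sum_streaming(data))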

Cory K
Message 18 of 18