LabVIEW

memory manager optimization

I'm carrying a pre-allocated U8 buffer array on a shift register, dumping the 'active' portion of it to disk when it is getting 'close to full'.

 

My overall goal is to reduce calls to the memory manager as much as possible for this particular code.  I insert data into the buffer array using in-place "array split / replace sub-arrays" which seems like a good idea(?), but when I go to get the 'data' portion of the buffer, am I better off using the in-place to get the sub-array (the size of which is close to the max size of the buffer array itself), or is it better to use 'Array Subset'?
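
If it helps to see the pattern in text form, here is a rough C analogy of what I'm trying to preserve (a hypothetical sketch with made-up names, not my actual code): the buffer is allocated once up front, and the whole question is whether flushing the 'active' portion needs a second allocation or not.

/* Hypothetical C analogy of the pre-allocated buffer pattern above. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE  (1u << 20)            /* allocated once, 1 MiB            */
#define FLUSH_AT  (BUF_SIZE - 4096u)    /* "close to full" threshold        */

static unsigned char buf[BUF_SIZE];     /* lives for the whole run          */
static size_t        used = 0;          /* size of the 'active' portion     */

/* Option A: hand the file layer a view into the existing buffer (no copy). */
static void flush_in_place(FILE *f)
{
    fwrite(buf, 1, used, f);
    used = 0;
}

/* Option B: copy the active portion out first (extra allocation + copy),
   which is what I suspect the Array Subset branch may be doing. */
static void flush_with_copy(FILE *f)
{
    unsigned char *tmp = malloc(used);  /* memory manager call              */
    if (!tmp) return;
    memcpy(tmp, buf, used);
    fwrite(tmp, 1, used, f);
    free(tmp);                          /* and another one                  */
    used = 0;
}

static void append(const unsigned char *data, size_t n, FILE *f)
{
    if (used + n > FLUSH_AT)
        flush_in_place(f);              /* or flush_with_copy(f)            */
    memcpy(buf + used, data, n);        /* in-place "replace subset"        */
    used += n;
}

int main(void)
{
    FILE *f = fopen("dump.bin", "wb");
    if (!f) return 1;
    unsigned char chunk[256] = {0};
    for (int i = 0; i < 10000; i++)
        append(chunk, sizeof chunk, f);
    flush_in_place(f);
    fclose(f);
    return 0;
}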

 

My thinking is that the 'in-place' might be better since I'm not creating a copy of the main array, while if I wire the main array to 'Array Subset' I have a branch/split on that wire, which creates a (new) copy of the whole array (except the Show Buffer Allocations tool seems to indicate I don't get one on the input, but I do on the output... sort of the opposite of the in-place structure)?

 

On the other hand, the in-place structure shows two buffer allocations whose total size equals the size of the buffer array, plus another allocation on the Write to Binary File VI anyway...

 

I always have a hard time trying to 'reason' or 'think' myself to the ideal solution in this type of problem. 😕  Pointers and insight would be appreciated.  Again, I'm trying to reduce dynamic memory alloc/dealloc as much as possible, not necessarily looking to maximize (CPU) performance.

in place or not.png

 

QFang
-------------
CLD LabVIEW 7.1 to 2016
0 Kudos
Message 1 of 50
(4,044 Views)

Unfortunately, I think somebody from NI will have to chime in here to give a definite answer.

 

But based on my limited understanding, Array Subset will not make a copy of the entire array, only of the subset you are getting.  Supposedly, the compiler is smart enough to schedule the read functions on an array before the write functions when they are on the same wire (branches count).  If there are multiple write functions, then a copy will be made for the second write function.
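
As a rough C analogy of that point (hypothetical code, and glossing over how the LabVIEW compiler actually schedules things), the cost of the read is sized by the subset you ask for, not by the source array:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* The returned buffer is sized by the requested subset, not by the
   whole source array; the caller frees it. */
static unsigned char *array_subset(const unsigned char *arr, size_t start, size_t len)
{
    unsigned char *sub = malloc(len);
    if (sub)
        memcpy(sub, arr + start, len);
    return sub;
}

int main(void)
{
    static unsigned char big[1000000];                 /* the "whole array"    */
    unsigned char *view = array_subset(big, 0, 4096);  /* only 4 KiB allocated */
    if (view) {
        printf("copied %zu bytes\n", (size_t)4096);
        free(view);
    }
    return 0;
}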

 

I'm not completely sold on the IPE helping here, based on what I just stated.


GCentral
There are only two ways to tell somebody thanks: Kudos and Marked Solutions
Unofficial Forum Rules and Guidelines
"Not that we are sufficient in ourselves to claim anything as coming from us, but our sufficiency is from God" - 2 Corinthians 3:5
0 Kudos
Message 2 of 50
(4,032 Views)

I always use a construct like your first diagram, the in-place structure.

 

Cheers,

mcduff

 

 

0 Kudos
Message 3 of 50
(4,027 Views)

Good question.  I don't know the answer, and even if I said I did, why should you believe me (and who says that this would be true on your machine and with your version of LabVIEW)?

 

Besides, I'm a Scientist, so my answer would be "Do the Experiment".  Write a little test routine, put timing code around it, and see how long it takes to do a reasonable number of writes similar to what you'd want to do with your code.  Do it with the In Place method, then with the Subset method, and compare.
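
If it helps to see the shape of such a test, here is a bare-bones skeleton of the harness as a hypothetical C sketch rather than the actual VI (in LabVIEW you would bracket each variant with timing reads, e.g. Tick Count (ms), instead):

#include <stdio.h>
#include <time.h>

#define ITERATIONS 100000

int main(void)
{
    clock_t t0 = clock();
    for (int i = 0; i < ITERATIONS; i++) {
        /* variant A goes here: e.g. flush via an in-place view */
    }
    clock_t t1 = clock();
    for (int i = 0; i < ITERATIONS; i++) {
        /* variant B goes here: e.g. flush via a copied subset */
    }
    clock_t t2 = clock();

    printf("variant A: %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("variant B: %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
    return 0;
}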

 

Several years ago, Darren Nattinger (who I believe still holds the title of The World's Fastest LabVIEW Coder) gave a very interesting talk at NI Week asking "Which way is faster?"  The audience voted among three alternatives, and in most cases the correct answer (which he demonstrated by running the code with timers surrounding it) was chosen by the smallest number of people.

 

You have to "do the experiment" ...

 

Bob Schor

 

P.S. -- when you've done it, let us know the results!

0 Kudos
Message 4 of 50
(4,026 Views)

I would like to chime in if you are testing for memory.

 

Use the Windows Task Manager to watch the memory while you do the two tests. Make an array of 1 million or so elements of the data type you have; with 1 million points it should be easier to tell the memory usage. Check the Task Manager to see if any copies of the data are made. I did this in the past (I do not remember which LabVIEW version); the IPE was better for memory, not sure about speed. I do not know if newer versions have changed so that case 2 uses the same amount of memory as case 1.
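
A hypothetical C sketch of the same idea (made-up sizes, not LabVIEW code): each allocation of the million-element array shows up as a visible step in Task Manager, so an unexpected copy is easy to spot.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define N (1000u * 1000u)                   /* ~1 million elements            */

int main(void)
{
    double *a = malloc(N * sizeof *a);      /* first ~8 MB step               */
    if (!a) return 1;
    memset(a, 1, N * sizeof *a);            /* touch the pages so they commit */
    puts("array allocated - check Task Manager, then press Enter");
    getchar();

    double *b = malloc(N * sizeof *b);      /* a second ~8 MB step = a copy   */
    if (!b) { free(a); return 1; }
    memcpy(b, a, N * sizeof *a);
    puts("copy made - check Task Manager again, then press Enter");
    getchar();

    free(b);
    free(a);
    return 0;
}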

 

Cheers,

mcduff

 

 

0 Kudos
Message 5 of 50
(4,012 Views)

McDuff is clearly another Scientist, who says "Do the Experiment" ...  Lay On, McDuff ...

 

BS

0 Kudos
Message 6 of 50
(4,006 Views)

Oh, I run 'experiments' all day long, trust me... My issue here is that I am not aware of a good way to run a MEMORY test on an RT target.  I already know that an IPE often (but not always) takes LONGER to run than not using it... But I am at a loss for how to test memory allocation/deallocation (on an RT system).

 

I'm actually the opposite of mcduff when it comes to usage of IPEs.  A year ago or so, I started making extensive use of IPEs, until I found that they really ate up a lot of CPU, at least in the cases and in the manner I tried to use them back then, so my general take is to never use them unless there is a very good reason/chance that they will actually be beneficial.

 

Even on a Windows platform, I'm not quite sure how I would get detailed information.  Keep in mind, memory usage isn't the exact metric I'm looking for... I'm looking for whether the memory is re-used without calls to the memory handler (on an RT target) or not.  Blergh... I feel like I'm doing a poor job explaining what I mean. Sorry, guys.
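
If it were text code, the thing I want to measure would look something like this hypothetical C sketch (a counting wrapper around the allocator, nothing LabVIEW- or RT-specific): the interesting number is how many times the allocator gets invoked, not how big the process is.

#include <stdio.h>
#include <stdlib.h>

static unsigned long alloc_calls = 0;

/* Wrapper so every trip through the allocator gets counted. */
static void *counted_malloc(size_t n)
{
    alloc_calls++;
    return malloc(n);
}

#define CAP (1u << 20)

int main(void)
{
    unsigned char *buf = counted_malloc(CAP);       /* allocate once up front */
    if (!buf) return 1;

    for (unsigned i = 0; i < 10u * CAP; i++)
        buf[i % CAP] = (unsigned char)i;            /* re-use the same block  */

    printf("allocator calls: %lu\n", alloc_calls);  /* ideally stays at 1     */
    free(buf);
    return 0;
}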

QFang
-------------
CLD LabVIEW 7.1 to 2016
0 Kudos
Message 7 of 50
(3,988 Views)

According to the LabVIEW 2014 Profile Performance and Memory tool ...

 

This Case 1 snippet is the best

Case1.png

 

Strangely, without the DVR it is worse than the Array Subset; snippet below.

 

Untitled4.png

 

I think you need to test in your application, i.e., in a subVI with an array in and an array out; if everything is inlined, then no copies should be made.

 

Cheers,

mcduff

 

0 Kudos
Message 8 of 50
(3,985 Views)

By the way, the reason I'm taking such a keen interest here is that I'm reworking a 'system log engine' that runs on my RT targets and logs messages, along with time and system memory/CPU statistics, to a file from time to time.  The old version of the engine carried a string on a shift register that it kept bundling to until it hit a certain threshold and the string was written to file... this would create a (noticeable) 'ramp' effect over time on memory usage on the RT targets.

Now, I'm also chasing and testing for memory leaks on these targets as they go into the field and are expected to operate 24/7 for hundreds of days.  The task of positively identifying the presence or absence of a memory leak is MUCH simpler if my code overall is 'quiet' and 'consistent' in its memory usage.  As such, most of my RT code makes use of pre-allocated arrays and other techniques to reduce and prevent memory allocations and de-allocations.  This makes even smaller leaks (4-byte references, anyone?) stand out much sooner in my testing.  This system log engine is one of the largest/last sources of 'noise' in my application's memory usage. 🙂
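
As a hypothetical C analogy of that difference (made-up names, not the actual engine): the old grow-on-append style calls the memory manager for every message, while the pre-allocated style touches it once at startup and then stays flat.

#include <stdlib.h>
#include <string.h>

/* Old style: grow the log string on every message, one allocator call
   per append, so process memory ramps up between file writes. */
static char *append_grow(char *log, size_t *len, const char *msg)
{
    size_t n = strlen(msg);
    char *bigger = realloc(log, *len + n + 1);
    if (!bigger) return log;
    memcpy(bigger + *len, msg, n + 1);
    *len += n;
    return bigger;
}

/* New style: append into a buffer allocated once; memory stays flat. */
#define LOG_CAP 65536
static char   log_buf[LOG_CAP];
static size_t log_len = 0;

static void append_fixed(const char *msg)
{
    size_t n = strlen(msg);
    if (log_len + n >= LOG_CAP)
        log_len = 0;                    /* real engine: write to file first */
    memcpy(log_buf + log_len, msg, n);
    log_len += n;
}

int main(void)
{
    char  *log = NULL;
    size_t len = 0;
    for (int i = 0; i < 1000; i++) {
        log = append_grow(log, &len, "message\n");  /* memory ramps      */
        append_fixed("message\n");                  /* memory stays flat */
    }
    free(log);
    return 0;
}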

 

Also, it is already better than the old version of this system logging loop... now I'm just going full-on OCD to make it as good as I can while I'm in there tinkering with it.  Once done, it will be published on our VIPM repository, so I want it as 'good as I can get it'. 🙂

QFang
-------------
CLD LabVIEW 7.1 to 2016
0 Kudos
Message 9 of 50
(3,973 Views)

mcduff, could you snag a screenshot of the Profile Performance and Memory result for Case 1 and Case 2?  What metric do you use to say Case 1 is 'best'?  (Sorry if that sounds like a silly question; I'm just curious how you made that determination of 'best'.)

QFang
-------------
CLD LabVIEW 7.1 to 2016
0 Kudos
Message 10 of 50
(3,969 Views)