LabVIEW Idea Exchange

cancel
Showing results for 
Search instead for 
Did you mean: 
Lavezza

Set initial memory allocation for arrays

Status: New

I looked for a similar idea, but I didn't' find anything. I find that hard to believe, so I may have missed it somehow.

 

LabVIEW has arrays the dynamically grow. This is great. Love it. It makes it very easy to do things like Loop A below.

 

But that can lead to performance issues as LabVIEW constantly has to allocate new memory as the array grows. However, and this is important, LV is smart enough to over allocate so that not every append results in new memory allocation. Somewhere in the LV code NI is keeping track of the size allocated and the size used.

 

If performance is a concern, we've always been told to do something like Loop B. Preallocate a big array and use Replace. But there are problems here. First, a lot of the built in array functions will return 'wrong' values (Array Size, etc.). Second, what if the initial size is just a best guess. If we need something larger (say 5050 elements), we have to have a special case to Append instead of Replace once we hit the initial allocation. BUT, LV is already doing all this internally, it just isn't exposed to us in a way that we can take advantage of.

 

So, I propose Loop C. (sorry, too lazy to create a good icon). This new primitive would return an empty array (like Loop A), but with memory already allocated for a much larger array (like Loop B). LabVIEW already has the code to handle this. So, if we know the array will grow to ~5000 we can preallocate that amount, have all the built in functions work and not have to worry about going over 5000.

Allocate Array.PNG

10 Comments
Knight of NI

Actually, using NaN will solve the issue you're showing in your screenshot:

 

Example_VI_BD.png

 

Obviously, this will only work with floats. For strings you could use an empty string as the "marker". For integers you would need to determine the best approach based on likely integer values. For example, using the max integer value could be one way. For the Array Min & Max you'd need to have it look at only the array that has been filled in so far.

 

As for the automatic grow, I'm not sure I understand what you're proposing.

Lavezza
Active Participant

NaN will still leave you with built-ins that don't work (like Add Array Elements). Arrays of Booleans won't work (no way to determine size, And Array Elements, etc.). You're empty string marker only works if empty string is not a valid value for your string array.

 

Basically you are proposing work-arounds to having the shift register in Loop B. What I am saying is that LabVIEW already has the code to handle this in a very elegant way, it just isn't exposed to use at the moment.

 

Here's an example (numbers totally made up, don't know the algorithm LV uses to determine initial size and size on new allocations)

 

New Empty Array
Allocation Size   Elements Used
2                       0
2                       1
2                       2
5                       3   <---- New memory allocation (slow). LV will allocate more than is needed to push off more allocations for a while
5                       4
5                       5
10                     6   <---- New memory allocation (slow). Oops, we're going to grow to ~5000. LV doesn't know this
10                     7
...


My Proposal
New Empty Array (initial allocation = 5000)
Allocation Size   Elements Used
5000                  0
5000                  1
... (no slow allocations for a while, if ever)

 

Mark_Yedinak
Trusted Enthusiast

If you are replacing elements every iteration it is an easy task to truncate the array to the size that is actually used. Also, the preallocation should be for a size that is over the amount of data you expect.Generally you should have a pretty good idea of what the maximum amount of data you will get is.



Mark Yedinak
Certified LabVIEW Architect
LabVIEW Champion

"Does anyone know where the love of God goes when the waves turn the minutes to hours?"
Wreck of the Edmund Fitzgerald - Gordon Lightfoot
Lavezza
Active Participant

Mark,

Truncating may be ok for the simple examples above. But in my application there are arrays that grow as new data devices are added to the system. It isn't a one-time build up. New elements can be added days after a program starts running. Since it is never 'finished', there is never a good time to truncate the array. And I agree we should allocate more than we need, but I don't want the program to be fragile to unforseen configs. If we normally have 4000 channels of data it would make sense to start with 5000. But if someone decides to start up a bunch of new devices and the channel count jumps to 7000, I want to program to be able to handle that (although there might be a performace hit).

Broken_Arrow
Active Participant

I like performance improvement ideas like this, and why not?

For a similar-but-different idea, AND maybe some inspiration for an Icon for this Idea, see my unpopular Idea here.

Richard






AristosQueue (NI)
NI Employee (retired)

This is one of those times where I see what your problem is and I agree LV could do better, but I'm not liking the particular solution you propose. Here's a completely different possible solution to the "how to preallocate the array" problem...

 

Instead of a new Array primitive, we could instead put a property on the While Loop to represent "probable minimum iteration count". We can find a better name than PMIC later if you like this idea. The PMIC would give LabVIEW a way to predict the size of arrays in shift registers, like the one shown in your Image A, and also arrays built using AutoIndexing tunnels. It would mean that you could write the code in Image A, which is the easy and obvious way to write that code, and then, if and only if you found that this particular loop was a performance bottleneck, you could set the PMIC. You get the advantage of the preallocation that you show in Image B but with minimal rewrite of your code.

 

Thoughts?

Lavezza
Active Participant

AQ,

 

Could you extend that to feedback nodes as well?

 

Here is a different example. We would use code like this in a LV2-style global. We have a string that acts as a key and a cluster of info for that key. We need fast lookups (1000's of keys per second), so we use variant attributes to hold the keys. The attribute value is the index of the info in an array of info. (Why don't we just use the info as the attribute data? Because the Get Attribute primitive takes a nosedive in speed if the data type is complex. Much faster to keep it a simple I32 and have the cluster data in a separate array).

 

We could pre-allocate the info array and have another feedback node to hold the 'next index' (in fact, we sometimes do). It would just be really nice if we didn't have to. Plus it complicates the code if we go past our predicited ceiling.

 

(This isn't real code. Our real code has error handling, etc. Plus many more 'op' states)

Variant.PNG

dthor
Active Participant

I like AQ's idea. But I have a question regarding it: would the PMIC would be set via a right-click on the loop (like changing the size of Array to Cluster) or would it be a new terminal, similar to the Loop Count terminal of a For Loop? In essence, I'm asking if the PMIC would be able to be set programmatically.

Lavezza
Active Participant

I don't think there should be one PMIC per loop (I think that is what dthor is assuming and maybe what AQ was suggesting). I can have loops of with more than one array shift register. One might grow to 20 (e.g. input devices), the other to 10,000 (channels).

 

I'm not a fan of right-click to set things unless there is some indication that things aren't default. Too many 'hidden' settings in LabVIEW already.

Darin.K
Trusted Enthusiast
I see this as a more natural extension of the DVR. You could specify the size when you acquire the reference, and then use all of the normal functions inside the In Place Element Structure. It also keeps the IPES as the primary way to help the compiler on occasion while hoping for the day that it is no longer needed.