LabVIEW


"Building" large arrays using functional global

I have an application which analyzes data and returns a "results" cluster at 10-100 Hz. This program could run from minutes to days, and I'd like to have access to all of the results clusters, meaning that if I store them in an array it could have something on the order of a million elements.

I've written a functional global subvi that preallocates an array for the results and resizes it when the array becomes full (doubling the size each time). This avoids both allocating a huge array up front and calling "build array" too often. It works, but still doesn't seem that fast to me. In my actual application I have clusters with on the order of 100 elements, and even when my array is only a couple thousand elements long the profiler shows this vi taking a few milliseconds per call. I've attached a sample version of the subvi and its block diagram below. Are there things I can do to make this faster? Thanks,

Paul



Message Edited by Phamton on 01-08-2008 01:36 PM
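For readers without LabVIEW handy, the preallocate-and-double strategy Paul describes can be sketched in Python. All names here are illustrative, not taken from the attached VI:

```python
# Illustrative Python sketch of the preallocate-and-double strategy
# (names are made up; this is not the attached VI).
class GrowableBuffer:
    """Preallocated buffer that doubles its capacity when full, so
    appends are amortized O(1) instead of reallocating every write."""

    def __init__(self, initial_capacity=1024):
        self._data = [None] * initial_capacity  # preallocated storage
        self._count = 0                         # elements actually in use

    def append(self, item):
        if self._count == len(self._data):
            # Full: double the capacity in one reallocation, like
            # resizing the array inside the functional global.
            self._data.extend([None] * len(self._data))
        self._data[self._count] = item
        self._count += 1

    def snapshot(self):
        # Return only the valid portion (Array Subset equivalent)
        return self._data[:self._count]

buf = GrowableBuffer(initial_capacity=4)
for i in range(10):
    buf.append(i)
print(buf.snapshot())   # [0, 1, 2, ..., 9]
```

Doubling means the cost of the occasional reallocation is spread over all the appends that preceded it, which is why the fixed cost per call Paul sees is more likely in the copy-out than in the growth strategy itself.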
Message 1 of 16
Hello Paul, one easy mod is to move all the constants in your diagram that sit inside the while loop to outside the loop, and of course wire them back to the necessary inputs inside the loop. I think that's a speed-up hint I recall from past discussions.
 
Good Luck - Brian
Message 2 of 16
First of all, nice work on the functional global.

You might try separating your Get and Set operations. Currently, every time you Set an item into storage, you also Get a copy of the whole storage. I'd guess it's this extra copy that's hurting performance.

If you are setting multiple packets, but only getting the results for analysis every once in a while, this should have a big effect on performance. Check out the two pics below. Notice that in the Set case, you're only outputting an empty array, and not making a complete copy of your internal data.
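In text form, the separation Jarrod describes looks roughly like this. This is a Python sketch of the pattern, not the actual LabVIEW diagram; the "set"/"get" enum and storage names are assumptions:

```python
# Python stand-in for a functional global with separate Set and Get
# cases. Set never copies the stored data out; only Get pays for it.
_storage = []   # stands in for the uninitialized shift register

def results_fg(op, item=None):
    """op is 'set' or 'get' (the enum on the LabVIEW diagram)."""
    if op == "set":
        _storage.append(item)
        return []              # Set outputs only an empty array: no big copy
    elif op == "get":
        return list(_storage)  # the full copy is made only on Get
    raise ValueError(op)

results_fg("set", {"amp": 1.0})
results_fg("set", {"amp": 2.0})
print(results_fg("get"))   # [{'amp': 1.0}, {'amp': 2.0}]
```

The key point is that the copy cost is now paid once per Get instead of once per Set, which matters when Sets happen at 10-100 Hz and Gets happen only occasionally.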






Message Edited by Jarrod S. on 01-08-2008 02:02 PM
Jarrod S.
National Instruments
Message 3 of 16
Hi Brian,
  Thanks for the suggestion. In my actual code I use typedef'd clusters, and instead of having cluster constants I have a subvi that returns the default cluster (so I don't get gigantic cluster constants on my block diagram each time I modify the typedef). Since this default-cluster subvi is only called when I resize my array, I didn't want it outside the loop where it would be called every iteration. Plus, I can see in the profiler that my default-cluster subvi is taking no time, so I don't think this is a problem.
Message 4 of 16
Jarrod,
  Ah!!  Originally I had done this but couldn't get around the fact that I had to have an output for all cases.  I figured I might as well output the whole array for the "set" case which led me to get rid of separate "get" and "set" cases.  I'll try your version.  Thanks!
Message 5 of 16
The cluster is all numeric. Would a 2-D array be faster than an array of clusters? I have not tried it, but it seems that the overhead of a 2-D array of numerics might be less than that of an array of clusters of numerics.

At 100 Hz I calculate that you will have ~51 million DBL values per day. At 8 bytes per DBL, that is almost 415 MB of data, in addition to what the OS and LV need to run. Regardless of whether you do the allocation yourself or let LV do it for you, you will begin to see problems as you near the end of the first day or early into the second. Make sure you are saving it to disk so that you do not lose data if memory gets full.
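Lynn's figures work out if each results cluster holds about six DBL values, which is an assumption inferred from his totals, not something stated in the thread:

```python
# Reproducing Lynn's back-of-envelope numbers. The 6 DBLs per
# cluster is an assumption inferred from his totals, not stated
# anywhere in the thread.
rate_hz = 100
seconds_per_day = 86_400
dbls_per_cluster = 6       # assumed
bytes_per_dbl = 8

values_per_day = rate_hz * seconds_per_day * dbls_per_cluster
megabytes = values_per_day * bytes_per_dbl / 1e6
print(values_per_day)  # 51_840_000 -> "~51 million DBL values per day"
print(megabytes)       # 414.72     -> "almost 415 MB"
```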

You did not indicate what kind of analysis you are doing on the data, but given the size of your data set, it is almost certain that you will need to analyze it in smaller sections. Perhaps you can do that on the fly and only keep the results of the analysis (hopefully a much smaller amount of data) in memory.

Lynn
Message 6 of 16
Lynn,
  Thanks for your comments. I'd like to avoid going to a 2-D array if possible; in my actual program I have a typedef'd cluster with a couple of different types of data. I am aware of the memory constraints; in practice, for longer runs my acquisition rate will be closer to 10 Hz. For now I'm just working out the kinks with thousands of elements; if I can get to the point where memory constraints are my issue, I'd be satisfied! I am streaming the results to disk in parallel as well.
  The results cluster I mention comes from fitting the data to a model (on the fly), which currently takes ~5 ms per acquisition using externally called C code. At the very minimum I want to be able to display a chart showing the history of 10 or so fit parameters, their uncertainties, and a few other parameters over the course of a run, so I can't really condense the data much further unless I read the history on demand from disk.
Message 7 of 16
Another option you might wish to examine is using a queue, since the memory management might be better even though it's dynamic, but I'm not sure if that will help. You can preview the queue to get the data. You can also do a partial combination, e.g. use a queue and then empty it into the global after a while.

The point Brian brought up is partially accurate - what he was thinking of was controls. If the value of a control cannot change, reading it once is more efficient. Additionally, in some cases, if a control is used as an input or an output, LabVIEW can use a pointer if the control or indicator is at the root of the diagram. This is useful for large arrays.


___________________
Try to take over the world!
Message 8 of 16

Couple of thoughts...

When I last analyzed the time involved in building an array (LV 5.1?), I came to the conclusion that LV started out with about 1K by default. As long as the buffer stayed below 1K, I did not see delays when building. If I continued to build, I saw another hit at 2K, 4K, etc. So... I think all of your work is just duplicating what a build array in a loop does.

Another approach (remember this is coming from Ben, Mr AE) is to just use multiple queues: one for the GUI, one for logging, one for the analysis. The GUI and logging queues you keep reading as normal. The analysis queue just gets read once at analysis time.

Another thought...

How about using the in-place operations? They let you work exclusively in-place.

Yet another thought....

How about pre-allocating 5X what you need? Then just take the subset when it comes time to read.

Final idea:

Queues will often outperform an AE since they can work in-place.

Ben

Retired Senior Automation Systems Architect with Data Science Automation | LabVIEW Champion | Knight of NI
Message 9 of 16
First of all: Jarrod's suggestion to separate the "get" and "set" functions helped enormously. Now, instead of 5-6 ms per call when I have thousands of elements, it's always <0.1 ms. So the extra copy was killing me.

However, the comments from tst and Ben now have me intrigued. First, a couple of questions/comments for Ben:

"I think all of your work is just duplicating what the build array in a loop does."
    I've seen numerous warnings on the forum about using Build Array in a loop. If LabVIEW is being somewhat smart and doubling the allocation as the array grows, why is Build Array so bad? Are even the few reallocations that time-consuming?

"How about using the in-place operations"
    What do you mean by in-place operations?

"How about pre-allocating 5X what you need?"
    I'm trying to avoid this since the length of time the program runs varies greatly and isn't necessarily known.  If I'm running for a few minutes it seems a waste to allocate enough for a few days.  I imagine for long runs, I'll be closing every other program I can to free up memory, whereas for short runs, I might not want to have to do this.
  

tst and Ben -
 
I actually am using queues throughout my program, although I'd only pictured them as pipelines to move data from one place to another rather than as dynamic storage. A version of the vi I posted above is actually placed in a loop that polls a queue coming from an analysis vi. If I do just use the queue instead, is there a way to treat it like an array? For example, the simplest thing I'll want to do is select a parameter in the cluster (during the run) and display its values in a chart. Is there a fast way to look at every value in the queue without dequeuing?
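In LabVIEW the usual answer here is Get Queue Status with its "return elements?" input wired True, which hands back a copy of every element still in the queue without dequeuing anything. Note that this makes a full copy, so it costs about the same as the Get case of the functional global. A Python sketch of the idea, with illustrative cluster field names (not Paul's actual typedef):

```python
from collections import deque

# Python stand-in for previewing a queue: peek at every element
# without removing any. The field names are illustrative.
results_queue = deque()
for i in range(5):
    results_queue.append({"amplitude": float(i), "sigma": 0.1 * i})

# Pull one fit parameter out of every queued cluster for charting
amplitudes = [cluster["amplitude"] for cluster in results_queue]
print(amplitudes)          # [0.0, 1.0, 2.0, 3.0, 4.0]
print(len(results_queue))  # still 5 -- nothing was dequeued
```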
Message 10 of 16