"Threshold 1D Array" for descending data?

I need to search a very large data array (~30+ Meg) of descending values to find the index that first dips *below* a threshold.  I know that "Threshold 1D Array" still doesn't support such a mode directly.  I'm gonna be performing this type of search many times, and I hate to burn CPU on the workarounds I know, namely:
 
a) negate the array with the (-x) numeric operator and then perform the search.  Just have to be careful with the fractional part of the resulting index.
 
b) reverse the array and then perform the search.  I'm not really considering this one for such a large array.
 
I'm honestly surprised that "Threshold 1D Array" doesn't support a mode for descending data.  Is there a better option than method (a) above?  Sure seems like there ought to be, but I'm not seeing it...
 
Thanks for any help!
 
-Kevin P.
 
P.S.  Background: searching position data to identify particular location(s) and range(s).  Must then use results to extract correlated subsets of data from many other sensors.  Sometimes motion is positive, letting me use "Threshold..."  Other times motion is negative -- what then?
ALERT! LabVIEW's subscription-only policy came to an end (finally!). Unfortunately, pricing favors the captured and committed over new adopters -- so tread carefully.
Message 1 of 19
(7,391 Views)

Some other ideas (the problem is easy, because your array is sorted, right?):

  1. You could code your own, doing a simple binary search (http://en.wikipedia.org/wiki/Binary_search) or, depending on the shape of the data, something a bit more intelligent. This is probably the most efficient.
  2. You could compare your array with the threshold using "less than", then search the resulting boolean array for the first occurrence of TRUE using "search array".
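Option 1 can be sketched in a few lines. (Python standing in for the block diagram here; the function name is just illustrative.) For descending data, the binary search keeps halving the range until it lands on the first element that dips below the threshold:

```python
def first_below_descending(data, threshold):
    """For a descending sorted array, return the index of the first
    element strictly below threshold, or len(data) if none dips below."""
    lo, hi = 0, len(data)          # search the half-open range [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if data[mid] < threshold:  # crossing is at mid or earlier
            hi = mid
        else:                      # crossing is after mid
            lo = mid + 1
    return lo

print(first_below_descending([9, 7, 5, 3, 1], 4))  # -> 3
```

For a 30M-element array this is ~25 comparisons instead of up to 30 million.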
 
 
Message 2 of 19
(7,384 Views)

Thanks!  I had a nagging feeling there was something I shoulda thought of myself, but, well, I didn't.  With a 30 Meg array size, it should only require about 25 iterations.  Nice.  (Yes, the data is sorted -- not by calling a "sort" function, but implicitly, based on how it was generated.)  I'd guess that the built-in "Threshold 1D Array" function might be implemented as a binary search already.

I know I've tinkered with doing a binary search before, I think based on a specific cluster element or something like that, but probably never made it into a library-ready sub-vi.  Shouldn't be too big a deal though (he said a bit too casually...).

The second method sounds less promising for such a large dataset because:

a) I'd have to allocate a very short-lived 30 Meg Boolean array

b) "Search 1D Array" searches sequentially, right?  I might need several orders of magnitude more iterations.  Probably not a win no matter how fast each one is.
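Spelled out (Python standing in for G, names illustrative), the second method makes both costs visible -- the transient Boolean array and the sequential scan, each O(n):

```python
def first_below_linear(data, threshold):
    """Method 2 from the thread: elementwise compare, then linear search."""
    below = [x < threshold for x in data]   # the short-lived 30-Meg Boolean array
    return below.index(True) if True in below else len(data)
```

Both steps touch every element, which is why it loses to the O(log n) binary search on large arrays no matter how fast each individual comparison is.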

-Kevin P.

Message 3 of 19
(7,374 Views)
You might also check out the Trigger Detection for 1 Channel VI, which basically does what you want. (This VI might only come with the FDS or PDS.) On the palette it's called Basic Level Trigger Detection, in the Waveform Monitoring palette; that's a polymorphic VI covering the multi-channel and single-channel instances.

This VI is somewhat designed for waveforms, but the guts of the algorithm only operate on arrays. You'd have two options to retro-fit it into your application. You could convert your array into a waveform temporarily, or you could make your own copy of this VI that strips out all the waveform parts of it.

One downside is that this VI is probably not optimized for monotonic data the way Threshold 1D Array is.


Message Edited by Jarrod S. on 05-02-2007 01:34 PM

Jarrod S.
National Instruments
Message 4 of 19
(7,361 Views)
Kevin,

I wrote an Improved Threshold Detector.vi several years ago. It generates output arrays showing the indices of the input array where the value in the input array crossed the threshold. Separate arrays for positive and negative transitions. It does not do anything fancy. It just loops through the array and compares to the previous values. When it finds a threshold crossing it replaces a -1 in the index array with the current index. This did not do any fractional index interpolation.
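The structure Lynn describes might look roughly like this (a Python sketch of the described approach, not his actual VI -- one pass, comparing each sample with the previous one and collecting positive- and negative-going crossings separately):

```python
def find_crossings(data, threshold):
    """Single linear pass; returns (rising_indices, falling_indices)."""
    rising, falling = [], []
    for i in range(1, len(data)):
        prev, cur = data[i - 1], data[i]
        if prev < threshold <= cur:       # positive-going transition
            rising.append(i)
        elif prev >= threshold > cur:     # negative-going transition
            falling.append(i)
    return rising, falling
```

Unlike a binary search this handles unsorted, multi-crossing data, at the cost of visiting every element.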

I tested it with arrays of random numbers produced by the "dice" function and a threshold of 0.05. For an array size of 10^6 the average time was 51 ms; for 10^7, 674 ms; and for 3x10^7, 1589 ms. At 10^8 disk swapping came into play and the times were more easily measured with a calendar. Testing in LV 8.2 on a Mac dual G5 at 2.3 GHz with 1 GB of RAM.

If you need interpolation you could use this to find the regions to interpolate and extract small segments around the threshold crossing for further calculations.
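If a fractional index like Threshold 1D Array's is wanted, the interpolation step is a single linear-interpolation division between the two samples bracketing the crossing (illustrative sketch, assumed function name):

```python
def fractional_index(data, i, threshold):
    """Given i, the first index at/past the crossing, interpolate
    linearly between samples i-1 and i to get a fractional index."""
    y0, y1 = data[i - 1], data[i]
    return (i - 1) + (threshold - y0) / (y1 - y0)

print(fractional_index([0.0, 2.0], 1, 1.0))  # -> 0.5
```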

Lynn
Message 5 of 19
(7,354 Views)
A couple of things to be aware of.

1. Reversing the array doesn't necessarily create a copy (I'd guess LabVIEW's subarrays work in a manner similar to iterator adapters in C++ containers). So you could just reverse the array if you're careful to avoid buffer allocations (use the Show Buffer Allocations tool to check whether it's making a copy). In other words, running Threshold 1D Array on an increasing array is just as fast as on a reversed decreasing array (including the call to reverse the array), assuming you don't accidentally slip a buffer allocation in there. For instance, using a case structure to either reverse the array if it's the wrong way or pass it straight through will create a copy. But if your case structure either runs Threshold 1D Array directly, or reverses the array and then runs Threshold 1D Array, it won't need to copy the array.

2. Threshold 1D Array for some reason uses a linear search instead of a binary search (I'm not sure why either, but I tested it and it scales linearly: on 32e6 integers it takes 84 ms, and on 16e6 integers 42 ms; a binary search takes less than 1 ms). So you'll want to use a binary search anyway (if you have duplicate values in your data, be sure to check how your binary search handles them).
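The duplicate-values caution can be seen with Python's stdlib `bisect` module on an ascending array: the two variants land on opposite sides of a run of equal values, and you have to pick the one your application needs.

```python
import bisect

data = [1, 3, 3, 3, 5]
print(bisect.bisect_left(data, 3))   # -> 1, first index of the run of 3s
print(bisect.bisect_right(data, 3))  # -> 4, one past the last 3
```

A hand-rolled binary search implicitly behaves like one or the other depending on whether its comparison is strict or not.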

Message Edited by Matt W on 05-02-2007 04:48 PM

Message 6 of 19
(7,342 Views)


@matt W wrote:
...Threshold 1D array for some reason uses a linear search instead of a binary search (I'm not sure why either, but I tested it and it scales linearly, on 32e6 integers it take 84 ms, and on 16e6 integers 42 ms, a binary search takes less than 1 ms). ...

In real life, you cannot assume that the array is sorted, so if you have multiple crossings, a binary search would blow up.

We also have an input for the start index, so you can search for multiple crossings in a loop. I typically have data with multiple zero crossings, but I start searching from the array-min index to get the position I really want.

Message 7 of 19
(7,335 Views)
The description says it works on a 2D non-descending graph, so by that description (unless I'm misinterpreting it, of which there's a good chance) there's either one crossing point or a continuous group of points equal to that crossing point (and in that case I'd find one and do a backwards search for the first). Also, it's in the array functions, which implies to me that it's not for measured data (which was a bad assumption on my part). I guess the start index should have given away that it's linear, though.

You can assume the array is ordered if you created it (for a data structure, for instance), just ran "sort array" on it, or if it's a digital readout from a clock. (Note: most of my time is spent writing UIs to set up and run decently complicated test + post-processing jobs, so most of the data I work with isn't noisy.) I guess I'm just surprised that there's no binary search built into LabVIEW. I tend to treat LabVIEW like a traditional programming language instead of a data collection and analysis language, so I've been surprised quite a few times.
Message 8 of 19
(7,326 Views)

Lynn,

Can you please repost the VI saved for LV 7.1?

- Partha ( CLD until Oct 2027 🙂 )
Message 9 of 19
(7,291 Views)
I prototyped a "Threshold..." function based on a binary search.  It has worked for a variety of inputs, but I haven't kicked the tires enough to consider it robust yet.  Still, it's quite a bit faster than the built-in function on my big datasets.  Plus it automatically handles either ascending or descending data.  So, thanks again to altenbach for the suggestion!   Other comments:

Jarrod & Lynn (johnsold): I've tagged the thread so I can try those alternatives out for situations with smaller or less well-ordered datasets.  For my special circumstances, I just can't beat the ~1 msec binary search on 10 Meg arrays (my home machine couldn't handle the 30 Meg array without disk thrashing).

Matt:  interesting benchmarking you did to see that the built-in "Threshold..." execution time scales linearly with array size.

Jarrod (and perhaps also Matt): I typically avoid using the waveform datatype even for measured data.  This is partly based on a little factoid I've been carrying around in my head for years that may not even be true.  I've figured that the waveform bundle is basically a special-purpose cluster, and it's been my impression / belief that clusters containing arrays often cannot be processed as efficiently as naked arrays.  I kinda figure that the more times I make LabVIEW guess about whether or not to copy a buffer, the more chance there is for it to err on the side of caution rather than efficiency.   I had to ingrain several efficiency habits to deal with RT a few years back and have tried to carry many of them over into standard Windows programming.
    Another issue I have with waveforms is that the t0 and dt fields are often not helpful to me, and can be misleading.  (I understand that my case is not typical -- the waveform is definitely handy for many users).  t0 is just a low-precision system timestamp,  taken at task start, whether or not sampling actually begins then.  I often generate my own sampling clock with a counter to give me more control over stopping and starting it, even changing the rate on the fly.  I may start an AI task several seconds before I start its sample clock.  So the t0 and dt fields of an externally-clocked AI waveform are going to be suspect at best.  Thus, why bother?
   Anyway, my question mainly concerns efficiency.  From a standpoint of memory usage, "in-placeness", etc., am I right or wrong to suspect that waveforms are less efficient than arrays?  Consider an array processing example.  I'm concerned that having to unbundle Y from a waveform, pass Y into some nested sub-vis, then bundle the output back into a waveform is more prone to making buffer copies.  I tried looking at a couple examples with "Show Buffer Allocations" on just now, but even for sub-vi's that *definitely* change the array size internally, no buffer allocations were shown on the top-level diagram.  So I'm not sure I trust "Show Buffer Allocations."

Thanks again for the response.  I hope to be able to post the binary search version once I validate it more thoroughly.
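In the meantime, the direction-handling idea can be sketched like so (Python standing in for the G code, and an assumption about the approach rather than the actual VI): compare the endpoints to detect the sort order, then flip the comparison inside the binary search accordingly.

```python
def threshold_any_direction(data, threshold):
    """Binary search for the threshold crossing in a monotonic array,
    ascending or descending; returns the first index at/past the crossing."""
    ascending = data[0] <= data[-1]        # endpoint test picks the direction
    lo, hi = 0, len(data)
    while lo < hi:
        mid = (lo + hi) // 2
        crossed = data[mid] >= threshold if ascending else data[mid] < threshold
        if crossed:
            hi = mid
        else:
            lo = mid + 1
    return lo
```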

-Kevin P.

P.S.  As I'm beginning to depend on TDMS files, I've recently learned of a potential big advantage to waveform storage.  Waveforms can contain sets of attributes (probably much like variant attributes?) that are internally part of the waveform datatype when written / read back.  This looks very handy and may get me to (finally) use waveforms more often.

Message Edited by Kevin Price on 05-03-2007 07:48 AM

Message 10 of 19
(7,292 Views)