Removing a list of indices from an array

Wow,  thanks for the detailed reply. I will definitely update my code.

I only have LabVIEW 7 though, so I can't open the VI. And I don't recognize some of those functions... can you save it in LabVIEW 7, please?

Thanks again.

Aleks.

0 Kudos
Message 11 of 40
(3,955 Views)

I no longer have LabVIEW 7.0, and you're right, the look of many icons has probably changed. I have labeled most functions in the following image, so you should be able to reconstruct it. (In the other case, the two wires are simply wired straight across.)

On the right, after the loop, you can use either "Reshape Array" or "Array Subset"; they're about the same.

As I said, the index comparison could be vastly improved, especially if you need to remove many elements. Another idea would be to move entire contiguous subarrays that don't contain any elements to be removed (this of course requires memory allocations for the subarrays, which could also be expensive).
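For readers following along without LabVIEW, here is a rough Python sketch of that contiguous-block idea. It is only an illustration of the approach described above, not the posted VI, and it assumes the removal indices are already sorted, in range, and duplicate-free.

```python
def remove_by_blocks(data, sorted_indices):
    """Copy each untouched run between removal indices in a single slice.

    Illustrative sketch only; assumes sorted_indices is sorted, in range,
    and contains no duplicates.
    """
    out = []
    start = 0
    for idx in sorted_indices:
        out.extend(data[start:idx])  # copy the contiguous block before idx
        start = idx + 1              # skip the removed element
    out.extend(data[start:])         # copy the tail after the last removal
    return out
```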

You should also get the OpenG version (unfortunately, I don't have it at the moment) and look at the code. Some of those implementations are very efficient.

Message Edited by altenbach on 08-29-2007 04:16 PM

Message 12 of 40
(3,950 Views)


@altenbach wrote:
You should also get the OpenG version (unfortunately, I don't have it at the moment) and look at the code. Some of those implementations are very efficient.

OK, I looked at the OpenG version, and it looks slow. It simply calls "Delete From Array" in a loop using a shift register. This scales much worse than my suggestion (roughly with N*N instead of N): for example, if you delete N elements, the last element in the array gets moved N times! Maybe somebody wants to run some benchmarks.
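To make that cost concrete, here is a Python analogue of the delete-in-a-loop pattern (not the actual OpenG code): every deletion shifts the whole tail of the array, so removing M indices from an N-element array costs on the order of N*M element moves.

```python
def remove_naive(data, indices):
    """Analogue of calling "Delete From Array" once per index.

    Each 'del' shifts every element behind it, so the total work grows
    roughly with N*M (quadratic when a large fraction is removed).
    """
    data = list(data)
    for i in sorted(indices, reverse=True):  # back to front keeps indices valid
        del data[i]                          # O(N) shift on every call
    return data
```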

Message 13 of 40
(3,944 Views)
Benchmarks!
 
(As I said, you only need to compare against the currently relevant index; comparing against the entire index array at each loop iteration would be far too much work.)
 
I have implemented a draft version that seems to give the same result as the OpenG version for non-pathological inputs. I even added an output with an array of the deleted elements. My version assumes that the index array is sorted, contains only non-negative values, and has no duplicate elements. (The OpenG version also fails with duplicate or negative index elements!)
 
 
Each result has 3 numbers (remove 99% | remove 50% | remove 1% of all elements)
 
ArraySize        OpenG Version             My Version
============================================================
   5000        8ms|    6ms|   1ms        0ms|  0ms|  0ms
  50000      631ms|  540ms|  16ms        2ms|  1ms|  1ms
 500000    59756ms|53073ms|1514ms       12ms| 15ms| 12ms
5000000         (not measured)          130ms|161ms|120ms
 
As you can see, the differences are dramatic.
  1. For example, to remove 50% of all elements from a 0.5M array, my version is well over 3000x faster. 🙂
  2. My elapsed time scales with N (linear), while the OpenG time scales roughly with N*N, which really hurts for large arrays.
  3. A 10x larger array is 10x slower for my version, but 100x slower for the OpenG version (see point 2 above).
  4. My version is rather insensitive to the number of elements to be removed, while the OpenG version works best if only very few elements are removed.
  5. The OpenG version is only useful if you need to remove very few elements.

So, if you can guarantee that the array of indices is as required (sorted, non-negative, no duplicates), you should be able to handle your 500,000-element array in under 20 ms using something along the lines of my code. If you don't need the array of deleted elements, you can delete that part (it won't really make things faster, because that part is peanuts ;)).
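As a rough illustration of that approach (a Python sketch, not the posted VI): walk the data once, compare each position only against the single currently relevant index, and build the kept and deleted outputs without resizing inside the loop. In the LabVIEW diagram the output array is preallocated and trimmed afterwards with "Array Subset"/"Reshape Array"; a plain list append is used here only to keep the sketch short.

```python
def remove_indices(data, sorted_indices):
    """Single pass; assumes sorted_indices is sorted, non-negative,
    and free of duplicates (the same preconditions stated above)."""
    kept, deleted = [], []
    j = 0                                   # next relevant removal index
    for i, value in enumerate(data):
        if j < len(sorted_indices) and i == sorted_indices[j]:
            deleted.append(value)           # element is on the removal list
            j += 1
        else:
            kept.append(value)              # element survives
    return kept, deleted
```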

Here is a code picture, since you don't have 8.2. The "other" case is shown in the insert.

I am pretty sure things can be sped up a bit more with a few tweaks.

Message Edited by altenbach on 08-29-2007 06:54 PM

Message 14 of 40
(3,936 Views)
Hi Christian,

Would you like to help us optimize the OpenG version?  Your improvements and benchmarks look impressive.

Thanks,

-Jim
Message 15 of 40
(3,922 Views)

The Grand Wizard strikes again!

Christian,

Could you try to explain to us the thought process you go through when you develop these elegant variations?

I'd love to be able to "think as fast as you".

Ben

Retired Senior Automation Systems Architect with Data Science Automation | LabVIEW Champion | Knight of NI and Prepper
Message 16 of 40
(3,895 Views)
I second what Ben said. I don't think I would ever have approached this problem the way you did; my first thought was that going through each element and moving it could never be that efficient. I think you should take Jim up on the offer to put this code into OpenG. Maybe you should take a look at some of the other functions and work some of this magic on them as well.
0 Kudos
Message 17 of 40
(3,881 Views)


@marc A wrote:
I think you should take Jim up on the offer to put this code into OpenG. Maybe you should take a look at some of the other functions and work some of this magic on them as well.


Yeah, we could certainly benefit from a performance audit and some benchmarking.

Thanks,

-Jim
0 Kudos
Message 18 of 40
(3,871 Views)


@Ben wrote:
Could you try to explain to us the thought process you go through when you develop these elegant variations?

I am pretty sure the code could be further improved. This was just a rough draft.

The secret words are "in place". The only really expensive operations are array resizes, so my religion forbids me from using them inside a loop. That makes "Delete From Array" and "Insert Into Array" off-limits in the inner loops of any serious code. After that, the rest just falls into place. 🙂

The basic algorithm, looping through the array while skipping some elements, with a read index taken from [i] and a write index carried in a shift register, has been around for years in many forms (e.g. remove elements that are zero, NaN, exceed a certain limit, are duplicates, etc.). It's just a matter of reusing that same old template with a few tweaks.
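Since the diagram itself can't be shown inline, here is a minimal Python rendering of that read-index/write-index template, with the NaN-removal variant as the usage example. The predicate-based form is an illustrative generalization, not a specific VI.

```python
import math

def compact_in_place(data, keep):
    """Skip-some-elements template: read index i, write index w.

    The array is never resized inside the loop; it is trimmed exactly once
    at the end (the equivalent of "Array Subset" after the FOR loop)."""
    w = 0
    for i in range(len(data)):
        if keep(data[i]):
            data[w] = data[i]   # copy the surviving element forward, in place
            w += 1
    del data[w:]                # single resize after the loop
    return data

# One of the variants mentioned above: remove NaN elements.
values = [1.0, math.nan, 2.5, math.nan, 3.0]
compact_in_place(values, lambda x: not math.isnan(x))   # -> [1.0, 2.5, 3.0]
```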

To be clear about the posted code: it is not a finished product, and an OpenG implementation would need to be made bulletproof in order to handle malformed input. In particular, the index input needs to be cleaned up, and that could be expensive. We would need to sort the index array (O(N log N)), remove all duplicates, and keep only the non-negative, in-range indices before feeding it to my code above. This won't be fast if we have huge arrays and need to remove a large number of elements. Alternatively, we could just check whether the input is OK (sorted, non-negative, no duplicates) and return an error otherwise (even that would probably slow things down by at least a factor of two).
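That cleanup step is conceptually simple; a Python sketch of it (the function name is made up for illustration) is just sort, deduplicate, and discard out-of-range values:

```python
def sanitize_indices(indices, array_size):
    """Sort (O(N log N)), drop duplicates, and keep only in-range indices
    so the fast single-pass code above can rely on its preconditions."""
    return sorted({i for i in indices if 0 <= i < array_size})
```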

The ideal subVI would need to implement several algorithms and intelligently select the right one depending on the inputs. In many cases only very few elements (1-2) need to be removed from a huge array, and in that case the OpenG version is probably best.

Here's maybe a better idea:
Instead of having all these specialized tools (e.g. the huge number of polymorphic instances of the OpenG version), I usually prefer a more low-level, multipurpose tool. In this case we could probably get away with a single subVI that takes an I32 array size and an I32 array of indices to be removed, and returns a sorted I32 array of all remaining indices. Then we can use a tiny FOR loop to generate the reduced array, similar to the way I generate the array of removed elements above. Now it does not matter whether we have an array of booleans, of CDBs, or of a complicated custom cluster (which our polymorphic VI cannot handle anyway!); we can treat them all the same way. A sketch of this idea follows below.

It's a bit more code than an atomic VI, but the palettes are already overloaded with way too many tools that nobody ever uses. Properly designed, such a subVI could handle malformed input much more gracefully and with less code, so maybe that would be the way to go for a universally useful tool. Again, in the limit of removing only one or two elements, it would still do too much work. ;)
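Here is a Python sketch of that multipurpose idea (hypothetical names, not an existing OpenG VI): one routine produces the surviving indices from the array size and the removal list, and a trivial loop then rebuilds an array of any element type.

```python
def remaining_indices(array_size, indices_to_remove):
    """Return the sorted indices that survive removal; out-of-range or
    duplicate removal indices are simply ignored."""
    keep = [True] * array_size
    for i in indices_to_remove:
        if 0 <= i < array_size:
            keep[i] = False
    return [i for i, k in enumerate(keep) if k]

# The element type no longer matters: the same helper works for booleans,
# doubles, or cluster-like records.
data = ["a", "b", "c", "d", "e"]
reduced = [data[i] for i in remaining_indices(len(data), [1, 3, 3, -7])]
# reduced == ["a", "c", "e"]
```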

I haven't written such a VI yet, so it would need to be written, and some careful benchmarking would need to be done to verify my gut feeling. If it is competitive, it could even be made into an OpenG tool. 🙂

Message 19 of 40
(3,859 Views)
OK, the better idea actually works!
 
It is just a little bit slower, but it handles all bad input automatically (unsorted, duplicates, negative indices, etc.). This is very important, and adding it to the old version would have been much more expensive!
 
Here are the results as above with 3 numbers for each array size input (remove 99% | remove 50% | remove 1% of all elements)
 
ArraySize        Turbo Version
=================================
   5000         0ms|  0ms|  0ms
  50000         1ms|  2ms|  1ms
 500000        14ms| 18ms| 17ms
5000000       169ms|215ms|191ms
 
As you can see, it is still dramatically faster than the OpenG version. Here's a picture. Again, this is just a rough draft that needs to be cleaned up.
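For anyone who wants to repeat the comparison, a bare-bones timing harness along these lines would do (a hypothetical Python stand-in, not the original benchmark VI, so the absolute numbers will not match the 2007 tables):

```python
import random
import time

def time_removal_ms(remove_fn, n, fraction_removed):
    """Time one removal call on an n-element array with a random set of
    removal indices covering the given fraction of the array."""
    data = list(range(n))
    idx = random.sample(range(n), int(n * fraction_removed))
    t0 = time.perf_counter()
    remove_fn(data, idx)
    return (time.perf_counter() - t0) * 1000.0

# e.g. compare a naive delete-in-a-loop routine against an index-based one
# at the same sizes used above: 5_000, 50_000, 500_000 elements.
```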

Message Edited by altenbach on 08-30-2007 09:25 AM

Message 20 of 40
(3,854 Views)