07-17-2012 08:04 PM
@Darin.K wrote:
@crossrulz wrote:
At the risk of sounding stupid, does anybody know the reason inserting into an array is faster than the build array? It seems like they should be performing the exact same function. I remember a discussion about it being faster, but I don't remember why.
.... The blue curve looks noisy, but the spikes are quite regular and a function of the row size, so they indicate calls to the memory manager for a new buffer. As you can see, it is quite greedy in requesting these. Insert into Array still does some operation proportional to the size of the array, but you can see the calls to the memory manager request a doubling each time,...
We noticed the same thing when examining it way back in LV5.
The good news is that the bad thing (the call to the memory manager) happens less and less frequently as the array grows.
Over-allocate an array in a buffer (shift register), use Replace Array Subset to fill it, and Reshape Array when done.
Ben
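For readers more comfortable in text languages, here is a rough Python analogy of that over-allocate / replace / trim pattern (LabVIEW is graphical, so this is only a sketch, and the function and parameter names are made up):

    def collect_rows(produce_row, n_rows, row_len):
        # One big allocation up front: the "array held in a shift register".
        buffer = [[0.0] * row_len for _ in range(n_rows)]
        count = 0
        for i in range(n_rows):
            # "Replace Array Subset": overwrite in place, never grow the buffer.
            buffer[count] = produce_row(i)
            count += 1
        # "Reshape Array when done": trim to the rows actually written.
        return buffer[:count]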
07-17-2012 11:09 PM
07-18-2012 03:14 AM
@Darin.K wrote:
[..]Coincidence? Or perhaps Build Array is effectively performing a copy each step of the way. yuk.[..]
This does not really surprise me. To be honest, I was always under the impression that Build Array creates a new copy each time instead of resizing the array.
I think this is the case because that "is the easy approach", at least it was a couple of years ago... OK, 10 or more years ago 😉
One thing I want to add here is that Build Array is therefore worse regarding memory fragmentation than other array-handling operations, so it is the most prominent example of wasting memory chunks and running out of memory because of it.
Norbert
PS: I attached my version of the VIs for the benchmark Darin is referring to. Please note that the option "copy" introduces jitter and offset to the benchmark for Insert Into Array. It seems that if "DON'T copy" is selected, this overhead (the case structure) is negligible, but when "copy" is selected it surpasses Build Array in the benchmark, at least on my machine. This is IMHO due to different memory management for copying versus in-placing the shift register, since Build Array does the copy and the append in a single step. And again, the case structure adds a little.
Note that the benchmark repeats each algorithm twice to get better results. Increasing this (the constant "#Benchmarks to Average") will improve the values further.
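To make the difference concrete in a text language (a sketch only, not a claim about LabVIEW's actual internals): copying the whole accumulated array on every append costs O(n^2) in total, while growing the capacity geometrically and writing in place is amortized O(1) per element.

    def append_by_full_copy(chunks):
        out = []
        for chunk in chunks:
            out = out + chunk              # new buffer every time; copies everything so far
        return out

    def append_with_doubling(chunks):
        buf, capacity, length = [], 0, 0
        for chunk in chunks:
            while length + len(chunk) > capacity:
                capacity = max(1, capacity * 2)          # few, geometrically spaced reallocations
                buf = buf + [None] * (capacity - len(buf))
            buf[length:length + len(chunk)] = chunk      # write in place
            length += len(chunk)
        return buf[:length]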
07-18-2012 06:20 AM
Is there a reason that Build Array would be a better choice than Insert? Otherwise I don't understand why Build Array doesn't just invoke Insert under the hood.
07-18-2012 07:56 AM
@Darin:
Interesting results!
@all the preallocate people:
Sorry, don't buy it. I'm sure there will be times when I will want to optimize my code to the point where I am willing to preallocate an array, write more code to check the array each loop iteration and enlarge it if necessary (especially in my while loops), and then more code to resize it at the end. Until then, the good folks at NI have provided several one-click methods of doing that automatically. The point is, one of the three (build vs. insert vs. autoindexing) is much less efficient. So for now Build should be avoided (or preferably improved or deprecated).
This may not have been as much of an issue in the past, but in this day and age, with larger and larger data sets, it will start to become an issue for a lot of people. For 15 years it was not an issue for me, but I am currently working on a project where the data processing, and thus the GUI lag, dropped from a 30-second to a 2-second delay by replacing Builds with Inserts. The former simply would not have been acceptable to my end users. A cautionary tale to say the least.
07-18-2012 10:37 AM - edited 07-18-2012 10:38 AM
@drjdpowell wrote:
Is there a reason that Build Array would be a better choice than Insert? Otherwise I don't understand why Build Array doesn't just invoke Insert under the hood.
Well, Build Array will pad a 2D array, so if that is desirable then you should choose it. It is also more forgiving in terms of mismatched reference types and such.
And before you start running out and replacing your Build Array functions, remember that this is for appending rows to a 2D array only! For 1D appending, the opposite holds.
Here is what it looks like when LV gets it right (and wrong). This is a 1D benchmark with concatenating inputs, appending a fixed-length array. The only slowdowns for BA are the buffer allocations; the overall cost is on the order of the fixed size of the appended array, with no dependence on the length of the accumulated array anymore. Insert is still O(n^2).
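As a back-of-the-envelope check (my arithmetic, not Darin's numbers): appending a fixed chunk of length m, N times, with reallocation only on capacity doubling costs on the order of N*m element copies in total, i.e. amortized O(m) per append. Recopying the whole accumulated array on every call costs m + 2m + ... + Nm = m*N(N+1)/2, i.e. O(N^2), which matches the curves being described.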
07-18-2012 10:58 AM
So what you are finding is that Build Array is better for 1D arrays and Insert Into Array is better for 2D arrays? Yeah, NI has something screwed up. But this is still good data to have.
07-18-2012 11:44 AM
@crossrulz wrote:
So what you are finding is that Build Array is better for 1D arrays and Insert Into Array is better for 2D arrays? Yeah, NI has something screwed up. But this is still good data to have.
My current understanding is that Insert Into Array is consistently bad, while Build Array goes from good (1D append) to even worse (2D append). You hear anecdotes about the compiler team adding special optimizations for certain situations, and it sounds like that has happened to Build Array: in the general case it copies everything at the input into a new output buffer, but in the special case it recognizes that it can leave the array alone while appending. Insert Into Array seems to always copy everything as well, only considerably faster.
In the 1D case BA is very fast for appending and slow for prepending (makes sense). Insert Into Array takes the same time regardless of insert position, and this time is close to the BA time for prepending. Every case is the worst case for Insert; I guess there is something to be said for consistency.
The solution IMO is for Insert Into Array to recognize the special case of appending and optimize accordingly. It already recognizes this case since this is what it does when the indices are unwired.
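In Python-ish pseudocode, the special-casing being suggested would amount to something like this (purely illustrative, not NI's implementation; the function name is made up):

    def insert_into_array(arr, element, index=None):
        if index is None or index == len(arr):
            arr.append(element)        # append special case: amortized growth, no full copy
            return arr
        # General insert: everything after `index` shifts, so a full O(n) copy is unavoidable.
        return arr[:index] + [element] + arr[index:]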
07-18-2012 11:49 AM
@Darin.K wrote:
@crossrulz wrote:
So what you are finding is that Build Array is better for 1D arrays and Insert Into Array is better for 2D arrays? Yeah, NI has something screwed up. But this is still good data to have.
My current understanding is that Insert Into Array is consistently bad, while Build Array goes from good (1D append) to even worse (2D append). You hear anecdotes about the compiler team adding special optimizations for certain situations, and it sounds like that has happened to Build Array: in the general case it copies everything at the input into a new output buffer, but in the special case it recognizes that it can leave the array alone while appending. Insert Into Array seems to always copy everything as well, only considerably faster.
In the 1D case BA is very fast for appending and slow for prepending (makes sense). Insert Into Array takes the same time regardless of insert position, and this time is close to the BA time for prepending. Every case is the worst case for Insert; I guess there is something to be said for consistency.
The solution IMO is for Insert Into Array to recognize the special case of appending and optimize accordingly. It already recognizes this case since this is what it does when the indices are unwired.
Re: 1d BA
I think LV 6i was the version where NI optimized the BA, with the noteworthy enhancement being that the output array is allocated based on the number of iterations of the For loop.
Re: insert 2D
The worst-case situation would be to insert at 0,0, where everything has to be moved around since the stride changes.
Ben
07-18-2012 12:02 PM
@Ben wrote:
Re: 1d BA
I think LV 6i was the version where NI optimized the BA, with the noteworthy enhancement being that the output array is allocated based on the number of iterations of the For loop.
Re: insert 2D
The worst-case situation would be to insert at 0,0, where everything has to be moved around since the stride changes.
Ben
Nice catch on the output array being allocated based on the number of iterations; I think I can make sense of the picket fences now. Doubling is clearly a much better strategy, but this works.
Prepending the first column in Insert 2D requires essentially the same time as appending a row, so, like the 1D case, every case is treated as the worst case.
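One last text-language sketch of Ben's stride point, modelling the 2D array as a flat row-major buffer (an analogy only; the names are made up): appending a row just extends the buffer, while inserting a column changes the row stride, so every existing element lands at a new offset and has to be moved.

    def append_row(flat, n_cols, new_row):
        # Existing elements stay where they are; only new data is written.
        return flat + list(new_row), n_cols

    def prepend_column(flat, n_cols, new_col):
        # Row stride changes from n_cols to n_cols + 1, so every element moves.
        n_rows = len(flat) // n_cols
        out = []
        for r in range(n_rows):
            out.append(new_col[r])
            out.extend(flat[r * n_cols:(r + 1) * n_cols])
        return out, n_cols + 1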