06-21-2012 04:38 PM - edited 06-21-2012 04:40 PM
Deleted: ya'll are fast!
06-22-2012 02:37 PM
I don't have LV11 and don't have time to rewrite, so how about some random unchecked hypotheses here:
(1) Based on typical performance on my quad core (hyperthreaded to 8 logical CPUs), I would expect Taki1999's version to become comparable to the others, if not the fastest, with parallelism enabled on the For Loops. No shift registers, so it is easily parallelizable.
(2) I doubt the IPES gains anything over the simple Replace Element primitive, but I seem to recall that IR&C is happier inside the For Loop so you are not allocating the full Boolean Array. Could shave a msec or three in this case, probably more so for larger arrays.
(3) Ben64's solution is tidy, but it does have a few extra buffer allocations, so there is not much to speed up there.
My guess:
Parallelism will significantly close the gap between Taki1999's version and Darren's. With 8 cores it may even be slightly faster, maybe not. My guess for the real winner is Darren's version with IR&C located inside the For Loop. Ben64 wins the tidiness award, but probably lags the other versions. All guesses.
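
For anyone wondering why "no shift registers" matters in hypothesis (1), here is a rough text-language sketch of the idea. It is in Python because LabVIEW diagrams can't be pasted as text, and it is not any of the VIs from this thread: the function names and the squaring workload are made up for illustration. It only shows the structure (independent iterations versus iterations that carry state forward) and makes no timing claims; in standard CPython, threads generally won't speed up CPU-bound work like this anyway.

```python
from concurrent.futures import ThreadPoolExecutor


def independent_iteration(i):
    # Hypothetical loop body: depends only on the loop index,
    # like a For Loop with no shift register.
    return i * i


def loop_sequential(n):
    # Plain loop: iterations run one after another.
    return [independent_iteration(i) for i in range(n)]


def loop_parallel(n, workers=8):
    # Iterations are independent, so they can be handed to separate workers.
    # Executor.map returns results in input order, matching the plain loop.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(independent_iteration, range(n)))


def loop_with_carried_state(n):
    # Analogue of a shift register: each iteration needs the value produced
    # by the previous one, so the iterations cannot simply be split up.
    total = 0
    results = []
    for i in range(n):
        total += i * i
        results.append(total)
    return results
```

The first two functions give the same results whether or not the work is split across workers, which is what lets the loop be parallelized. The third needs each iteration's result before the next can start, so it would have to be restructured first.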