04-09-2015 01:58 PM - edited 04-09-2015 01:59 PM
That's unintuitive. Why would riffling dbls with the for loop be faster than riffling in the DLL? The only difference is the number of samples (30 instead of 40).
04-09-2015 04:00 PM
@Oligarlicky wrote:
That's unintuitive. Why would riffling dbls with the for loop be faster than riffling in the DLL? The only difference is the number of samples (30 instead of 40).
Sorry, I don't understand what you are trying to say. A DLL is always used.
You always need to riffle 40 samples. In the I32 version you are processing 160 bytes, while in the DBL version you are processing 320 bytes for the riffle operation. If the loop is set for 30 iterations, the output is trimmed automatically.
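To make the byte counts concrete, here is a rough sketch in Python (not LabVIEW, and not the actual DLL code). It only reproduces the arithmetic above: 40 samples at 4 bytes each for I32 and 8 bytes each for DBL, then a 30-iteration loop keeping the first 30 results. The names `buffer_bytes` and `trim_to_iterations` are made up for illustration, and the list slice is just a stand-in for the loop output.

```python
import struct

SAMPLES = 40      # the riffle always works on all 40 samples
ITERATIONS = 30   # loop count from the example; only affects the output length

# Element sizes with standard (fixed) sizing: I32 = 4 bytes, DBL = 8 bytes
I32_SIZE = struct.calcsize("<i")
DBL_SIZE = struct.calcsize("<d")


def buffer_bytes(samples, element_size):
    """Number of bytes touched when processing `samples` elements (hypothetical helper)."""
    return samples * element_size


def trim_to_iterations(processed, iterations):
    """Stand-in for the loop output: keep only the first `iterations` elements."""
    return processed[:iterations]


i32_bytes = buffer_bytes(SAMPLES, I32_SIZE)
dbl_bytes = buffer_bytes(SAMPLES, DBL_SIZE)
trimmed = trim_to_iterations(list(range(SAMPLES)), ITERATIONS)

print(i32_bytes, dbl_bytes, len(trimmed))
```

So the amount of data being riffled depends on the element type, not on the loop count. The loop count only decides how many of the 40 processed samples end up in the output.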