Please provide us with some sample data that shows the "slowness". "Slow" is always relative and means different things to different people.
You do quite a few things that could be optimized.
Let's look at the four Subarray threads (mentally numbered 1-4, top to bottom).
Thread 1: Seems OK.
Thread 2: You slice out a 2D section, transpose it, then retain only one line. This is inefficient and can be done in one step. The result is just a 1xn subset of the original 2D array, right? Wire all inputs to the "Array Subset" node to get your final slice directly.
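Since a LabVIEW diagram cannot be pasted as text, here is a rough Python analogue of the idea, only as a sketch. The function names, the sample data, and the index values are all made up for illustration; your actual offsets and lengths will differ. It compares the three-step route (cut a 2D block, transpose, keep one line) with reading the same elements directly from the original array.

```python
def slice_transpose_pick(data, row0, col0, n_rows, n_cols, keep):
    """Three-step route: cut a 2D block, transpose it, keep one line."""
    # Step 1: cut out the 2D block (copies n_rows * n_cols elements).
    block = [row[col0:col0 + n_cols] for row in data[row0:row0 + n_rows]]
    # Step 2: transpose the block (another full copy).
    transposed = [list(col) for col in zip(*block)]
    # Step 3: throw away everything except one line.
    return transposed[keep]


def direct_subset(data, row0, col, n_rows):
    """One-step route: read only the elements that are actually needed."""
    return [data[r][col] for r in range(row0, row0 + n_rows)]


# Hypothetical sample data, not taken from the original VI.
data = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

via_block = slice_transpose_pick(data, row0=0, col0=1, n_rows=3, n_cols=2, keep=1)
direct = direct_subset(data, row0=0, col=2, n_rows=3)
print(via_block, direct)  # both print [3, 7, 11]
```

The one-step version touches only the elements that end up in the result, which is what wiring index and length for both dimensions into a single "Array Subset" gives you.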
Thread 3: Seems OK.
Thread 4: Same problem as in "2" above. In addition, not only can you get rid of the "Select" in the FOR loop, you can get rid of the FOR loop entirely (see image). Also watch for coercion dots: you are comparing your DBL array with a SGL "5", which therefore needs to be coerced to DBL. Right-click on the diagram constant and select "Representation > DBL".
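For readers without the image, the simplification can be sketched in Python. This is only an analogy with assumed details: the "greater than" operator, the function names, and the sample values are guesses, since the original diagram is not reproduced here. Python also has no SGL/DBL distinction, so the coercion point is not shown.

```python
def flags_loop_with_select(values, threshold):
    """Per-element loop with a redundant 'select' on the comparison result."""
    flags = []
    for v in values:
        # The comparison already yields a boolean, so selecting
        # True or False from it adds nothing.
        flags.append(True if v > threshold else False)
    return flags


def flags_direct(values, threshold):
    """Use the comparison result directly, one expression for the whole array."""
    return [v > threshold for v in values]


# Hypothetical sample values; 5.0 stands in for the constant "5".
samples = [1.0, 5.0, 7.5, 4.9, 12.0]
print(flags_loop_with_select(samples, 5.0))
print(flags_direct(samples, 5.0))
```

In LabVIEW the comparison functions accept an array and a scalar directly and return a boolean array, so the second form needs no loop structure on the diagram at all.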
Tell us a little more about your application. It seems the two 2D arrays in your cluster possibly grow without bounds if "new data" is constantly appended to the current and position data. This will ultimately cause allocation issues. Does your application only get "slow" over time?
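If unbounded growth does turn out to be the cause, one common remedy is to keep only a fixed-length history. The sketch below shows that idea in Python under several assumptions: the class name, the 1000-sample limit, and the shape of each row are invented for illustration, and your application may need a different retention policy.

```python
from collections import deque


class BoundedHistory:
    """Keeps at most max_samples rows of current and position data."""

    def __init__(self, max_samples):
        # A deque with maxlen silently drops the oldest row once it is full,
        # so memory use stays constant no matter how long the program runs.
        self.current = deque(maxlen=max_samples)
        self.position = deque(maxlen=max_samples)

    def append(self, current_row, position_row):
        self.current.append(list(current_row))
        self.position.append(list(position_row))


history = BoundedHistory(max_samples=1000)  # hypothetical limit
for i in range(5000):
    history.append([float(i), 2.0 * i], [float(i)])

print(len(history.current))   # 1000
print(history.current[0])     # [4000.0, 8000.0], the oldest row still kept
```

The LabVIEW counterpart would be a preallocated array of fixed size that is updated in place (for example with "Replace Array Subset") instead of being appended to on every iteration.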