I don't quite understand why you need to get one point at a time and use software timing. Can't you grab an array of successive data from the device with each read operation?
OK, your VIs are full of problems that cause performance issues.
Timing.vi
You are autoindexing at three output tunnels, creating arrays of unbounded size that hold mostly useless information. LabVIEW does NOT know the final size, so several expensive memory reallocations will occur. Then you only look at the first and last point, respectively! Turn off autoindexing at the "time" tunnels and save the first time in a shift register (or better, grab it right before the loop starts).
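Since a block diagram can't be pasted as text, here is a rough text-language analogy in Python of the "keep only the first and last time" idea. It is only a sketch: the function name timed_loop and the work callback are made up for illustration and do not correspond to anything in your VIs.

```python
import time

def timed_loop(iterations, work=lambda: None):
    """Time a loop while storing only two timestamps, never an array of them."""
    start = time.perf_counter()      # grabbed once, right before the loop starts
    last = start
    for _ in range(iterations):
        work()                       # hypothetical placeholder for the loop body
        last = time.perf_counter()   # overwritten each pass (non-indexing tunnel)
    elapsed = last - start
    mean_period = elapsed / iterations if iterations else 0.0
    return elapsed, mean_period
```

Memory use stays constant no matter how long the loop runs, which is the point of turning off autoindexing at the "time" tunnels.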
You run two loops (regular and timed) in parallel. I would only test one at a time.
TestingForum.vi
You don't initialize the shift registers on the very left, thus the arrays will grow for each run of the subVI. Simply delete the shift registers and use an autoindexing output tunnel.
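To illustrate why the arrays keep growing, here is a loose Python analogy (not LabVIEW code; all names are invented for the example). A module-level list stands in for the uninitialized shift register, which keeps its contents between calls, and a list built fresh on each call stands in for the autoindexing output tunnel.

```python
_history = []  # survives between calls, like an uninitialized shift register

def acquire_leaky(samples):
    """Appends to state left over from earlier calls, so the result keeps growing."""
    for s in samples:
        _history.append(s)
    return list(_history)

def acquire_clean(samples):
    """Builds a fresh list on every call, like an autoindexing output tunnel."""
    return [s for s in samples]
```

Calling acquire_leaky twice with two samples returns four elements the second time, while acquire_clean always returns two.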
One of the slowest ways to read/write values is via property nodes. Get rid of them all; they are not needed. Get rid of that useless stacked sequence (it just makes the code hard to debug) and line up your subVIs horizontally.
Don't update indicators in the loop unless you need to, and use the correct representation. Currently the "array 2" indicator is DBL while the data is SGL, causing coercion. Place a small wait inside the loop to make the rate predictable.
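As a rough text sketch of the loop structure I mean (a Python analogy, not LabVIEW; run_loop, read_sample, and on_display are hypothetical names), the loop body only reads and waits, and the "indicator" is written once after the loop finishes:

```python
import time

def run_loop(read_sample, iterations, wait_s=0.001, on_display=None):
    """Read in a paced loop; update the display once, after the loop."""
    latest = None
    for _ in range(iterations):
        latest = read_sample()   # hypothetical data source
        time.sleep(wait_s)       # small wait keeps the loop rate predictable
    if on_display is not None:
        on_display(latest)       # single indicator update outside the loop
    return latest
```

If you do need live feedback, updating the display every Nth iteration instead of every iteration is a reasonable compromise.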