Replace array subset in parallel for loop without increasing memory


@tshurtz wrote:


Sorry, your post did not show up until now. Do you have some typical inputs for testing?

Do you have a link describing the algorithm?

 

Easiest would be if you could call your VI with typical default data, stop the VI and make all control value default. Save under a new name and attach it here.

Message 21 of 53

@Altenbach

 

Not sure if this is valid, probably not.

 

On a single Xeon CPU with 4 cores, no hyperthreading available.

 

Your parallel code is twice as fast as AnotherTry, according to the DETT. However, look at all the stuff the code is doing: AnotherTry uses a small amount of computer resources, whereas the parallel code uses about 10x the resources, with more thread switches, memory allocations, etc. Here, more code means less CPU. Smiley LOL

 

Depending on the rig the program is running on, and what else is going on in that computer, it may be better to wait a bit longer and not max out the CPU.

 

 

DETT AnotherTry (screenshot)

 

DETT TestParallel (screenshot)

 

 

mcduff

 

Message 22 of 53

Yes, it is natural for a parallel for loop to use more memory, because it will create N parallel instances of the code. Have you looked at my same code without parallelism to see how things compare? Also note that the compiler might smartly reuse already allocated memory when resizing, so the penalty might be close to zero, no matter what the DETT is telling you.

 

(Sorry, posting by phone, cannot look at code or test.)

 

In any case, the OP has posted code that seems quite different, so we had better focus on that. From a first glance earlier, I saw a lot of array dicing and slicing. Definitely room for improvement.

 

 

Message 23 of 53

Sorry for the delay.

 

Here is the code with some data embedded.

Let me know what you think.

It takes 235 seconds to run on my machine.

Thanks again.

 

OK, it is having trouble with the large file, I think (42 MB).

How do I upload it somewhere?

Message 24 of 53

I can't seem to get it to upload my file. I have tried zipped and unzipped (both 42 MB).

Any suggestions?

Message 25 of 53

I am trying to figure out what your code is doing. Do you have a small dataset to try it with?

 

mcduff

Message 26 of 53

OK, I have modified the file to generate a comparable amount of random data, so the file is much smaller. Hopefully it will upload OK.

It is currently taking ~380 seconds to run the important part of the code. I am guessing it is taking longer because I am on battery on my laptop and it is throttling back the CPU, or maybe the random data is just harder to chunk through, but I can't see why. Also, I am not strongly tied to anything in this algorithm; I made it up and it is very simplistic. Ultimately I need a good solution that runs fast and doesn't use too much additional RAM. The main measures of how well it is running are the "elapsed time" and "reduction" indicators. Anything comparable to 0.033 or lower for reduction is good, and the time needs to come way down from the high 300s.

 

To clarify the code's objective: I have many point pairs that are inseparably connected. The order of those point pairs within a layer doesn't matter, but I want to greatly reduce the length between the second point of one pair and the first point of the next. Again, a good solution is needed, not necessarily the optimal one. My current algorithm takes the second point of the first point pair and finds the closest first point among all the remaining pairs in the layer; then I go to the second point of that new pair and repeat until I run out of point pairs. Also, the first/second order of each point pair must remain the same. Maybe I should be trying some sort of traveling salesman approach, but I don't think the added complexity would give enough improvement to warrant it. A rough sketch of the ordering step is below.
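
In rough text form, the per-layer ordering is something like this Python sketch (the ((x1, y1), (x2, y2)) pair layout and the function name are only illustrative, not taken from my VI):

```python
import math

def order_pairs_greedy(pairs):
    """Greedy ordering of (start, end) point pairs within one layer.

    Starts with the first pair, then repeatedly appends the remaining pair
    whose first point is closest to the second point of the chain so far.
    The start -> end orientation of each pair is never changed.
    """
    remaining = list(pairs)
    chain = [remaining.pop(0)]              # seed the chain with the first pair
    while remaining:
        cx, cy = chain[-1][1]               # second point of the last pair in the chain
        # index of the remaining pair whose first point is nearest
        best = min(range(len(remaining)),
                   key=lambda i: math.hypot(remaining[i][0][0] - cx,
                                            remaining[i][0][1] - cy))
        chain.append(remaining.pop(best))
    return chain

# tiny example: three pairs, each ((x1, y1), (x2, y2))
layer = [((0, 0), (1, 0)), ((5, 5), (6, 5)), ((1.2, 0.1), (2, 1))]
print(order_pairs_greedy(layer))
```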

 

Thanks again for your help

Message 27 of 53

Any ideas based on the test data?

 

Message 28 of 53

I gave it a little look and tried just a couple things.  I put my first suspicions on "Delete From Array" and slightly tweaked the algorithm to get rid of it.  Surprisingly, I didn't produce big gains that way.

 

The only thing I did with any notable effect was to rework the outer For loop so that Parallelization would be allowed.  The main thing needed was to remove the shift register and Replace Array Subset.  I let the output accumulate on a concatenating tunnel instead.  This let the For loop be parallelized very well because the outer loop iterates over independent layers.
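
In text terms, the reworked outer loop is roughly the following Python sketch (illustrative only; it reuses the order_pairs_greedy function from the earlier sketch as a stand-in for whatever each layer actually does):

```python
from concurrent.futures import ProcessPoolExecutor

def process_layer(layer):
    # stand-in for the per-layer work; reuses the greedy ordering from the
    # earlier sketch, but any computation works as long as each layer is
    # independent of the others
    return order_pairs_greedy(layer)

def process_all_layers(layers, workers=4):
    # each layer is handled independently, so the outer loop parallelizes
    # cleanly; collecting the per-layer results afterwards plays the role of
    # the concatenating output tunnel (instead of the shift register +
    # Replace Array Subset pattern)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_layer, layers))

# usage (on Windows, call this under `if __name__ == "__main__":`)
# ordered_layers = process_all_layers(all_layers, workers=4)
```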

 

I've got 4 real cores, 8 virtual cores with hyperthreading.  Setting up parallelism at 4 gave me an almost 4x improvement.  Setting it for 6 or 7 cores maxed me out at just over 5x improvement.

 

Overall, that's not very notable.  Bigger gains might require a different kind of algorithm and perhaps different data structures.

 

Thinking of each point pair as a vector, you seem to want to order them in a "vector chain" sequence that sort-of minimizes the gaps from the end of one vector to the start of the next.  Your algorithm is a brute force approach that amounts to standing at the end of your vector chain, looking out and measuring the distance to all the other unconnected vectors in the present layer, then appending the nearest neighbor to the end of the vector chain.  Repeat until all connected.

 

This algorithm will scale with N-squared, and you won't know how well you did compared to the optimal solution.  The same algorithm applied to the travelling salesman problem would repeatedly move on to the nearest unvisited city.

 

I'd venture that studying some of the traveling salesman literature might give some insights into ways that various brute-force algorithms can go awry. Maybe there are even some good ideas out there. Unfortunately, the general case for a truly optimal solution is considered computationally "hard", and it seems to me that you really do have a variant of the traveling salesman problem here.
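
For instance, one move from that literature that does respect the fixed first/second order of each pair is a relocate (Or-opt style) move: pull one pair out of the chain and reinsert it wherever that shortens the total gap length. (Classic 2-opt segment reversal would flip pair orientations, so it isn't directly usable here.) A minimal Python sketch, using the same illustrative ((x1, y1), (x2, y2)) pair layout as before:

```python
import math

def gap_length(chain):
    # total length of the jumps from the second point of one pair
    # to the first point of the next pair
    return sum(math.hypot(b[0][0] - a[1][0], b[0][1] - a[1][1])
               for a, b in zip(chain, chain[1:]))

def improve_by_relocation(chain):
    """Repeat relocate moves until none of them shortens the total gap.

    Each move removes one pair and reinserts it at another position,
    keeping the pair's own start -> end orientation intact.
    """
    chain = list(chain)
    best_cost = gap_length(chain)
    improved = True
    while improved:
        improved = False
        for i in range(len(chain)):
            for j in range(len(chain)):
                if i == j:
                    continue
                candidate = list(chain)
                candidate.insert(j, candidate.pop(i))
                cost = gap_length(candidate)   # a real version would compute the delta incrementally
                if cost < best_cost:
                    chain, best_cost, improved = candidate, cost, True
    return chain
```

This is still just greedy-plus-local-search, so it gives no optimality guarantee either; it only offers a chance to clean up the worst jumps left behind by the nearest-neighbor pass.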

 

 

-Kevin P

ALERT! LabVIEW's subscription-only policy came to an end (finally!). Unfortunately, pricing favors the captured and committed over new adopters -- so tread carefully.
Message 29 of 53

Thanks Kevin,

Great way of describing it. I just need to find a quick-running traveling salesman algorithm that returns an OK solution with lots of points and doesn't eat up too much RAM.

Message 30 of 53