03-24-2016 08:08 AM
Hello, dear LabVIEWers.
From my Main VI, I need to pass an n-element array to n parallel instances of the same subVI. Each instance will handle a single index of the array, do some logic, and return its output to the main VI.
I know that I need to set the subVI to reentrant so that each instance gets its own independent memory space.
But I cannot use "Asynchronous Call By Reference", because NI says: "You might want to call subVIs asynchronously when the calling VI does not require the results of the subVI immediately" - and I do need the results.
Any advice on a road to follow?
Thanks in advance
LoBa
03-24-2016 08:12 AM
Parallel FOR loops might just do the job for you.
03-24-2016 08:20 AM
Doesn't parallelism in FOR loops depend on the number of cores in your PC?
I might need 15 parallel instances; does that mean I need 15 cores?!
03-24-2016 08:25 AM
Do you really need 15 of them truly running in parallel? If all you are doing is analysis, then just set the parallelism to the number of cores you have and let the auto-indexing do its thing. You will get the data back when all of them are done.
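LabVIEW's parallel FOR loop is graphical, so here is a conceptual analogy in Python (the worker function and data are made up for illustration): a pool sized to the core count chews through all 15 elements, elements beyond the core count simply wait for a free worker, and the results come back together in order, much like auto-indexing on the loop's output tunnel.

```python
from concurrent.futures import ThreadPoolExecutor
import os

def process_element(x):
    # stand-in for the reentrant subVI's per-element logic
    return x * x

data = list(range(15))
workers = os.cpu_count() or 4   # "parallelism" setting = number of cores

with ThreadPoolExecutor(max_workers=workers) as pool:
    # map preserves input order, like auto-indexed output tunnels
    results = list(pool.map(process_element, data))

print(results)
```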
03-24-2016 08:48 AM
I assume you want to use parallelism to the extent possible. One thing you could try would be "unrolling the loop". For example, in place of a For Loop that says "Do the Rows one at a time" (which makes them do first Row 1, then Row 2, then Row 3, etc.), you could wire parallel routines, the first that says "Do Row 1", the second "Do Row 2", the third "Do Row 3". Here's an example (note that Process 1D Array needs to be reentrant, but can be directly called).
When you are working with large arrays, you don't want to be passing the large arrays around multiple routines -- you'll spend a lot of time copying entire arrays. Look up Data Value Reference -- it's a technique for passing pointers to the data so that you can operate on them "in place". As I haven't used this myself, I'm not going to give you an example ...
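Since LabVIEW diagrams don't translate to text, here is a rough Python sketch of the "unrolling" idea (the row data and per-row logic are placeholders): instead of one loop that processes rows one at a time, launch one explicit worker per row, then join them all so the data "comes back together" at the end.

```python
from threading import Thread

def process_row(row, out, i):
    # stand-in for the reentrant "Process 1D Array" subVI
    out[i] = sum(row)

rows = [[1, 2], [3, 4], [5, 6]]
out = [None] * len(rows)

# one explicitly wired worker per row, instead of a sequential loop
threads = [Thread(target=process_row, args=(r, out, i))
           for i, r in enumerate(rows)]
for t in threads:
    t.start()
for t in threads:
    t.join()   # all rows rejoin here, like the merged wires on the diagram

print(out)
```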
Bob Schor
03-24-2016 09:02 AM
@loba wrote: Doesn't parallelism in FOR loops depend on the number of cores in your PC?
I might need 15 parallel instances; does that mean I need 15 cores?!
You should spend some time with this question before going any further in your application.
The problem here isn't the for loop. It's the application design. No matter what implementation you use, you'll be restricted to the number of cores you have. If you have 4, there isn't any implementation in existence that can make those 4 cores run 5 tasks in true parallel. There will have to be some core sharing between at least some of the threads.
If you need this purely parallel computation to complete your task, you'll run into hardware limitations.
03-24-2016 09:35 AM
The others have suggested that this might not be the best solution.
However, here is a direct answer to your question.
Whether it's suitable or not is up to you.
I would make a FOR loop, for each of your 15 input values.
In each iteration, open a reference to your worker VI with the CALL AND COLLECT option (consult the help).
Launch it with START ASYNCHRONOUS CALL.
Accumulate the REFERENCES you get in an array.
Then iterate over the array of references with a WAIT ON ASYNCHRONOUS CALL.
Collect the results from each call and dispose of the references.
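The steps above can be sketched conceptually in Python (the worker function and inputs are invented for illustration; LabVIEW's VI references become future handles here): submit() plays the role of Start Asynchronous Call, the accumulated handles are the array of references, and calling result() on each in turn is the Wait On Asynchronous Call.

```python
from concurrent.futures import ThreadPoolExecutor

def worker(value):
    # stand-in for the reentrant worker VI
    return value + 100

inputs = list(range(15))

with ThreadPoolExecutor() as pool:
    # "Start Asynchronous Call" for each input; keep the handles in a list
    handles = [pool.submit(worker, v) for v in inputs]
    # "Wait On Asynchronous Call" on each handle, in the order we expect
    results = [h.result() for h in handles]

print(results)
```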
This makes it APPEAR that you are running 15 instances in parallel.
It lets LabVIEW allocate the resources as it sees fit.
You get all the results back in the order you expect.
But of course, it cannot run truly in parallel.
You don't know in what order they will finish, because LabVIEW is free to do it however it feels like (the results will be in order, though).
Again, this is not necessarily the optimum solution for your issue, though. Just answering your question.
Blog for (mostly LabVIEW) programmers: Tips And Tricks
03-24-2016 07:32 PM
Steve (CoastalMaine),
Isn't your method similar to my "unrolling" method, the difference being whether the Helper routines are explicitly called in parallel (my code) or "spawned" (my made-up term for Start Asynchronous Calls)? Seems to me that both of us end up with N reentrant Helper routines running "in parallel" (LabVIEW doing its best to give everyone cores and cycles), with the data being split and coming back together when everyone finishes, either explicitly (my code) or implicitly (the "Collect" part of Call and Collect, after the Wait is satisfied).
Is there likely to be a significant difference in run-time, core utilization, whatever between our methods?
Bob Schor
03-24-2016 07:38 PM
@Bob_Schor wrote:Is there likely to be a significant difference in run-time, core utilization, whatever between our methods?
There will probably be a slight difference from the Asynchronous Call needing to dynamically create the clones vs the explicit calling of them. But the parallel FOR loop will do the same as the Asynchronous Call, except it limits the number of instances that can run at a given moment in time to the number of cores on the machine.
03-25-2016 05:56 AM
If you unroll into batches of three, or four, or any given number, you are assuming something about the nature of the problem that doesn't apply to the general case.
If you unroll into batches of four, and problem #3 takes a lot longer than most, then you hold up the next batch (5-8) waiting for #3.
Whereas, if you spawn them all independently, then problems #4 - 15 can proceed, while #3 is grinding.
I generally follow the idea of giving LabVIEW itself as much freedom as possible to run the thing however it can.
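To make the scheduling argument concrete, here is an illustrative Python sketch (the task timings are assumptions, not measurements): with everything submitted at once to a 4-worker pool, the slow task #3 occupies one worker while the scheduler keeps feeding tasks #4 through #14 to the other three, so nothing waits on #3 except #3 itself.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def task(i):
    # task #3 is deliberately slow; all others are quick
    time.sleep(0.5 if i == 3 else 0.02)
    return i

with ThreadPoolExecutor(max_workers=4) as pool:
    # spawn all 15 at once; no fixed batches of 4
    futures = [pool.submit(task, i) for i in range(15)]
    finish_order = [f.result() for f in as_completed(futures)]

# the fast tasks stream through the free workers; #3 finishes last,
# without ever holding up a "next batch"
print(finish_order)
```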
Blog for (mostly LabVIEW) programmers: Tips And Tricks