04-09-2009 03:49 PM
I'm currently working on a project where we are creating a simulation of a biofuel plant, and I am trying to look into any problems we may encounter in the future. The overall process of a biofuel plant has several steps. Here are the two main design decisions that may cause problems down the road:
1.) Each element is represented as a cluster of characteristics, and each element represents 1 kg of some sort of mixture.
2.) Each step in the overall process is represented using a while loop with a timed loop inside.
One of the major issues I can see with number one is using up too much memory. The biofuel plant we are basing our simulation on processes around 1 million kg of corn a day, and that does not include any other inputs. I ran a simple for loop that enqueued corn-mixture elements into a queue to see the maximum we could hold before running out of memory, and the number came out to about 2.5 million elements. Does anyone know a better way to represent each element? I'm not 100% sure how long they will want to run this simulation, but that maximum could easily be reached in half a day. Also, is there any way to clear the memory being used by the queue and output the final product as it comes, so we hopefully never reach that maximum? I attempted to dequeue each element, write it to a file, and keep going, but this did not seem to free up any space.
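For reference, the behavior I was hoping for looks roughly like this in text form (Python standing in for the block diagram, since I can't paste a diagram here; the 100,000-element cap, the tuple layout, and the CSV output are placeholders I made up):

import queue
import threading

CAP = 100_000                      # hoped-for ceiling on in-memory elements
q = queue.Queue(maxsize=CAP)       # producer blocks instead of growing forever

def producer(n):
    for i in range(n):
        element = (1.0,) * 20      # stand-in for the 1 kg corn cluster
        q.put(element)             # blocks while CAP elements are queued
    q.put(None)                    # sentinel: production finished

def consumer(path):
    with open(path, "w") as f:
        while (element := q.get()) is not None:
            f.write(",".join(str(v) for v in element) + "\n")

t = threading.Thread(target=consumer, args=("product.csv",))
t.start()
producer(1_000_000)                # ~1 million kg of corn per day
t.join()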
For issue number two, I am worried about each loop getting a fair amount of processing time. No single loop is very processor intensive, but the more steps there are, the more while loops must run. They want it to be as "real-time" as possible, and this was the only solution we could think of. I wasn't sure how LabVIEW schedules multiple while loops, but I have a quad-core processor and at least 10-15 while loops, and they all seem to run. I am looking for any other implementation ideas that might clean up the code or be more efficient.
This simulation is intended to be used as a learning tool, giving students an opportunity to control various parts of a simulated biofuel plant. Any suggestions are appreciated.
Thanks,
Kevin
04-09-2009 11:23 PM
Hi Kevin,
You can always flush the queue and then have another loop write the array of clusters to a file while your simulation continues in parallel. The way queues in LabVIEW work, when you dequeue an element the queue still holds on to that memory and reuses it the next time it needs it. Only when another thread requests more memory will LabVIEW transfer the unused memory to it. I hope that makes sense.
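In text form, the idea looks something like this (Python standing in for the block diagram; the 100 ms flush interval and the log format are placeholders):

import queue
import threading
import time

q = queue.Queue()
done = threading.Event()

def flush_all(q):
    # mimic LabVIEW's Flush Queue: return everything queued right now
    batch = []
    try:
        while True:
            batch.append(q.get_nowait())
    except queue.Empty:
        return batch

def logger(path):
    with open(path, "a") as f:
        while not done.is_set() or not q.empty():
            for element in flush_all(q):
                f.write(repr(element) + "\n")
            time.sleep(0.1)        # give the simulation loop time to run

writer = threading.Thread(target=logger, args=("log.txt",))
writer.start()
for i in range(10_000):            # stand-in for the simulation loop
    q.put({"mass_kg": 1.0, "step": i})
done.set()
writer.join()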
You can always have your timed loops run on a particular core with a particular priority. This makes you, the developer, responsible for scheduling the threads properly, since there is always a chance you might starve one of them.
Overall, your architecture seems fine. Just be sure to use the Producer/Consumer design pattern where possible to take care of race conditions and synchronization issues (you can find a template by navigating to File -> New... and then finding the Design Pattern directory).
04-10-2009 08:25 AM
Kevin,
I cannot fully visualize your data structure. Rather than making each element represent 1 kg, could the element contain a numeric value representing the mass of the material in the element? This would probably require some changes to the simulation algorithm, but it would allow large amounts of material with much smaller data structures. A memory limit at 2.5 million elements seems low. How much data is in each element? LV does require contiguous memory for arrays, although I am not sure how that applies to arrays of clusters, especially if the cluster has variable-size components like strings or arrays.
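To make that concrete, something along these lines (Python rather than G, and the field names are only examples):

from dataclasses import dataclass

@dataclass
class Batch:
    mass_kg: float     # e.g. 1000.0 instead of 1000 separate 1 kg elements
    moisture: float    # two of the ~20 characteristics, stored as doubles
    starch: float

def combine(a: Batch, b: Batch) -> Batch:
    # mass-weighted mixing: this is the kind of algorithm change required
    m = a.mass_kg + b.mass_kg
    return Batch(
        mass_kg=m,
        moisture=(a.moisture * a.mass_kg + b.moisture * b.mass_kg) / m,
        starch=(a.starch * a.mass_kg + b.starch * b.mass_kg) / m,
    )

day = Batch(mass_kg=1_000_000, moisture=0.15, starch=0.62)  # one record, not a million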
Nested loops can be problematic if you are not careful to make sure that the termination logic works correctly. Parallel loops will be routed to separate threads and different processors as much as possible by LV's internal scheduler. One thing required to ensure the processors share their time fairly is that each loop must contain a wait, even a 0 ms one.
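Roughly this pattern, with threads standing in for parallel while loops (Python as illustration only; the 15 loops and 10 ms period are arbitrary):

import threading
import time

def process_step(period_s, stop):
    while not stop.is_set():
        # ... one iteration of this step's simulation ...
        time.sleep(period_s)       # the wait; even time.sleep(0) yields

stop = threading.Event()
loops = [threading.Thread(target=process_step, args=(0.01, stop))
         for _ in range(15)]
for t in loops:
    t.start()
time.sleep(1.0)                    # let all 15 "steps" run for a moment
stop.set()
for t in loops:
    t.join()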
I agree with Adnan Z that a Producer/Consumer architecture or other state machine architecture will likely be the way to keep your program robust and adaptable to the inevitable changes.
Checking Buffer Allocations (Tools >> Profile >> Show Buffer Allocations... in recent versions of LV) can be helpful in optimizing memory use. There is also a white paper on Managing Large Datasets in LV on the NI web site; it is worth a look.
Lynn
04-13-2009 03:37 PM
Changing the size of each element of corn or mixture is a possibility, but my user wants to stay away from that right now.
Each element contains about 20 different characteristics of the corn, mixture, water, or whatever it is supposed to represent. I have not been able to get above 2.5 million elements when simply enqueueing elements into a queue. Each element is represented by a single cluster, and I have a queue of individual clusters. The way I tried to find the maximum number of elements may not have been correct: I simply had a while loop that enqueued a dummy corn cluster into a single queue until I hit the memory error. If this is not a valid way to push my queues to their limit, let me know. Also, if this isn't clear, I can attach the VI I used.
I am still playing around with the nested loops and am thinking it may be better to have one large while loop execute the simulation, with separate timed loops for each process. This would give us more flexibility in the control and timing of each process if that becomes a problem down the road.
I have played around with the producer/consumer architecture a little bit and am keeping that in mind when programming everything.
Thanks for your help,
Kevin
04-14-2009 01:41 PM
Please post your VI. Put some typical data into some of the clusters and Make Current Values Default, then save. That way we can see some of your data.
Lynn
04-14-2009 01:55 PM
04-14-2009 03:06 PM
Kevin,
I got to about 9.5 million iterations before it crashed.
That data structure will create problems. The cluster you are enqueuing has three variable-length datatypes inside: the array, the string inside the cluster in the array, and the "Unit of 1 element" string. The queue is probably the best way of handling large numbers of these, because each queue element can be allocated its own memory space (if I understood correctly what I have read about how LV allocates memory). An array of those clusters would require contiguous memory.
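To illustrate why the variable-length fields hurt (Python's struct module as a stand-in; the 20-double layout is an assumption):

import struct

FIXED = struct.Struct("20d")       # 20 doubles: always 160 bytes
print(FIXED.size)                  # 160 -> records pack into one contiguous block

def pack_with_unit(values, unit):
    # a variable-length string makes every record a different size
    return struct.pack(f"20d{len(unit)}s", *values, unit.encode())

print(len(pack_with_unit([0.0] * 20, "kg")))         # 162
print(len(pack_with_unit([0.0] * 20, "kilograms")))  # 169
# Variable sizes force per-element allocations; fixed-size records can
# live in one preallocated, contiguous buffer.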
I would probably take a close look at the simulation algorithm to see whether it could be modified to make better use of the way memory is allocated.
Lynn
04-14-2009 03:34 PM
04-15-2009 08:06 AM
KKnowles wrote:
I have updated the test VI to remove the array and strings you listed. I don't believe they will be used anywhere, and all the characteristics can be described as doubles. I ran it again and still only get 2.4-2.5 million elements. The machine I am working on has 2 GB of RAM. Where are these elements stored, and what would be causing the limit?
I changed the queue data to be an array of your clusters and re-ran the test, queuing up 1000 at a time. I got 14,555 iterations, so the total was about 14.5 million elements.
All of the data has to fit into memory, and a 32-bit Windows process is (by default) limited to 2 GB of address space.
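Back-of-envelope (assuming ~20 doubles, 160 bytes, per cluster, and ignoring overhead):

per_cluster = 20 * 8                     # ~20 doubles per cluster
single = 2_500_000 * per_cluster         # your one-cluster-per-element test
batched = 14_555 * 1_000 * per_cluster   # my 1000-cluster arrays
print(f"single:  {single / 2**30:.2f} GiB")   # ~0.37 GiB of payload
print(f"batched: {batched / 2**30:.2f} GiB")  # ~2.17 GiB of payload
# Crashing at ~0.37 GiB of payload one way but ~2.17 GiB the other
# suggests per-element queue overhead, not the data itself, was eating
# most of the 2 GB in the unbatched test.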
Please remind me why all of this has to be in memory at one time?
Ben
04-15-2009 10:03 AM