Blokk
01-08-2018 10:07 AM - edited 01-08-2018 10:12 AM
Hello!
Since I have some time, I decided to clean up/revise some of the VIs in some of my projects. I use an action engine to store sensor data (columnwise) along with absolute time stamps. The data is generated at 1 Hz, and I only need to store 2 days of data (172 800 values per channel). My code works as intended, and according to some simple benchmarking, a data insertion plus a Graph curve read-out operation together take roughly 60 msec. This is just fine for me, because I only call this AE from two parallel loops: in one I insert the new sensor values, and in the other I read out selected curves (max 12) between the user-specified time stamps. Both loops iterate at 1 Hz.
The AE is called: "Data_storage_AE.vi". The VI where I decimate (average) the data curves to be less than the max horizontal pixel size of my Graph (I do not want to plot more points than the number of pixels): "decimate _for_curves.vi".
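For readers without the attached VIs, the averaging decimation described above could look roughly like the Python sketch below. It is not a transcription of "decimate _for_curves.vi"; the function name and the block-size rule (ceiling division) are assumptions made for illustration.

```python
def decimate_by_average(values, max_points):
    """Average consecutive blocks so that at most max_points values remain.

    Hypothetical helper: the block size is chosen by ceiling division, so the
    last block may be shorter than the others.
    """
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    n = len(values)
    if n <= max_points:
        return list(values)  # already fits the pixel width, nothing to do
    block = -(-n // max_points)  # ceiling of n / max_points
    result = []
    for start in range(0, n, block):
        chunk = values[start:start + block]
        result.append(sum(chunk) / len(chunk))
    return result
```

With 172 800 samples and a graph 1000 pixels wide, this gives blocks of 173 samples and 999 plotted points per curve.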
The code works well, but I would like to fix the Rube-Goldberg parts, and also gain some speed if possible. I have some changes in mind and have searched the forum for similar topics, but I thought it best to show you the VIs (I zipped up the relevant VIs/Ctrls from my project, version LabVIEW 2017) 🙂
I am looking for optimization mainly at these locations:
Thanks for any advice, I am willing to learn! 🙂
Ben
01-09-2018 08:06 AM - edited 01-09-2018 08:10 AM
Hmmmm... 50 views and no replies. Maybe you are asking questions that are too hard to answer.  
While I am the author of the Action Engine Nugget, they are not the only hammer in my toolbox.
I do not have time to chase down the answer to your questions because that would require repeated benchmarking of various options. Those could make for nice "mini-questions" if you really want to dig into each of those, break down the code to the minimum and benchmark variations and then possibly ask for others to look over your shoulder.
But I will make some general comments.
1) explicit decimation in LV code is something that I have not been driven to do in a long time. The code behind the charts is pretty smart and I have yet to see a situation where it failed to render peaks and troughs regardless of how silly the number of data points is. Is it worth the trouble? You tell me.
2) Collections of queues make for great memory buffers. I have used them to buffer a lot of data and avoided the contiguous memory requirement for arrays. Since data transfers via queues can occur "in-place" they are wicked fast.
3) Another idea that I have not been driven to develop (yet) is harnessing the speed of the Variant attribute look-ups. Now to just blather off hand... Imagine casting (as Christian has recently illustrated) a time stamp to a string and using that as the variant attribute name, where the value of the attribute is a queue ref. The time stamp would act as a fast index to locate the queue ref that can be used to access the data that is associated with the timestamp.
4) at first glance I see nothing in your code that rules out setting the VI as "sub-routine" priority. Sure you will lose the ability to debug but it could run faster.
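As a rough text-language analogy for point 3, a Python dictionary can stand in for the variant attribute table and a deque for the queue behind each reference. Bucketing by hour and all names below are assumptions made for illustration; none of this comes from the attached VIs.

```python
from collections import deque
from datetime import datetime

# Maps a time-stamp string to a buffer. The dict plays the role of the variant
# attribute table, and each deque plays the role of a queue reference.
buffers = {}

def bucket_key(ts):
    """Cast a time stamp to a string key, truncated to the hour (assumed bucket size)."""
    return ts.strftime("%Y-%m-%d %H")

def store(ts, sample):
    """Append a sample to the buffer that belongs to the time stamp's bucket."""
    buffers.setdefault(bucket_key(ts), deque()).append((ts, sample))

def lookup(ts):
    """Return a copy of all samples stored in the same bucket as ts."""
    return list(buffers.get(bucket_key(ts), ()))
```

Tossing old data then amounts to deleting, or clearing and reusing, the entry for the oldest key.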
That is all that comes to mind before the coffee starts to work.
Have fun and think about taking notes and documenting your adventure in "performance-land". Quoting the daughter of an old girlfriend: "Mommy I like to go fast!"
Ben
01-09-2018 09:25 AM
@Ben wrote:
Hmmmm... 50 views and no replies. Maybe you are asking questions that are too hard to answer.
While I am the author of the Action Engine Nugget, they are not the only hammer in my toolbox.
I do not have time to chase down the answer to your questions because that would require repeated benchmarking of various options. Those could make for nice "mini-questions" if you really want to dig into each of those, break down the code to the minimum and benchmark variations and then possibly ask for others to look over your shoulder.
But I will make some general comments.
1) explicit decimation in LV code is something that I have not been driven to do in a long time. The code behind the charts is pretty smart and I have yet to see a situation where it failed to render peaks and troughs regardless of how silly the number of data points is. Is it worth the trouble? You tell me.
I use Graphs, but I guess it is not relevant to this question whether we use Charts or Graphs. I always thought that throwing large data arrays at a Graph would slow it down. I need to update the Graph at 1 Hz, so I still imagine that, for example, 172 800 points x 12 channels could slow down the GUI. That is the reason I decimate my data. I will test this behaviour later!
2) Collections of queues make for great memory buffers. I have used them to buffer a lot of data and avoided the contiguous memory requirement for arrays. Since data transfers via queues can occur "in-place" they are wicked fast.
I like the idea of using Queues. I will make a variation with Lossy Queues (instead of the shift registers) and test it.
3) Another idea that I have not been driven to develop (yet) is harnessing the speed of the Variant attribute look-ups. Now to just blather off hand... Imagine casting (as Christian has recently illustrated) a time stamp to a string and using that as the variant attribute name, where the value of the attribute is a queue ref. The time stamp would act as a fast index to locate the queue ref that can be used to access the data that is associated with the timestamp.
This is a bit too "exotic" for me at this level 🙂
4) at first glance I see nothing in your code that rules out setting the VI as "sub-routine" priority. Sure you will lose the ability to debug but it could run faster.
I inlined the decimation subVI. I do not see much improvement, but I am a noob at benchmarking 🙂
That is all that comes to mind before the coffee starts to work.
Have fun and think about taking notes and documenting your adventure in "performance-land". Quoting the daughter of an old girlfriend: "Mommy I like to go fast!"
Ben
Thanks Ben for your thoughts! I also do not see a difference in speed in the decimation subVI whether or not I use a parallelized FOR loop. I guess the LV compiler is already smart enough to do the optimization behind the curtain...
What I will test is what happens if I redraw 48 hours of data, 12 channels, at 1 Hz in a Graph. I imagine it will slow things down...?
Ben
01-09-2018 09:41 AM
"Sometimes, when attempting to speed up LabVIEW you may want to start by slowing it down." (Ben)
If you are not seeing much of a difference, it may be that LV can do what you are asking just fine. So to learn how to make it faster, start by beating it up using larger data sets and more frequent updates.
If you are throwing large data sets at a graph, you can use the "Defer Panel Updates" property: defer the screen updates, update the indicator, and then undefer the GUI update. But please note that in some situations, deferring the panel updates may actually make it worse.
Ben
Kevin_Price
01-09-2018 11:13 AM
Also speaking only very generally. These are thoughts and ideas, not fully vetted to the point where I'd call them "advice."
- while agreeing with Ben that the "in-placeness" of data transfer that queues provide can be amazingly helpful, it *sometimes* comes with a cost. Here, if you use a lossy queue for your circular buffer, you'd be adding 1D arrays as individual elements for that queue. Later when you want to retrieve a time-delimited subset of data, it's gonna get kinda awkward. You'll copy out the entire queue contents which will be a 1D array of clusters containing 1D arrays of data. You'll probably end up copying that data into yet another data structure format (such as 2D array of data) before returning it to the caller.
The awkwardness of partial retrieval gets worse when you have variable sized packets being added to the queue. However, I don't honestly know whether the "awkwardness" of the code pays off in some other way.
- I'm kinda inclined to think you're better off with your internal 2D array, but do your own code to make it a circular buffer. I *think* the following tips will help:
-Kevin P
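The lossy-queue trade-off Kevin describes can be sketched in Python, with `collections.deque(maxlen=...)` standing in for a LabVIEW lossy queue. The channel count, capacity and function names are placeholders chosen for illustration. Note the extra pass needed to turn the copied-out elements into per-channel columns.

```python
from collections import deque

CHANNELS = 3   # placeholder; the thread mentions 39 channels
CAPACITY = 5   # placeholder; two days at 1 Hz would be 172800

# A bounded deque silently drops its oldest element when full, like a lossy enqueue.
lossy_queue = deque(maxlen=CAPACITY)

def enqueue(timestamp, row):
    """Add one sample row (one value per channel) as a single queue element."""
    lossy_queue.append((timestamp, list(row)))

def read_range(t_start, t_end):
    """Copy out the matching elements, then reshape them into per-channel columns."""
    selected = [(t, row) for t, row in lossy_queue if t_start <= t <= t_end]
    times = [t for t, _ in selected]
    columns = [[row[c] for _, row in selected] for c in range(CHANNELS)]
    return times, columns
```

Every read scans the whole queue and copies the data twice (once out of the queue, once into columns), which is the awkwardness mentioned above.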
Ben
01-09-2018 11:39 AM
I used to use round-robin buffers I wrote myself, but I stopped after the polymorphic queues were introduced, since they were much faster.
Part of my suggestion for multiple queues was to be able to easily share the data and also easily toss old stuff. Just flush the queue with the old time stamp and use it for fresh stuff.
Just my 2 cents,
Ben
01-09-2018 11:59 AM - edited 01-09-2018 12:19 PM
For my actual application the current code is just fine, but I really like that I am learning some new things, even from a "not necessary" improvement 🙂
I also found this interesting discussion: https://lavag.org/topic/16931-queues-vs-ring-buffers/?page=2
And I will play with this data ref approach: https://forums.ni.com/t5/Example-Program-Drafts/New-Variant-Ring-Buffer/ta-p/3513140
For the second link, note that an older version seems to exist in VIPM, but as far as I can see the only difference is that the unpublished VIP version contains polymorphic VIs...
The problem here, as I see it, is that this add-on only accepts 1D (variant) arrays. But I could just use an array of such data refs, so that each sensor (data column) and the time stamps (one column) would live in an individual data ref buffer. This might be even more efficient than using a 2D array: I would only operate (decimate, etc.) on those channels (now 1D ring buffers) which the user selects from the 39 channels.
Hmm, I will put together some test tomorrow....
Edit: so I will start with something like this:
01-09-2018 02:41 PM
Of course that double array data type is wrong at the top left FOR loop. It must be a single double constant.
drjdpowell
01-09-2018 04:25 PM
Kevin_Price wrote:
- I'm kinda inclined to think you're better off with your internal 2D array, but do your own code to make it a circular buffer. I *think* the following tips will help:
- preallocate the 2D array at the full buffer size. Fill it with values like 0 or NaN, but the preallocation is the most important part.
- keep a separate counter that keeps track of total # samples added to the buffer. This lets you know when the buffer isn't yet full and lets you use modulus functions for indexing purposes.
- make use of Replace Array Subset when adding new data.
- Your time search functions may get trickier since data can now "wrap around"
- When returning a range of data that includes the wraparound, you can use Build Array in "Concatenate" mode to join the 2D subset at the end of the buffer and the 2D subset at the beginning
What Kevin suggests is the best way to relatively easily improve your existing code, though were you to be starting from scratch I would suggest using SQLite.
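Kevin's tips quoted above can be sketched in Python as follows. This is only an illustration under assumptions: the class name is made up, nested lists stand in for a LabVIEW 2D array, slice assignment plays the role of Replace Array Subset, and list concatenation plays the role of Build Array in "Concatenate" mode. Time search across the wraparound is not covered.

```python
class RingBuffer2D:
    """Fixed-size circular buffer of rows (one row per sample, one column per channel)."""

    def __init__(self, capacity, channels, fill=float("nan")):
        self.capacity = capacity
        self.channels = channels
        # Preallocate the full buffer once; it never grows afterwards.
        self.rows = [[fill] * channels for _ in range(capacity)]
        self.total = 0  # total number of samples ever added

    def __len__(self):
        # Number of valid rows: smaller than capacity until the buffer is full.
        return min(self.total, self.capacity)

    def append(self, row):
        if len(row) != self.channels:
            raise ValueError("row length must match the channel count")
        # Overwrite in place at the write position (the modulus handles wraparound).
        self.rows[self.total % self.capacity][:] = row
        self.total += 1

    def read(self, start, count):
        """Return `count` rows starting at logical index `start` (0 = oldest row)."""
        if start < 0 or count < 0 or start + count > len(self):
            raise IndexError("requested range is outside the stored data")
        oldest = self.total % self.capacity if self.total > self.capacity else 0
        first = (oldest + start) % self.capacity
        end = first + count
        if end <= self.capacity:
            picked = self.rows[first:end]
        else:
            # The range wraps: join the end of the buffer with its beginning.
            picked = self.rows[first:] + self.rows[:end - self.capacity]
        return [r[:] for r in picked]  # copies, so callers cannot alter the buffer
```

For example, with a capacity of 4 and six appended rows, `read(0, 4)` returns rows 2 to 5 in chronological order even though they are stored wrapped around.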
01-10-2018 08:56 AM
Thanks all for the useful hints! I will play with my code and try to enhance it 🙂
By the way, I did a stupid thing in that decimation VI (see my first post, last snippet): I calculate the decimated array from the Time Stamps array N times inside the FOR loop! Stupid me! 🙂 It is enough to calculate it once outside the FOR loop, since I use the same time stamp array for all XY curves 🙂
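In text form, the fix amounts to hoisting the loop-invariant time-axis calculation out of the per-channel loop. The Python sketch below assumes time stamps stored as plain numbers (for example seconds) and uses made-up function names; it is not the actual subVI.

```python
def average_blocks(values, block):
    """Average consecutive blocks of `block` values (the last block may be shorter)."""
    out = []
    for start in range(0, len(values), block):
        chunk = values[start:start + block]
        out.append(sum(chunk) / len(chunk))
    return out

def decimate_curves(timestamps, channels, block):
    """Build one (x, y) curve per channel, decimating the shared time axis only once."""
    decimated_time = average_blocks(timestamps, block)  # hoisted out of the loop
    return [(decimated_time, average_blocks(ch, block)) for ch in channels]
```

With 12 selected curves, this saves 11 of the 12 passes over the time stamp array on every read-out.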