LabVIEW Idea Exchange

stbe

trim whitespace

Status: New

The current version of Trim Whitespace.vi uses regular expressions, which are quite slow and not needed here, since a simple search and a substring function are all that is required.

Therefore, I suggest throwing out the regex functions and replacing them with G code that looks for the same whitespace characters (or even extending the selection to match the OpenG variant).

I use the presented version in all my string processing functions, but many shipped VIs (especially those in NI_LVConfig.lvlib) use the trimming functions a lot. Since I work with a lot of config files, this is starting to become the bottleneck of my overall LV code.

 

[image]

 

left-trim sub-VI:

[image: left-trim sub-VI]

 

right-trim sub-VI:

[image: right-trim sub-VI]

 

Performance metrics suggest a speedup of about a factor of 15 for short strings and even more (>35) for longer strings.
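
Since the block diagrams are images, here is a rough text-language sketch of the same search-and-substring idea in Python. It is not a translation of the actual VIs: the function names are hypothetical, and the whitespace set (space, tab, carriage return, line feed) is an assumption about what the stock VI treats as whitespace. The point is only that trimming needs two index scans and one substring, with no regex engine involved.

```python
# Assumed whitespace set: space, tab, carriage return, line feed.
# Extend this set to cover a broader definition of whitespace.
WHITESPACE = frozenset(" \t\r\n")


def left_trim_index(s: str) -> int:
    """Return the index of the first non-whitespace character (len(s) if none)."""
    i = 0
    while i < len(s) and s[i] in WHITESPACE:
        i += 1
    return i


def right_trim_index(s: str, start: int = 0) -> int:
    """Return one past the last non-whitespace character, never going below start."""
    j = len(s)
    while j > start and s[j - 1] in WHITESPACE:
        j -= 1
    return j


def trim_whitespace(s: str) -> str:
    """Trim both ends using two scans and a single substring operation."""
    start = left_trim_index(s)
    end = right_trim_index(s, start)
    return s[start:end]
```

For example, `trim_whitespace("  \tkey = value \r\n")` yields `"key = value"`, and a string consisting only of whitespace yields an empty string.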

_________________________
CLA
13 Comments
altenbach
Knight of NI

I don't understand your use of the in place structure. Since the data size can change, things cannot be done in place anyway if I understand this right.

 

If you simply remove the in-place structures in the subVI, we gain another 3x in speed! 😄

Intaris
Proven Zealot

That was my first thought exactly; the in-place structure has no purpose.

Darin.K
Trusted Enthusiast

Discretion is often the better part of valor, so I agree (in principle) with this idea. There are few bigger advocates for regexes around here, but even I have rewritten this one to avoid them. There is a lot of overhead involved in passing the strings back and forth to the external library.

 

My revision stays 'pink', and I have found it to be smoking fast in my uses, certainly better than the Match Regular Expression flavor.

 

[image]

 

[image]

stbe
Active Participant

You're right. Even if a case structure filtered out the possible zero-index case (and performed no substring operation), a buffer allocation would still take place. I guess I was so enthusiastic about the in-place structure that I use it more often than necessary ...

_________________________
CLA
altenbach
Knight of NI

First of all, your benchmark is completely flawed!

 

Reason: you are inlining the subVIs and are not actually using the outputs. This means that everything is considered dead code and stripped out by the compiler.

 

Why trim the whitespace if you're not using the output??!

 

If you inlined the stock "trim whitespace" subVI, your benchmark would also execute in 0 ms for that same reason. Try it!

 

I guess your code is not completely stripped, maybe because of some of the structures (?), but if you change them, e.g. as in this quick modification, they can be completely stripped, making them infinitely fast. The benchmark is fake!

 

altenbach
Knight of NI

You should also be careful with "loop invariant code". For example, the compiler will see that each iteration of the inner FOR loop operates on the same data, so the result needs to be calculated only once.

 

In summary, the new compiler (details), especially with inlining and other optimizations, poses another minefield when trying to do reliable benchmarking. 😄
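
The two pitfalls above (unused outputs being treated as dead code, and identical per-iteration work being hoisted out of the loop) suggest a benchmark shape where every result is consumed and the inputs differ between calls. The sketch below shows that shape in Python. All names are hypothetical, and the caveat is important: CPython does not perform these optimizations itself, so this only illustrates how such a benchmark is structured, not how the LabVIEW compiler behaves.

```python
import time


def trim(s: str) -> str:
    """Stand-in for the routine under test (assumed whitespace set)."""
    return s.strip(" \t\r\n")


def make_inputs(count: int) -> list:
    """Build distinct strings so each call operates on different data."""
    return [" " * (i % 5) + f"key{i}=value{i}" + "\t" * (i % 3) for i in range(count)]


def benchmark(trim_fn, inputs, repeats: int):
    """Time trim_fn over the inputs and return (elapsed seconds, checksum).

    The checksum depends on every output, so the trimmed strings are
    actually used rather than computed and thrown away.
    """
    checksum = 0
    start = time.perf_counter()
    for _ in range(repeats):
        for s in inputs:
            checksum += len(trim_fn(s))
    elapsed = time.perf_counter() - start
    return elapsed, checksum
```

Reporting the checksum alongside the elapsed time also gives a quick sanity check that two competing implementations produced equivalent results.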

Darin.K
Trusted Enthusiast

Picking up my latest copy of the Instrumentation Newsletter shows me a couple of things. First of all, Trim Whitespace is automatically inlined by LV10. More importantly, it seems that they use Match Pattern and not Match Regular Expression, as I had inferred from this Idea posting. I may quibble with their limited idea of whitespace, but the implementation seems reasonable. If I had benchmarked my 'revision', I would probably have gotten very similar results to the built-in version.

 

I am curious to see some meaningful benchmarks; perhaps I'll try. It seems that as the LV compiler gets more clever, it gets harder to fool.

 

OT: If someone else reads that IN article and can explain why the string control can be moved outside the For Loop, I would be interested.

 

 

altenbach
Knight of NI

> Whitespace is automatically inlined by LV10

 

Actually, it is apparently not inlined in the released build. I filed a bug report.

altenbach
Knight of NI

> OT: If someone else reads that IN article and can explain why the string control can be moved outside the For Loop I would be interested.

 

The example in the article I quoted above is basically the same, but there the string control is a diagram constant (!). I agree that the control cannot be moved outside the loop, especially if the subVI takes a long time to execute, giving the operator a chance to modify the control while the loop is running. Only if the subVI is fast can we assume that the control remains invariant for the duration of the loop. Still, the compiler cannot safely make that assumption and move it out.

 

I think the print article is wrong!

JackDunaway
Trusted Enthusiast

>> If someone else reads that IN article and can explain why the string control can be moved outside the For Loop I would be interested.

 

If it's the same example as in the compiler improvements article to which altenbach linked, it's due to Loop Invariant code. Loop Invariant code shows up as fuzzy wires when Constant Folding is turned on in the IDE prefs.

 

***EDIT: altenbach beat me ***