06-16-2017 12:46 PM
OpenG has a more performant version of the Trim Whitespace using byte arrays instead. I think it also does the forward and backward searches in parallel.
06-16-2017 12:49 PM
@crossrulz wrote:
OpenG has a more performant version of the Trim Whitespace using byte arrays instead. I think it also does the forward and backward searches in parallel.
Can someone CCT that?
06-16-2017 12:53 PM
Well, Jeff's solution is nice and looks very quick. However, I use this as part of a lot of communication protocols to clean up responses from user or remote inputs, so the strings are short. It would be odd to trim whitespace on mega-character inputs, but I can see the memory footprint being a problem.
Other VIs in the utility branch are all subroutine priority and usually reentrant. It seems that this one should be as well. (I think there is an old post by AQ claiming authorship....). There are probably faster ways to do it, and Jeff's solution is good until we get to the megacharacter string case, which requires a duplicate array to be reversed, doubling the storage. But for 10-character arrays there isn't much timing advantage either way.
Note the OpenG version is like Jeff's and has the string reversal, but it compares against a boolean array of allowed characters and then makes a substring. Instead of creating a reversed string, the compare should index backwards.
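Since a block diagram can't be pasted as text, here is a rough Python sketch of the "index backwards instead of reversing" idea. It is not the OpenG or NI implementation; the whitespace set (space, tab, line feed, carriage return) and the names `IS_WHITESPACE` and `trim_whitespace` are assumptions made for illustration only.

```python
# Assumed whitespace set: space, tab, line feed, carriage return.
WHITESPACE_BYTES = b" \t\n\r"

# 256-entry boolean lookup table: IS_WHITESPACE[b] is True when byte b is whitespace.
IS_WHITESPACE = [b in WHITESPACE_BYTES for b in range(256)]


def trim_whitespace(data: bytes) -> bytes:
    """Trim both ends using two indices; no reversed copy of the input is built."""
    start = 0
    end = len(data)  # exclusive end index

    # Forward scan: stop at the first non-whitespace byte.
    while start < end and IS_WHITESPACE[data[start]]:
        start += 1

    # Backward scan: index from the end instead of reversing the array.
    while end > start and IS_WHITESPACE[data[end - 1]]:
        end -= 1

    # Only the final substring is copied.
    return data[start:end]


if __name__ == "__main__":
    print(trim_whitespace(b"  \tOK 42\r\n"))
```

Each loop stops at the first non-whitespace byte, so for short protocol responses the work is a handful of table lookups, and for very large inputs the only extra allocation is the returned substring.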
Should NI have made the basic utility reentrant? Should it be a CAR for the next release?
06-16-2017 01:07 PM - edited 06-16-2017 01:08 PM
@JÞB wrote:
@crossrulz wrote:
OpenG has a more performant version of the Trim Whitespace using byte arrays instead. I think it also does the forward and backward searches in parallel.
Can someone CCT that?
Do note that the VI is set to be Preallocated Clone Reentrant and Subroutine priority.
06-16-2017 01:08 PM
@sth wrote:
Should NI have made the basic utility reentrant? Should it be a CAR for the next release?
Yes, Trim String.vi I attached is limited to an I32 size string. It would never work on the whole text of War and Peace (But there are better ways to deal with that)
That being said, without R&D chiming in on the inner workings of Match Pattern, I would not just inline Trim Whitespace.vi. Although it appears to be a good candidate for inlining, there may be real reasons not to inline it.
06-16-2017 01:26 PM - edited 06-16-2017 01:38 PM
@crossrulz wrote:
@JÞB wrote:
@crossrulz wrote:
OpenG has a more performant version of the Trim Whitespace using byte arrays instead. I think it also does the forward and backward searches in parallel.
Can someone CCT that?
Do note that the VI is set to be Preallocated Clone Reentrant and Subroutine priority.
Well, That's just SILLY
Why are we expanding a U8 to an I32 and adding a 0 to the resulting index? The only time it is 1 is if the whole string is whitespace, and then it doesn't matter whether we return Index or Index + 1: we get an empty string either way.
And go ahead
Look at the output of this:
They re-wrote "is whitespace"!
06-16-2017 01:29 PM
@JÞB wrote:
Yes, Trim String.vi I attached is limited to an I32 size string. It would never work on the whole text of War and Peace
I32 can handle 2.1 billion characters.
While it depends on the translation you use, War and Peace has at most around 600,000 words. Even if the entire text is just the longest word in English repeated over and over again (pneumonoultramicroscopicsilicovolcanoconiosis, 45 characters) plus punctuation and spaces (max of 5 characters per word), that's just 30 million characters. You could fit that in an I32 71 times and still have over 15 million characters left over.
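For anyone who wants to check that arithmetic, a quick Python sketch using only the figures from this post (600,000 words, 45 + 5 characters per word). The variable names are just illustrative.

```python
I32_MAX = 2**31 - 1            # largest value a signed 32-bit integer can hold
WORDS = 600_000                # upper estimate of the word count from the post
CHARS_PER_WORD = 45 + 5        # longest word plus punctuation/space allowance

worst_case_chars = WORDS * CHARS_PER_WORD
copies, leftover = divmod(I32_MAX, worst_case_chars)

print(f"worst case: {worst_case_chars:,} characters")
print(f"fits {copies} times with {leftover:,} characters left over")
```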
06-16-2017 01:41 PM
@Kyle97330 wrote:
@JÞB wrote:
Yes, Trim String.vi I attached is limited to an I32 size string. It would never work on the whole text of War and Peace
I32 can handle 2.1 billion characters.
While it depends on the translation you use, War and Peace has at most around 600,000 words. Even if the entire text is just the longest word in English repeated over and over again (pneumonoultramicroscopicsilicovolcanoconiosis, 45 characters) plus punctuation and spaces (max of 5 characters per word), that's just 30 million characters. You could fit that in an I32 71 times and still have over 15 million characters left over.
What have you been breathing?
06-16-2017 01:41 PM
So maybe the question is why trim whitespace.vi hasn't been rewritten to be optimized AND re-entrant! The OpenG version is a good place to start, but it should probably search sequentially for whitespace characters from the front and then from the back. I think the reverse array is a big hit in time and memory for large arrays.
Each search loop should simply test each character with the "is white space" check and stop at the first false result.
All subroutine-level calls must use preallocated memory to be re-entrant. The issue is whether it should really be subroutine priority. That may also block normal operation.
(Note: at 600K words, or approximately 6M characters give or take, War and Peace will easily fit in an I32-length string)
https://indefeasible.wordpress.com/2008/05/03/great-novels-and-word-count/
I think we need to worry more if we are doing DNA string searching/manipulation....
06-16-2017 01:59 PM
@JÞB wrote:
@Kyle97330 wrote:
@JÞB wrote:
Yes, Trim String.vi I attached is limited to an I32 size string. It would never work on the whole text of War and Peace
I32 can handle 2.1 billion characters.
While it depends on the translation you use, War and Peace has at most around 600,000 words. Even if the entire text is just the longest word in English repeated over and over again (pneumonoultramicroscopicsilicovolcanoconiosis, 45 characters) plus punctuation and spaces (max of 5 characters per word), that's just 30 million characters. You could fit that in an I32 71 times and still have over 15 million characters left over.
What have you been breathing?
I can tell you that I haven't been breathing ultra-microscopic particles of silica volcanic dust, because that would cause me to contract pneumonoultramicroscopicsilicovolcanoconiosis.