10-24-2017 06:46 PM
I have a 1D array of strings that I wrote a simple VI to search through. Of course with straight-forward methods, I have to spell the search string exactly the same way as the element I'd like it to return. I was contemplating ways around this and realized people must have needed to do this before. Has anyone out there come up with an efficient way of searching an array of strings that accounts for user errors such as spelling mistakes and just returns the closest match?
10-24-2017 07:06 PM
How do you define closest match? Fewest mismatched characters? Closest on a QWERTY keyboard? At what point do you say the word is not in your list of words, rather than just spelled wrong?
Why is the user typing in a word that may or may not be in your list? You could put all the strings into a combo box and let the user select the exact string available.
10-25-2017 08:06 AM
If you'd like to read up on the topic:
https://en.wikipedia.org/wiki/Approximate_string_matching
But I would go with gregryj and define the selection beforehand.
Don't trust user input 😉 they are the worst
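For anyone who wants to try approximate matching outside LabVIEW first, here is a minimal Python sketch of a "closest match, or nothing" search. It is not from this thread: it leans on the standard library's `difflib.get_close_matches`, and the helper name `closest_match`, the sample list, and the 0.6 cutoff are all assumptions chosen for illustration. The cutoff is one possible answer to the question of when a word counts as "not in the list" rather than misspelled.

```python
import difflib

def closest_match(query, candidates, cutoff=0.6):
    """Return the candidate most similar to query, or None if none scores >= cutoff.

    Hypothetical helper: comparison is case-insensitive, but the original
    spelling of the candidate is returned.
    """
    # Map lowercase form -> original spelling (first occurrence wins).
    lowered = {}
    for candidate in candidates:
        lowered.setdefault(candidate.lower(), candidate)

    # get_close_matches ranks by difflib's similarity ratio (0.0 to 1.0)
    # and drops anything below the cutoff.
    matches = difflib.get_close_matches(
        query.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None

# Made-up list standing in for the 1D array of strings.
names = ["Voltage", "Current", "Temperature", "Pressure"]
print(closest_match("temprature", names))  # a misspelling of one entry
print(closest_match("xyz", names))         # nothing similar in the list
```

Raising the cutoff makes the search stricter (more "not found" results); lowering it makes it more forgiving but more likely to return an unrelated entry.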
10-25-2017 11:09 AM - edited 10-25-2017 11:10 AM
Back in the mid-seventies, in a Pascal course (?), we did something similar to the following.
I don't remember the details. Probably would need some research.... 😄
10-25-2017 11:34 AM - edited 10-25-2017 11:34 AM
So I have some pretty terrible code I don't mind sharing that I wrote a while ago with zero research. It is a function that takes two strings and tells you how similar they are. I used it for finding an artist that closely matched a Google search. If the score was above some value, it would just use the new artist name; otherwise it would prompt whether the new name or the old one was the right one.
It was written in the 8.x era and appears to care about capitalization, which it probably shouldn't. Also, there isn't much documentation, so good luck.
Unofficial Forum Rules and Guidelines
Get going with G! - LabVIEW Wiki.
17 Part Blog on Automotive CAN bus. - Hooovahh - LabVIEW Overlord
10-25-2017 01:07 PM
I’ve not used it, but you could look into the spellfix1 extension to SQLite.
10-25-2017 02:15 PM - edited 10-25-2017 02:26 PM
@Hooovahh wrote:
So I have some pretty terrible code ...
Just for kicks, here is a quick and rough draft of a near literal translation that uses more modern tools (lexical class, swap values, conditional tunnels, etc) and is mostly blue....
Seems to give the same results if you convert all to lowercase in your inputs. Not sure how sound the algorithm is 😄
10-25-2017 03:11 PM - edited 10-25-2017 03:13 PM
Thank you, and kudos given. Yeah, this technique is mostly about whether each letter follows the letter that is expected to precede it. So if my name is Brian and someone spells it Brain, it will find the 'r' following 'b', and that's it: 'a' shouldn't come after 'r', 'i' shouldn't come after 'a', and 'n' shouldn't come after 'i'. This results in a pretty poor score of 0.3 when all you have is one pair of letters swapped. Now that I look at it, this is a pretty bad way of doing things, and someone should come up with a better technique. Maybe also count how many times each letter appears? Or maybe look at the letters that are wrong and see if they are close to the correct key on the keyboard? Then you might also need a list of commonly misspelled words.
Now that I'm talking this through, this feels more like a cryptography problem with brute force and rainbow tables.
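One commonly used way to keep a single swapped pair like Brian/Brain from tanking the score is an edit distance that counts an adjacent transposition as one edit. The Python sketch below uses the optimal string alignment variant (a restricted Damerau-Levenshtein distance). To be clear, this is a swapped-in technique, not the algorithm in the VI discussed above, and the function name `osa_distance` is just an illustrative choice.

```python
def osa_distance(a, b):
    """Optimal string alignment distance between strings a and b.

    Counts insertions, deletions, substitutions, and swaps of two
    adjacent characters, each as one edit. Lower means more similar.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = distance between the first i chars of a and first j chars of b.
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "ia" <-> "ai".
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]

# Lowercase both inputs first if capitalization should not matter.
print(osa_distance("brian", "brain"))  # 1: a single adjacent swap
```

Keyboard-distance weighting or a list of commonly misspelled words could be layered on top by changing the substitution cost, but that is beyond this sketch.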
10-25-2017 03:18 PM - edited 10-25-2017 03:20 PM
Oh wow, there are lots of things on the internet. One I found on LAVA is an attempt at implementing the Levenshtein distance. The lower the number, the more similar two strings are.
https://lavag.org/files/file/64-strings-levenshtein-distance/
Here's some other stuff in one dimensional programming:
http://php.net/manual/en/function.similar-text.php
https://stackoverflow.com/questions/26446348/checking-if-one-string-is-similar-to-another
https://stackoverflow.com/questions/17388213/find-the-similarity-percent-between-two-strings
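For reference, the Levenshtein distance mentioned above fits in a few lines of text-based code. This Python sketch is a generic textbook version, not a translation of the LAVA VI, and the name `levenshtein` and the sample strings are only illustrative.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] = distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]

print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("brian", "brain"))     # 2: a swap costs two substitutions
```

To pick the closest element of an array, compute the distance from the search string to every element and take the minimum, for example `min(words, key=lambda w: levenshtein(query, w))`, then reject the result if the distance is above a threshold you choose.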
10-25-2017 04:58 PM
You might also obtain some insights by Googling "spell checking algorithm". I would think that the algorithm for listing suggestions is the same kind of thing.