06-01-2010 07:04 AM
(LabVIEW 2009)
Greetings fellow wireworkers,
I am quite experienced with LabVIEW, but relatively new to XML, so I'm not sure whether there's an easy solution to my problem. Here's what I'm trying to do:
My code receives an XML string, and my goal is to parse it, obtaining information about all the parents and children/siblings and reading their values.
I have been trying to use the "Load XML String.vi", which can be found under the vi.llb/xml path. The XML string I am receiving is UTF-8 encoded, and I guess this is basically THE problem: LabVIEW doesn't support Unicode. When I run my vi, the mentioned vi returns an error indicating "UTFDataFormatException".
If I change the encoding to a single-byte one, for example windows-1251, the XML string is loaded with no problems and I am able to parse it. The thing is that my vi really needs to be able to deal with any characters, including Russian, Japanese, Chinese etc.
I am wondering whether there's a simple workaround if I want to use the Load XML String.vi, or whether I should just parse the XML string myself and give up all the handy methods and properties the mentioned vi would provide.
Please find attached a small example vi and some pictures, showing the diagram and the front panel.
With best wishes,
Cerati
06-02-2010 03:45 AM
06-02-2010 04:23 AM
Thanks Mike, I will give it a try.
Cerati
06-02-2010 06:45 AM
Hello Cerati,
you may have an encoding problem: The character content of <ChildDataString> is not valid UTF-8.
UTF-8 is US-ASCII compatible: the code points 0...127 (the 7-bit range) are encoded as the same single
bytes as in ASCII. The character data type of LabVIEW, however, is 8 bits long: The range 128...255 is
used for locale-specific encoded characters. UTF-8 never uses a byte from this range as a character on
its own; it represents every character outside 0...127 by a multi-byte sequence of 2...4 bytes.
There's a nice explanation on Wikipedia:
http://en.wikipedia.org/wiki/UTF-8
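To make the multi-byte rule concrete, here is a small Python sketch (Python only because LabVIEW diagrams cannot be pasted as text). The sample characters and the helper name `utf8_length` are made up for illustration; they are not taken from the attached vi.

```python
# Illustrative only: count how many bytes UTF-8 needs per character.
# The sample characters are arbitrary picks, not data from the thread.

def utf8_length(ch: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of one character."""
    return len(ch.encode("utf-8"))

samples = ["A", "å", "Я", "日", "😀"]
lengths = {ch: utf8_length(ch) for ch in samples}

for ch, n in lengths.items():
    # Show the raw bytes in hex so the lead and continuation bytes are visible.
    print(ch, n, ch.encode("utf-8").hex(" "))
```

Only the ASCII letter fits in one byte; every other character is spread over several bytes, each of which lies in the range 128...255.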
So first you have to make sure that your XML really is UTF-8 encoded. Just setting encoding="UTF-8"
in the XML declaration is not enough; it only tells the parser which encoding to expect. In your case
the parser expects UTF-8 but gets byte sequences that are not allowed in UTF-8. Thus you get the message
Message:invalid byte 2 (å) of a 2-byte sequence.
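A quick way to check this outside LabVIEW is to try decoding the raw bytes strictly as UTF-8. The following Python sketch assumes, purely for illustration, that the stray character was stored as a single Latin-1 byte; the actual encoding of the leftover bytes in the attached XML is not known, and `is_valid_utf8` is a hypothetical helper name.

```python
# Minimal sketch: detect bytes that are not valid UTF-8.
# Assumption (not from the thread): the bad character was saved as Latin-1.

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")  # strict decoding raises on invalid sequences
        return True
    except UnicodeDecodeError:
        return False

text = "<ChildDataString>å</ChildDataString>"

latin1_bytes = text.encode("latin-1")  # 'å' becomes the single byte 0xE5
utf8_bytes = text.encode("utf-8")      # 'å' becomes the two bytes 0xC3 0xA5

print(is_valid_utf8(latin1_bytes))
print(is_valid_utf8(utf8_bytes))

# If the source encoding is known, re-encoding repairs the data.
repaired = latin1_bytes.decode("latin-1").encode("utf-8")
print(is_valid_utf8(repaired))
```

The lone byte 0xE5 looks like the start of a multi-byte sequence but is followed by an ASCII byte, which is the kind of input a UTF-8 parser rejects.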
candidus
06-02-2010 07:13 AM
Candidus,
I think you are right. I do indeed have some "non-supported" characters in the ChildDataString. They are basically leftovers from my previous tests. I guess I thought UTF-8 would be able to handle almost any kind of character, so I didn't pay much attention to them. I didn't realize that this was precisely the source of the error message I received. Many thanks for pointing that out 🙂 I will continue my studies a bit wiser now.
Cheers,
Cerati
06-29-2010 03:01 AM
Hi Cerati,
I recently found a post in the forum that might be of interest:
LV has built-in functions to encode/decode UTF-8:
http://forums.ni.com/t5/LabVIEW/undocumented-function-quot-text-to-utf-8-quot/m-p/1156893
06-29-2010 04:36 AM
Candidus,
It's good that you linked that post here in case someone else is lost in the Unicode jungle. I personally discovered that same post earlier, when I was searching for more information.
As far as my original problem is concerned, I couldn't get any ready-made solution working satisfactorily, so I just wrote my own XML parser, and it's working quite well for me. Anyway, I appreciate all the help I received here a lot.
Cheers,
Cerati