08-05-2022 09:35 AM
Hi Community,
I have taken a look around in the existing topics, but didn't find an appropriate solution, so here goes my problem:
With a REST request I get back a JSON payload that contains German characters, e.g. "ä", "ü", "ö". When I pretty-print it in Postman or Insomnia I get:
Looking at the raw data, that turns out to be:
"parentCustomerNameASC":"Bl\u00fctenhaus GmbH"
So, in short: is there a way to convert the JSON Unicode escape "\u00FC" to "ü"? See link here.
As far as I can see this is a mixed problem of JSON and encoding. The only viable solution I found so far is to go through such a string manually, as suggested here.
I am using the JKI REST client and JDP's JSONtext libraries to obtain the initial string.
If anyone can suggest any (elegant) solution in LabVIEW/Python/.NET, that would be much appreciated.
Cheers,
Niko
08-05-2022 11:05 AM
My apologies. It seems that, being English, I neglected to implement full Unicode support for the \uXXXX format. Just created an issue: https://bitbucket.org/drjdpowell/jsontext/issues/111/implement-full-unicode-in-uxxxx-format
08-05-2022 11:33 AM - edited 08-05-2022 11:34 AM
The built-in Unflatten from JSON nodes should support this case
08-05-2022 12:20 PM
Try this. It is the same as the Tools Network version, plus this one fix.
08-05-2022 01:54 PM
@MilanR wrote:
The built-in Unflatten from JSON nodes should support this case
Only if your Windows locale is set to German, or more precisely to Western Europe codepage 1252, or to Turkish codepage 1254. It wouldn't work with most other codepages.
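The codepage dependence can be illustrated outside LabVIEW. The following is a minimal Python sketch, not a description of how the Unflatten from JSON node works internally: it only checks whether "ü" (U+00FC) has a single-byte representation in a few Windows codepages. The helper name `encode_in_codepage` and the choice of codepage 1251 (Cyrillic) as a counterexample are assumptions made for illustration.

```python
def encode_in_codepage(char, codepage):
    """Return the bytes for char in the given codepage, or None if it cannot be mapped."""
    try:
        return char.encode(codepage)
    except UnicodeEncodeError:
        return None


# cp1252 = Western Europe, cp1254 = Turkish, cp1251 = Cyrillic
for codepage in ("cp1252", "cp1254", "cp1251"):
    encoded = encode_in_codepage("\u00fc", codepage)
    print(codepage, encoded.hex() if encoded is not None else "not representable")
```

On codepages 1252 and 1254 the character maps to the single byte 0xFC, while codepage 1251 has no slot for it, so any conversion that targets the system codepage has nothing to convert to.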
08-08-2022 07:26 AM
There is really no need to apologize!
You provided an awesome toolkit, it is definitely not your responsibility to care for all the annoying characters in foreign languages 🙂
I installed your update and I am not sure where I make the mistake, but I am not getting the correct output.
The string that I am using as input is the following:
[{"@odata.etag":"W/\"Jn\"","status":"Released","no":"101010","description":"ECO","startingDate":"2021-03-31","startingTime":"08:00:00","endingDate":"2021-03-31","endingTime":"23:00:00","dueDate":"2021-04-01","quantity":2,"m365SalesOrderNo":"1596","parentCustomerNameASC":"Blütenhaus GmbH","m365SalesOrderLineNo":10000,"salesOrderNoASC":"","customerNameASC":"","salesShipmentPositionASC":0,"itemNoASC":"","priorityCodeASC":"","startMachineCenterASC":"","productionAreaASC":""}]
I attached the code that I use:
But the result still seems to be the same:
Thank you for your help in advance!
Cheers,
Niko
08-08-2022 08:16 AM
I can see in your screenshot that your JSON string is "Bl\u00fctenhaus GmbH", so that is what that function returns. Perhaps you want to convert that to the ordinary string 'Blütenhaus GmbH'? In that case you need to use a function that converts from JSON format to regular LabVIEW types, like "From JSON". The difference between a JSON string and an ordinary string can be confusing.
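Since the original question also asked about Python, here is a minimal sketch of the same distinction there, using only the standard `json` module. The payload is shortened to the one field from the question; `json.loads` plays the role that "From JSON" plays in JSONtext, turning the JSON string into an ordinary string and resolving the escape along the way.

```python
import json

# Raw JSON text as received: the \u00fc escape is six literal ASCII characters.
# The r-prefix keeps Python from interpreting the backslash itself.
raw = r'{"parentCustomerNameASC":"Bl\u00fctenhaus GmbH"}'

# Parsing converts the JSON string into an ordinary Python string.
record = json.loads(raw)
name = record["parentCustomerNameASC"]

# name is now "Blütenhaus GmbH": one character where the escape used to be.
assert name == "Blütenhaus GmbH"
print(len(r"Bl\u00fctenhaus GmbH"), len(name))  # escaped length vs. decoded length
```

No manual search-and-replace over the string is needed; the escape is part of the JSON syntax and disappears as soon as the text is parsed rather than displayed raw.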
08-08-2022 08:40 AM
I don't know what happened in the post above, but the input string would contain:
"parentCustomerNameASC":"Bl\u00fctenhaus GmbH"
But if I understand you correctly, this is not an encoding problem, but actually a JSON string. This simply means I was looking at it from the wrong direction, thinking it was a Unicode/LabVIEW problem. But actually it is simply a JSON representation of an "ü"?
Do I understand you correctly that I simply implement a conversion function where this might occur?
If so, this is really embarrassing 🙈
Thanks a lot anyway for helping me simply understand the issue.
08-08-2022 08:57 AM
@Universaldilletant wrote:
I don't know what happened in the post above, but the input string would contain:
"parentCustomerNameASC":"Bl\u00fctenhaus GmbH"
But if I understand you correctly, this is not an encoding problem, but actually a JSON string. This simply means I was looking at it from the wrong direction, thinking it was a Unicode/LabVIEW problem. But actually it is simply a JSON representation of an "ü"?
Yes, characters outside standard ASCII can be written as their Unicode number: \u followed by the four-digit hex value.
08-08-2022 09:16 AM
Encodings are confusing, especially when using a variety of software tools that differ in how well they support different encodings (and some of which may "helpfully" convert encodings silently). It is strange, for example, that your REST request returns '\u00fc' for that character, when it can more easily be represented in UTF-8 as the two-byte sequence 0xC3 0xBC. Normally, I would only expect the \uXXXX format to be used for control characters or \u0000, which is why I never implemented it fully before.
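Both representations can be compared in a short Python sketch. What the server in this thread actually does is unknown; the snippet only shows that some JSON serializers escape non-ASCII characters by default (Python's `json.dumps` with `ensure_ascii=True` is one of them), which is one plausible reason a REST service might send '\u00fc' instead of raw UTF-8.

```python
import json

text = "Blütenhaus GmbH"

# In UTF-8, "ü" (U+00FC) is the two-byte sequence 0xC3 0xBC.
assert "ü".encode("utf-8") == b"\xc3\xbc"

# Default behaviour: every non-ASCII character is written as a \uXXXX escape.
escaped = json.dumps(text)

# With ensure_ascii=False the character is kept as-is in the JSON text.
literal = json.dumps(text, ensure_ascii=False)

print(escaped)  # "Bl\u00fctenhaus GmbH"

# Both JSON texts decode to the same ordinary string.
assert json.loads(escaped) == json.loads(literal) == text
```

Either form is valid JSON, so a parser has to accept both.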