08-23-2023 01:46 AM
@rolfk wrote:
You are aware that the IsTextUnicode() API has always been badly broken? It will return the wrong status for certain patterns of text.
Yeah, the context help for the encoding detection VI states that it isn't perfect and will fail on certain byte orders. I'll probably replace it with a more heavyweight encoding detection such as uchardet in a future version.
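For readers wondering what a deterministic alternative to IsTextUnicode() looks like: the sketch below is a minimal, hypothetical Python illustration (not the VI's actual implementation, and far simpler than uchardet) that checks for a BOM first and only then falls back on a zero-byte heuristic. Like any heuristic, the fallback can still guess wrong on short or unusual input.

```python
def guess_encoding(data: bytes) -> str:
    """Minimal encoding guess: BOM check first, then a crude heuristic.
    This is an illustrative sketch, not a replacement for uchardet."""
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"
    # Fallback heuristic: UTF-16-LE text made of mostly-Latin characters
    # has a NUL in nearly every odd-indexed (high) byte.
    if data and data[1::2].count(0) > len(data) // 4:
        return "utf-16-le"
    return "utf-8"

print(guess_encoding(b"\xff\xfeh\x00i\x00"))        # utf-16-le (BOM)
print(guess_encoding("hello".encode("utf-16-le")))  # utf-16-le (heuristic)
print(guess_encoding(b"hello"))                     # utf-8
```

The point of the BOM-first ordering is that a BOM is unambiguous, whereas heuristics (including IsTextUnicode's) are exactly where the false positives come from.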
08-23-2023 02:05 AM
Thanks again.
Here is some additional information. The attached files show that the “č” character (\10D) creates a new line. Without using ‘Convert EOL’, the resulting string is displayed correctly. But since I have to pinpoint the correct {line,column} element of the resulting table (the String-to-Array-of-Strings conversion detects too many \n), I miss the target 😉 The other possibility would be to skip the Array of Strings conversion and count the \t instead…
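The likely mechanism behind the phantom line break: “č” is U+010D, and in UTF-16-LE its two bytes are 0x0D 0x01, where 0x0D is the carriage-return byte. Any tool that converts or splits EOLs byte-wise (rather than character-wise) will mistake that byte for a CR. A small Python sketch of the effect, using made-up sample text:

```python
# "č" is U+010D; encoded as UTF-16-LE it becomes the bytes 0x0D 0x01,
# and 0x0D is exactly the carriage-return byte.
text = "Praha\tčas\tkonec"          # sample text, contains no real CR
raw = text.encode("utf-16-le")
print(b"\r" in raw)                 # True, even though the text has no CR
pieces = raw.split(b"\r")           # a byte-wise split sees a phantom EOL
print(len(pieces))                  # 2: the string is cut inside "č"
```

Splitting after decoding (i.e. on characters, not bytes) avoids the problem entirely.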
08-23-2023 03:35 AM
08-23-2023 04:04 AM
Yes, I know that different delimiters can be configured, but in the case I presented above, strange characters like “č” ‘break’ the string right in the middle and generate a new line that should never exist. And whatever delimiter is used, the result is a 2D string array that is unusable for retrieving specific {line,column} elements...
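One way to make the {line,column} lookup robust, sketched here in Python under the assumption that the file is UTF-16-LE: decode first, then do the EOL conversion and the splitting on characters, so a multi-byte character such as “č” can never be mistaken for a delimiter. The function name and sample data are illustrative only.

```python
def table_cell(raw: bytes, line: int, column: int) -> str:
    """Return table element {line, column}. Decoding before splitting
    keeps multi-byte characters like "č" from posing as delimiters."""
    text = raw.decode("utf-16-le")                     # assumed input encoding
    text = text.replace("\r\n", "\n").replace("\r", "\n")  # EOL conversion on chars
    rows = [row.split("\t") for row in text.split("\n")]
    return rows[line][column]

raw = "a\tčb\tc\nd\te\tf".encode("utf-16-le")
print(table_cell(raw, 0, 1))   # "čb" — the "č" no longer breaks the row
print(table_cell(raw, 1, 2))   # "f"
```

This is the character-level equivalent of what a byte-oriented Convert EOL plus Spreadsheet-String-to-Array pipeline cannot guarantee.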
08-23-2023 04:38 AM
08-23-2023 04:45 AM
Try to run your code with some “č” in your strings...
On my side the original problem is still there...
08-23-2023 05:09 AM
08-23-2023 05:34 AM
Your example works well.
Mine does not.
I am attaching my Unicode file...
08-23-2023 06:03 AM
08-23-2023 10:29 AM - edited 08-23-2023 10:32 AM
In the meantime I have built the attached patch, which seems to work, at least in my situation.
It shows that "element from 2D" does not work correctly, whereas "element from 1D" does the job. In other words, the "Reshape Array" function also creates a problem for the further handling of Unicode...
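For readers who want the gist of the workaround without the attached VI: the idea is to avoid reshaping to 2D and instead compute the flat index directly into the 1D array. A hypothetical Python analogue (names are illustrative, not from the patch):

```python
def element_from_1d(flat, n_cols, row, col):
    """Index a flattened row-major table directly instead of
    reshaping it to 2D first, mirroring the patch's 1D workaround."""
    return flat[row * n_cols + col]

flat = ["a", "č", "c", "d", "e", "f"]     # 2 rows x 3 columns, row-major
print(element_from_1d(flat, 3, 0, 1))     # "č"
print(element_from_1d(flat, 3, 1, 2))     # "f"
```

Since the 1D array is never reshaped, whatever the 2D path does wrong with multi-byte strings simply never happens.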