1. ma jack
  2. PowerBuilder
  3. Wednesday, 13 December 2023 14:44 PM UTC

Hello everyone!

Converting the UTF-8 byte array of the "🉂" symbol (U+1F242) to a string with string(blob(lb_arr), EncodingUTF8!) does not give me the expected character.

The following is my code:

// UTF-8 Encoding 0xF0 0x9F 0x89 0x82
byte lb_arr[] = {240,159,137,130}
string ls_1 = string(blob(lb_arr), EncodingUTF8!)
Messagebox("",ls_1)

The character displayed is not the one I want. I found at https://www.compart.com/en/unicode/U+1F242 that the UTF-16 encoding of this symbol is 0xD83C 0xDE42, so I changed my code to use that encoding. This version displays the character correctly.

// UTF-16 Encoding 0xD83C 0xDE42
byte lb_arr[] = {216,60,222,66}
string ls_1 = string(blob(lb_arr), EncodingUTF16BE!)
Messagebox("",ls_1)
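
As a cross-check outside PowerBuilder, the two byte arrays above can be decoded in Python to confirm that they describe the same character. This is only a sketch of the byte arithmetic; the variable names are illustrative, and it says nothing about how PowerBuilder's String() function works internally.

```python
# Byte values copied from the two PowerScript snippets above.
utf8_bytes = bytes([240, 159, 137, 130])    # 0xF0 0x9F 0x89 0x82
utf16be_bytes = bytes([216, 60, 222, 66])   # 0xD8 0x3C 0xDE 0x42

# Decode each array with the encoding it claims to use.
from_utf8 = utf8_bytes.decode("utf-8")
from_utf16 = utf16be_bytes.decode("utf-16-be")

# Both arrays decode to the single code point U+1F242.
print(hex(ord(from_utf8)))        # 0x1f242
print(from_utf8 == from_utf16)    # True
```

Since both arrays are valid encodings of U+1F242, the input data is not the problem; the difference lies in how the UTF-8 path handles a 4-byte sequence.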

Why does the conversion fail when using EncodingUTF8!? Is there an encoding range limit for UTF-8 in PowerBuilder?
I checked the documentation at https://docs.appeon.com/pb2022/application_techniques/Using_Unicode.html and found no range information about UTF-8.
The PB version I am using is PB2019R3 2781.

Chris Pollach @Appeon Accepted Answer
  1. Wednesday, 13 December 2023 23:03 PM UTC
  2. PowerBuilder
  3. # 1

Hi Ma;

  Did you try looking at the "ls_1" variable before the MessageBox() command in the debugger?

  My guess is that the value is correct but the MB() command is expecting Unicode and thus does not display the characters properly.

Regards ... Chris

Comment
  1. Chris Pollach @Appeon
  2. Thursday, 14 December 2023 02:27 AM UTC
That makes sense if PB's UTF-8 handling covers only 1-3 byte sequences. Above x'7F' (127), UTF-8 uses 2 or 3 bytes per character for the rest of the BMP, and only characters above U+FFFF need a 4th byte. So once a character needs 4 bytes, as yours does, I can see the UTF-8 interpretation falling apart.
  1. ma jack
  2. Thursday, 14 December 2023 02:56 AM UTC
Thank you, Chris, I think I know the reason. UTF-8 in PB seems to implement only the characters in the Unicode BMP, which need at most 3 bytes each: https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane
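
The byte counts behind this explanation can be sketched in Python. The helper names utf8_length and surrogate_pair are hypothetical, written for illustration only; the BMP-only limit in PB is the conclusion reached in this thread, not something this code verifies.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for one code point."""
    if code_point < 0x80:
        return 1
    if code_point < 0x800:
        return 2
    if code_point < 0x10000:
        return 3          # last range that lies inside the BMP
    return 4              # supplementary planes, U+10000 and above


def surrogate_pair(code_point: int) -> tuple[int, int]:
    """Return the UTF-16 surrogate pair for a code point above U+FFFF."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


# Compare the hand-written rule with Python's own encoder.
for cp in (0x41, 0xE9, 0x4E2D, 0x1F242):
    assert utf8_length(cp) == len(chr(cp).encode("utf-8"))
    print(f"U+{cp:04X}: {utf8_length(cp)} UTF-8 bytes, in BMP: {cp <= 0xFFFF}")

high, low = surrogate_pair(0x1F242)
print(hex(high), hex(low))   # 0xd83c 0xde42
```

U+1F242 is the only value in the loop that falls outside the BMP and the only one that needs 4 bytes. The surrogate pair it produces matches the 0xD83C 0xDE42 values used in the working EncodingUTF16BE! code.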
  1. Chris Pollach @Appeon
  2. Thursday, 14 December 2023 15:25 PM UTC
Correct