1. Aaron Anbu Johan Abraham
  2. PowerBuilder
  3. Tuesday, 19 January 2021 19:28 PM UTC

Hi Team,

We are processing a flat file that contains Swedish characters. While reading the file for validation, the Swedish characters are lost and replaced with unknown characters. How can I overcome this issue?

PB version - Appeon PowerBuilder Standard Edition Version 2019 build 2170

 

Thanks,

Aaron

 

Accepted Answer
Arthur Hefti
  1. Wednesday, 20 January 2021 05:33 AM UTC
  2. PowerBuilder

Hi

I assume the problem is that the file was saved from Notepad as UTF-8 without a BOM (https://en.wikipedia.org/wiki/Byte_order_mark). When opening a file in PowerBuilder you can specify the encoding (ANSI is the default); however, if PB thinks the encoding you specified doesn't match the encoding of the file, FileOpen returns -1.

So you can't open the UTF-8 encoded file in PB with EncodingUTF8! as long as there's no BOM. The simplest way to solve your problem is to save the file in Notepad with the encoding "UTF-8 with BOM".

Another option is to read the first 3 bytes of the file and check for the BOM. If it's missing, you can add it by prepending the 3 BOM bytes to the blob and saving the result to a new file that you can then read.
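
The BOM check and rewrite described here could be sketched roughly as follows. This is only a sketch under assumptions: the file names and variable names are hypothetical, the whole file is assumed to fit in one blob, and whether a stream-mode read exposes the BOM bytes of a file that already has one is worth verifying on your build.

```powerscript
// Hypothetical file names, for illustration only.
string ls_source = "C:\temp\input.txt"
string ls_target = "C:\temp\input_bom.txt"
integer li_in, li_out
blob lb_data, lb_bom
byte lbt_1, lbt_2, lbt_3
byte lbt_ef = 239, lbt_bb = 187, lbt_bf = 191   // 0xEF 0xBB 0xBF
boolean lb_has_bom = false

// Read the raw bytes of the source file in stream mode.
li_in = FileOpen(ls_source, StreamMode!, Read!, Shared!)
if li_in = -1 then return
FileReadEx(li_in, lb_data)
FileClose(li_in)

// Compare the first three bytes with the UTF-8 BOM.
if Len(lb_data) >= 3 then
	GetByte(lb_data, 1, lbt_1)
	GetByte(lb_data, 2, lbt_2)
	GetByte(lb_data, 3, lbt_3)
	if lbt_1 = lbt_ef and lbt_2 = lbt_bb and lbt_3 = lbt_bf then lb_has_bom = true
end if

if not lb_has_bom then
	// Build a 3-byte blob and overwrite its bytes with the BOM.
	lb_bom = Blob(Space(3), EncodingANSI!)
	SetByte(lb_bom, 1, lbt_ef)
	SetByte(lb_bom, 2, lbt_bb)
	SetByte(lb_bom, 3, lbt_bf)

	// Write the BOM followed by the original bytes to a new file.
	li_out = FileOpen(ls_target, StreamMode!, Write!, LockWrite!, Replace!)
	if li_out = -1 then return
	FileWriteEx(li_out, lb_bom)
	FileWriteEx(li_out, lb_data)
	FileClose(li_out)
end if
```

The new file can then be opened with EncodingUTF8!. Writing to a separate target file, rather than overwriting the source, keeps the original untouched if anything goes wrong.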

Or you can try to scan through the text and replace the garbled characters with the correct Swedish characters.
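
If the garbled text is the result of UTF-8 bytes being read as ANSI, one possible way to do this without a character-by-character lookup is to turn the string back into its ANSI bytes and decode those bytes as UTF-8. A minimal sketch, assuming ls_garbled (a hypothetical variable) already holds text that was read with the default ANSI encoding:

```powerscript
string ls_garbled, ls_fixed
blob lb_raw

// ls_garbled is assumed to hold the misread text, e.g. filled by FileReadEx.
lb_raw = Blob(ls_garbled, EncodingANSI!)   // recover the bytes as they were in the file
ls_fixed = String(lb_raw, EncodingUTF8!)   // decode those bytes as UTF-8
```

This only works if every byte survived the trip through the ANSI code page; bytes that have no character assigned in that code page may not round-trip, so test it with real data first.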

Regards
Arthur

Comment
  1. Mark Goldsmith
  2. Wednesday, 20 January 2021 17:06 PM UTC
Hi John et al,



John, I may have misunderstood your response, so my apologies if I did, but I think you mean "...If you know the flat file will ALWAYS be UTF-8 encoded with BOM, use that encoding in the FileOpen call." That is the only way it will work; it fails otherwise, unless there is some additional coding I'm not aware of that makes it work.



Aaron, to add on to Arthur's suggestion, should you go that route: the bytes he is referring to, which are required at the beginning of the file, are hexadecimal "0xEF 0xBB 0xBF". You certainly could add them, but it is some extra processing that you may prefer not to add to your code.



When I know I'm reading a file that is UTF-8 without BOM, what I have done is open the file as EncodingANSI!, read the file into a blob variable, and then convert the blob to a string using EncodingUTF8!. This allows the accented characters to stay intact.



It would look like the following:



li_file_handle = FileOpen(ls_path + ls_file_name, StreamMode!, Read!, LockReadWrite!, Append!, &
	EncodingANSI!) // EncodingANSI! required for accents in a file that is UTF-8 WITHOUT BOM
li_chars_read = FileReadEx(li_file_handle, lb_blob) // read the raw contents into a blob
FileClose(li_file_handle)
ls_file = String(lb_blob, EncodingUTF8!) // decode the blob as UTF-8



I have tried a lot of different combinations of FileOpen, FileReadEx and String conversions, and this is the only one that seems to work in this scenario, but maybe there are others.



HTH...regards,



Mark
  1. Aaron Anbu Johan Abraham
  2. Friday, 22 January 2021 17:36 PM UTC
Thanks, Mark, for your quick response. It helped me fix the issue, and I feel proud of this community.
  1. Mark Goldsmith
  2. Monday, 25 January 2021 16:32 PM UTC
Great to hear Aaron, glad this worked for you.

Regards,

Mark
Aaron Anbu Johan Abraham
  1. Friday, 22 January 2021 17:37 PM UTC
  2. PowerBuilder
  3. # 1

Thanks, Arthur, for your quick response. It helped me fix the issue, and I feel proud of this community.

Aaron Anbu Johan Abraham
  1. Wednesday, 20 January 2021 03:57 AM UTC
  2. PowerBuilder
  3. # 2

Hi John,

Thanks for your time. Please find the details below:

Version/release/build: Appeon PowerBuilder 2019, Version 2019 build 2170

FileRead is being used to read the flat file.

Notepad is used to process the data, and the format is UTF-8.

 

Thanks.

 

 

 

John Fauss
  1. Tuesday, 19 January 2021 19:55 PM UTC
  2. PowerBuilder
  3. # 3

Hi, Aaron -

What version/release/build of PB are you using?

Are you using FileOpen / FileReadEx / FileClose to "process" the flat file, or some other method? Please elaborate.

How is the flat file encoded? UTF-8? UTF-16?

Regards, John
