Swedish Char issue

Resolved Swedish Char issue

Aaron Anbu Johan Abraham
PowerBuilder
Tuesday, 19 January 2021 19:28 PM UTC

Hi Team,

We are processing a flat file that contains Swedish characters. While reading the file for validation, the Swedish characters are lost and replaced with unknown characters. Please let me know how to overcome this issue.

PB version - Appeon PowerBuilder Standard Edition Version 2019 build 2170

Thanks,

Aaron

Accepted Answer

Arthur Hefti

Wednesday, 20 January 2021 05:33 AM UTC
PowerBuilder

I assume the problem is that the file was saved from Notepad as UTF-8 without a BOM (https://en.wikipedia.org/wiki/Byte_order_mark). When opening a file in PowerBuilder you can specify the encoding (ANSI is the default); however, if PB decides that the encoding you specified doesn't match the encoding of the file, FileOpen returns -1.

So you can't open the UTF-8 encoded file in PB with EncodingUTF8! as long as there's no BOM. The simplest way to solve your problem is to save the file in Notepad with the encoding "UTF-8 with BOM".

Another option is to read the first 3 bytes of the file and check for the BOM. If it's missing, you can add it by putting the 3 BOM bytes in front of the blob and saving the result to a new file that you can then read.
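A minimal sketch of that second option, assuming the script runs in an event or subroutine without a return value. The file paths and variable names are hypothetical, and the UTF-8 BOM bytes 0xEF 0xBB 0xBF are written in decimal (239 187 191):

```powerscript
// Sketch: make sure a UTF-8 file starts with a BOM so that it can later be
// opened with EncodingUTF8!. Both paths are hypothetical examples.
string ls_source = "C:\data\input.txt"
string ls_target = "C:\data\input_bom.txt"
integer li_in, li_out
long ll_bytes
byte lbt_1, lbt_2, lbt_3
blob lb_data, lb_bom
boolean lbn_has_bom = false

// Read the raw bytes of the whole file
li_in = FileOpen(ls_source, StreamMode!, Read!, LockReadWrite!)
if li_in = -1 then return
ll_bytes = FileReadEx(li_in, lb_data)
FileClose(li_in)
if ll_bytes < 0 then return // read error or empty file

// Compare the first three bytes with the UTF-8 BOM (239 187 191)
if Len(lb_data) >= 3 then
	GetByte(lb_data, 1, lbt_1)
	GetByte(lb_data, 2, lbt_2)
	GetByte(lb_data, 3, lbt_3)
	lbn_has_bom = (lbt_1 = 239 and lbt_2 = 187 and lbt_3 = 191)
end if

if not lbn_has_bom then
	// Build a 3-byte blob, overwrite it with the BOM, and prepend it
	lb_bom = Blob(Space(3), EncodingANSI!)
	SetByte(lb_bom, 1, 239)
	SetByte(lb_bom, 2, 187)
	SetByte(lb_bom, 3, 191)
	lb_data = lb_bom + lb_data
end if

// Write the result to a new file
li_out = FileOpen(ls_target, StreamMode!, Write!, LockWrite!, Replace!)
if li_out = -1 then return
FileWriteEx(li_out, lb_data)
FileClose(li_out)
```

The new file should then be readable with EncodingUTF8! in FileOpen, at the cost of writing a second copy of the data.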

Or you can try to scan through the text and replace the garbled characters with the correct Swedish characters.

Regards
Arthur

Mark Goldsmith
Wednesday, 20 January 2021 17:06 PM UTC

Hi John et al,

John, I may have misunderstood your response, so my apologies if I did, but I think you mean "...If you know the flat file will ALWAYS be UTF-8 encoded with BOM, use that encoding in the FileOpen call." That is the only way it will work; it fails otherwise, unless there is some additional coding I'm not aware of that makes it work.

Aaron, to add on to Arthur's suggestion, should you go that route: the bytes he is referring to, which are required at the beginning of the file, are hexadecimal "0xEF 0xBB 0xBF". You certainly could add them, but it is some extra processing that you may prefer not to add to your code.

When I know I'm reading a file that is UTF-8 without BOM, what I have done is open the file with EncodingANSI!, read the file into a blob variable, and then convert the blob to a string using EncodingUTF8!. This allows the accented characters to stay intact.

It would look like the following:

//EncodingANSI! required for accents in a file that is UTF-8 WITHOUT BOM
li_file_handle = FileOpen(ls_path + ls_file_name, StreamMode!, Read!, LockReadWrite!, Append!, EncodingANSI!)

li_chars_read = FileReadEx(li_file_handle, lb_blob)

FileClose(li_file_handle)

ls_file = String(lb_blob, EncodingUTF8!)

I have tried a lot of different combinations of FileOpen, FileReadEx and String conversions and this is the only one that seems to work under this scenario...but maybe there are others.
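For anyone who wants to reuse this, the same idea could be wrapped in a function along the following lines. This is only a sketch: the function name f_read_utf8 and its argument as_path are hypothetical, and the BOM check is an added assumption so that files saved with a BOM go through the same path without the BOM ending up in the string.

```powerscript
// Hypothetical global function: string f_read_utf8 (string as_path)
// Returns the file content as a string, or "" if the file cannot be read.
integer li_file
long ll_bytes
byte lbt_1, lbt_2, lbt_3
blob lb_data

// Same FileOpen arguments as in the snippet above
li_file = FileOpen(as_path, StreamMode!, Read!, LockReadWrite!, Append!, EncodingANSI!)
if li_file = -1 then return ""

ll_bytes = FileReadEx(li_file, lb_data)
FileClose(li_file)
if ll_bytes <= 0 then return "" // read error or empty file

// If the file starts with the UTF-8 BOM (0xEF 0xBB 0xBF = 239 187 191),
// drop those three bytes before converting
if Len(lb_data) >= 3 then
	GetByte(lb_data, 1, lbt_1)
	GetByte(lb_data, 2, lbt_2)
	GetByte(lb_data, 3, lbt_3)
	if lbt_1 = 239 and lbt_2 = 187 and lbt_3 = 191 then
		lb_data = BlobMid(lb_data, 4)
	end if
end if

// Interpret the raw bytes as UTF-8 text
return String(lb_data, EncodingUTF8!)
```

It would be called as ls_file = f_read_utf8(ls_path + ls_file_name). Note that an unreadable file and an empty file both come back as an empty string here; a real implementation may want to tell them apart.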

HTH...regards,

Mark

Aaron Anbu Johan Abraham
Friday, 22 January 2021 17:36 PM UTC

Thanks, Mark, for your quick response. It helped me fix the issue, and I feel proud of this community.

Mark Goldsmith
Monday, 25 January 2021 16:32 PM UTC

Great to hear Aaron, glad this worked for you.

Regards,

Mark

Responses (4)

Aaron Anbu Johan Abraham

Friday, 22 January 2021 17:37 PM UTC
PowerBuilder
# 1

Thanks, Arthur, for your quick response. It helped me fix the issue, and I feel proud of this community.

Aaron Anbu Johan Abraham

Wednesday, 20 January 2021 03:57 AM UTC
PowerBuilder
# 2

Hi John,

Thanks for your time. Please find the details below:

version/release/build - Appeon PowerBuilder 2019: Version 2019 build 2170

FileRead is being used to read the flat file.

Notepad is used to process the data, and the file format is UTF-8.

Thanks.

John Fauss

Tuesday, 19 January 2021 19:55 PM UTC
PowerBuilder
# 3

Hi, Aaron -

What version/release/build of PB are you using?

Are you using FileOpen / FileReadEx / FileClose to "process" the flat file, or some other method? Please elaborate.

How is the flat file encoded? UTF-8? UTF-16?

Regards, John
