Not able to convert encoding to UTF8 in fileopen command in Powerbuilder 2017R3 b1858.


Subramanyam Kalivarapu
PowerBuilder
Monday, 5 August 2019 18:52 PM UTC

Hello All

I used the syntax below, but it does not convert the file encoding from ANSI! to UTF8!.

fileopen(as_filename,LineMode!, Write!,LockReadWrite!,Replace!,EncodingUTF8!)

The statement above returns -1 if I pass EncodingUTF8! and returns 1 if the last argument is EncodingANSI!.

My requirement is to convert the file from ANSI! to UTF8!.

Please suggest.

Thanks

Subramanyam K.

Responses (3)

Chris Pollach @Appeon

Thursday, 15 August 2024 00:33 AM UTC
PowerBuilder
# 1

Hi Daniel,

Because HTML files have extended escape characters, you cannot use Line Mode. Instead, open the file in Stream Mode and, as John correctly states, do not specify the Encoding argument. HTH

Regards ... Chris

Benjamin Gaesslein
Thursday, 15 August 2024 12:29 PM UTC

The problem is that PowerBuilder always expects a byte-order mark (BOM) when opening a file with EncodingUTF8! and simply returns -1 when it does not find one. The Unicode standard neither requires nor recommends a BOM for UTF-8 files, so many programs do not write one. UTF-8 files without a BOM are perfectly valid, but PB cannot open them without jumping through some hoops. To read a BOM-less UTF-8 file into a properly decoded string, you have to do a little dance:

- Open the file as ANSI in stream mode: filenum = FileOpen(filename, StreamMode!, Read!, LockWrite!, Replace!, EncodingANSI!)

- Read the contents into a blob: FileReadEx(filenum, blobvariable)

- Convert the blob variable into a string: utf8string = String( blobvariable, EncodingUTF8! )
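
Put together, the three steps above might look like the sketch below. It is written as the body of a hypothetical function, of_read_utf8_nobom(string as_filename, ref string as_text) returns integer. The function name, argument names, and return convention are assumptions for illustration and are not from the thread.

```powerscript
// Body of a hypothetical function:
//   of_read_utf8_nobom(string as_filename, ref string as_text) returns integer
// Returns 1 on success, -1 on failure.
integer li_file
long ll_bytes
blob lblb_data

as_text = ""

// Step 1: open as ANSI in stream mode, so FileOpen does not insist on a BOM
li_file = FileOpen(as_filename, StreamMode!, Read!, LockWrite!, Replace!, EncodingANSI!)
if li_file = -1 then return -1

// Step 2: read the raw bytes into a blob
ll_bytes = FileReadEx(li_file, lblb_data)
FileClose(li_file)

// -100 means end-of-file before any data was read (empty file)
if ll_bytes = -100 then return 1
if ll_bytes < 0 then return -1

// Step 3: decode the bytes as UTF-8
as_text = String(lblb_data, EncodingUTF8!)
return 1
```

If the file does start with a BOM, the decoded string may begin with the BOM character, so a caller handling both kinds of file may want to check for that first.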


Daniel Seguin
Thursday, 15 August 2024 19:52 PM UTC

Thanks a lot Benjamin! Works great.

With this solution, my é stayed an é in the string variable.

Then I just had to call my function to replace extended letters in html-iso-8859 format.


Benjamin Gaesslein
Monday, 19 August 2024 06:53 AM UTC

Glad I could help! This approach in reverse is also the only way to have a PB app create a UTF-8 text file without a BOM, which was the original goal that led me to figure this out a while ago.
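
A minimal sketch of that reverse direction, assuming it mirrors the read steps: convert the string to UTF-8 bytes with Blob(), then write the blob in stream mode. It is shown as the body of a hypothetical function, of_write_utf8_nobom(string as_filename, string as_text) returns integer. The names are illustrative, and it is worth confirming in a hex viewer that the resulting file really starts without EF BB BF on your build.

```powerscript
// Body of a hypothetical function:
//   of_write_utf8_nobom(string as_filename, string as_text) returns integer
// Returns 1 on success, -1 on failure.
integer li_file
long ll_written
blob lblb_utf8

// Encode the string as UTF-8 bytes
lblb_utf8 = Blob(as_text, EncodingUTF8!)

// Open in stream mode as ANSI and replace any existing file
li_file = FileOpen(as_filename, StreamMode!, Write!, LockWrite!, Replace!, EncodingANSI!)
if li_file = -1 then return -1

// Write the raw bytes as they are
ll_written = FileWriteEx(li_file, lblb_utf8)
FileClose(li_file)

if ll_written < 0 then return -1
return 1
```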


Daniel Seguin

Thursday, 15 August 2024 00:07 AM UTC
PowerBuilder
# 2

Hello,

I am having the same issue as described above.

ls_tempfile = trim(ls_TempDir) + "\" + trim(as_rte_mode) + ".htm"
rte_autrs.savedocument( ls_tempfile, FileTypeHTML! )

// read the contents of the HTML file we just created into memory
li_fnum = FileOpen(ls_tempfile, LineMode!, Write!, LockReadWrite!, Replace!, EncodingUTF8!)
li_linestatus = FileRead(li_fnum, ls_line)
do while  li_linestatus <> -100
	ls_html = ls_html + trim(ls_line) + ls_cr
	li_linestatus = FileRead(li_fnum, ls_line)
loop
FileClose(li_fnum)
FileDelete(ls_tempfile)

When I run this in the debugger, li_fnum is -1.

The generated HTML file is in UTF-8: the é in Séguin shows up correctly in the file.

When I open the file with li_fnum = FileOpen(ls_tempfile, LineMode!, Read!) instead, li_fnum is 1, but the é is transformed into a different value in the string variable ls_line:

Bonjour salut HOLA Daniel SÃ©guin

<?xml version="1.0" encoding="UTF-8" ?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta content="TX31_HTM 31.0.1103.500" name="GENERATOR" />
<title></title>
</head>
<body style="font-family:'Arial';font-size:12pt;text-align:left;">
<p lang="en-US" style="text-indent:0pt;margin-top:0pt;margin-bottom:0pt;"><span style="font-size:10pt;">Bonjour salut </span><span style="font-size:10pt;font-style:italic;">HOLA</span><span style="font-size:10pt;"> Daniel Séguin</span></p>
</body>
</html>

Question:

Does this mean that when I open an HTML file, even though the file appears to be UTF-8, I cannot use the encoding parameter and therefore cannot read the UTF-8 characters properly?


John Fauss

Tuesday, 6 August 2019 00:48 AM UTC
PowerBuilder
# 3

I think the issue is that you are expecting the file encoding argument to direct how some kind of data conversion is to be performed. That is not the purpose of this argument. This argument tells PB that the file you want to open for READING uses the specified file encoding.

Here is a key sentence from the FileOpen help topic:

If you specify the optional encoding argument and the existing file does not have the same encoding, FileOpen returns -1.


Arnd Schmidt
Thursday, 15 August 2024 10:49 AM UTC

Yes, Daniel should check whether a BOM exists, or use the FileEncoding() function to get the real encoding of the file, before opening it or doing any further processing.
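
A rough sketch of that check, reusing the variable names from Daniel's snippet (ls_tempfile, li_fnum). The branching is an assumption about how one might act on the result: FileEncoding() reports the encoding PowerBuilder detects for the file, so a UTF-8 file without a BOM will probably not come back as EncodingUTF8!.

```powerscript
Encoding le_enc
integer li_fnum

// Ask PowerBuilder which encoding it detects for the file
le_enc = FileEncoding(ls_tempfile)

if IsNull(le_enc) then
	// FileEncoding returns null when the file does not exist
	li_fnum = -1
elseif le_enc = EncodingUTF8! then
	// Detected as UTF-8, so the encoding argument matches the file
	li_fnum = FileOpen(ls_tempfile, LineMode!, Read!, LockWrite!, Replace!, EncodingUTF8!)
else
	// Not detected as UTF-8: open in stream mode, read the bytes into a blob,
	// and decode them explicitly
	li_fnum = FileOpen(ls_tempfile, StreamMode!, Read!, LockWrite!, Replace!, EncodingANSI!)
end if
```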


Roland Smith
Thursday, 15 August 2024 14:31 PM UTC

It could be that the file doesn't have the BOM bytes at the start (for UTF-8, the first three bytes are EF BB BF).
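
One way to test for the UTF-8 BOM directly is to read the first three bytes in stream mode and compare them with EF BB BF. This is a sketch only, shown as the body of a hypothetical function, of_has_utf8_bom(string as_filename) returns boolean. The function name is illustrative.

```powerscript
// Body of a hypothetical function:
//   of_has_utf8_bom(string as_filename) returns boolean
integer li_file
long ll_read
blob lblb_head
byte lb_1, lb_2, lb_3

li_file = FileOpen(as_filename, StreamMode!, Read!, LockWrite!)
if li_file = -1 then return false

// Read only the first three bytes of the file
ll_read = FileReadEx(li_file, lblb_head, 3)
FileClose(li_file)
if ll_read < 3 then return false

GetByte(lblb_head, 1, lb_1)
GetByte(lblb_head, 2, lb_2)
GetByte(lblb_head, 3, lb_3)

// UTF-8 BOM: EF BB BF (decimal 239, 187, 191)
return (lb_1 = 239 and lb_2 = 187 and lb_3 = 191)
```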
