Hello,
I am having the same issue as described above. Here is my code:
ls_tempfile = trim(ls_TempDir) + "\" + trim(as_rte_mode) + ".htm"
rte_autrs.savedocument( ls_tempfile, FileTypeHTML! )
// read back into memory the contents of the HTML file we just created
li_fnum = FileOpen(ls_tempfile, LineMode!, Write!, LockReadWrite!, Replace!, EncodingUTF8!)
li_linestatus = FileRead(li_fnum, ls_line)
do while li_linestatus <> -100
ls_html = ls_html + trim(ls_line) + ls_cr
li_linestatus = FileRead(li_fnum, ls_line)
loop
FileClose(li_fnum)
FileDelete(ls_tempfile)
When I run this in the debugger, li_fnum is -1.
The generated HTML file is in UTF-8. There is an é in "Séguin", which shows up correctly in the file.
When I open the file with li_fnum = FileOpen(ls_tempfile, LineMode!, Read!) instead, I get li_fnum = 1, but the é is transformed into a different value in the string variable ls_line:
<p lang="en-US" style="text-indent:0pt;margin-top:0pt;margin-bottom:0pt;"><span style="font-size:10pt;">Bonjour salut </span><span style="font-size:10pt;font-style:italic;">HOLA</span><span style="font-size:10pt;"> Daniel Séguin</span></p>
<?xml version="1.0" encoding="UTF-8" ?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta content="TX31_HTM 31.0.1103.500" name="GENERATOR" />
<title></title>
</head>
<body style="font-family:'Arial';font-size:12pt;text-align:left;">
<p lang="en-US" style="text-indent:0pt;margin-top:0pt;margin-bottom:0pt;"><span style="font-size:10pt;">Bonjour salut </span><span style="font-size:10pt;font-style:italic;">HOLA</span><span style="font-size:10pt;"> Daniel Séguin</span></p>
</body>
</html>
Question:
Does this mean that when I open an HTML file, even though the file appears to be UTF-8, I cannot use the encoding parameter, and therefore cannot read the UTF-8 characters properly?
The solution I found:
- Open the file as ANSI in stream mode: filenum = FileOpen(filename, StreamMode!, Read!, LockWrite!, Replace!, EncodingANSI!)
- Read the contents into a blob: FileReadEx(filenum, blobvariable)
- Convert the blob variable into a string: utf8string = String( blobvariable, EncodingUTF8! )
With this solution, my é stayed an é in the string variable.
Then I just had to call my function to convert the extended letters to HTML ISO-8859 format.
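
For anyone landing here later, the three steps above could be put together roughly as follows. This is only a sketch: the variable names (lblb_html, ll_bytes) are made up for illustration, ls_tempfile is assumed to already hold the path of the file written by SaveDocument, and the error handling is a guess at what you might want rather than something from my actual code.

```powerscript
// Sketch only: hypothetical variable names, minimal error handling.
integer li_fnum
long    ll_bytes
blob    lblb_html
string  ls_html
string  ls_tempfile   // assumed to be set to the .htm path beforehand

// Step 1: open in stream mode so the raw bytes are read without
// any character conversion being applied.
li_fnum = FileOpen(ls_tempfile, StreamMode!, Read!, LockWrite!, Replace!, EncodingANSI!)

if li_fnum = -1 then
	MessageBox("Error", "Could not open " + ls_tempfile)
else
	// Step 2: read the whole file into a blob.
	ll_bytes = FileReadEx(li_fnum, lblb_html)
	FileClose(li_fnum)

	if ll_bytes > 0 then
		// Step 3: interpret the bytes as UTF-8 when building the string,
		// so a character such as é is decoded as one character.
		ls_html = String(lblb_html, EncodingUTF8!)
	else
		MessageBox("Error", "Nothing could be read from " + ls_tempfile)
	end if
end if
```

The point of going through a blob is that the decoding happens once, explicitly, in the String() call, instead of relying on how FileOpen guesses or applies the file encoding.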