1. Nhan Nguyen
  2. PowerBuilder
  3. Friday, 10 February 2023 21:51 PM UTC

Hi,

We are using the ImportFile function to import data from a text file into DataWindows, using the following code:

ll_rc = dw_1.ImportFile(ls_inFile)

The import works, but some characters are changed after the import.

Data from input file: “xv. contracts for material and expert witnesses;”

After import, data in datawindow:  â€œxv. contracts for material and expert witnesses;â€

I think this is related to Unicode. I tried to replace “ with ", but that didn't work.

Any information about this is very much appreciated.

Nhan Nguyen
  1. Monday, 13 February 2023 18:16 PM UTC
  2. PowerBuilder
  3. # 1

Hi everyone,

Thank you very much for your suggestions.

I tried Mark's solution:

//Open file and convert
ll_FileHandle = FileOpen(ls_inFile, StreamMode!, FileAccessMode, FileLockMode, FileWriteMode, EncodingANSI!)
ll_rc = FileReadEx(ll_FileHandle ,lb_blob)
ls_input = String(lb_blob, EncodingUTF8!)

//Then do your ImportString()
ll_rc = dw_1.ImportString(ls_input)

However, the PB DataWindow still shows something different.

The characters “ and ” are displayed as [image not preserved] in the DataWindow control.

The interesting thing is that if I copy the text from the DataWindow control and paste it into a text editor, the “ is displayed correctly.

Any ideas?

Thanks,

Nhan Nguyen

   

    

Mark Goldsmith
  1. Saturday, 11 February 2023 17:38 PM UTC
  2. PowerBuilder
  3. # 2

Hi Nhan,

I understand that you believe the file in question to be ANSI encoded, but the issue you're facing suggests that the file is actually encoded as UTF-8 without BOM, which PowerBuilder interprets as an ANSI encoded file. This is reflected in the fact that the 3 characters you see at the beginning of the string are the UTF-8 bytes of a left double quote (E2 80 9C) shown as ANSI characters, and the 2 characters at the end are the UTF-8 bytes of a right double quote (E2 80 9D). Only two show up there because the third byte, 9D, has no printable character in Windows-1252.

You can confirm this by opening the raw file in a hex editor and looking for these same characters. More importantly, look at the beginning of the file and check whether you see one of the following byte sequences: EF BB BF indicates a UTF-8 with BOM encoded file, FF FE indicates a UTF-16LE encoded file and FE FF indicates a UTF-16BE encoded file. If you don't see EF BB BF but you do see â€œ and â€ surrounding your text in question, then the file is encoded as UTF-8 without BOM.
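If a hex editor is not at hand, the same BOM check could be sketched in PowerScript roughly as follows. This is only an illustration, not code from the thread: it assumes ls_inFile already holds the file path, that the script runs in an event or function with no return value, and that the GetByte function and Byte datatype are available in your PowerBuilder version.

```powerscript
// Sketch: report which BOM, if any, starts the file.
// Assumption: ls_inFile is declared elsewhere and holds the path.
integer li_file
long    ll_bytes
blob    lb_data
byte    lbt_1, lbt_2, lbt_3
string  ls_bom = "no BOM (ANSI or UTF-8 without BOM)"

// StreamMode! hands back the raw bytes with no encoding conversion.
li_file = FileOpen(ls_inFile, StreamMode!, Read!, Shared!)
if li_file = -1 then return

ll_bytes = FileReadEx(li_file, lb_data)
FileClose(li_file)

// Compare the leading bytes with the three BOM signatures.
if ll_bytes >= 2 then
	GetByte(lb_data, 1, lbt_1)
	GetByte(lb_data, 2, lbt_2)
	if lbt_1 = 255 and lbt_2 = 254 then ls_bom = "FF FE: UTF-16LE"
	if lbt_1 = 254 and lbt_2 = 255 then ls_bom = "FE FF: UTF-16BE"
end if
if ll_bytes >= 3 then
	GetByte(lb_data, 3, lbt_3)
	if lbt_1 = 239 and lbt_2 = 187 and lbt_3 = 191 then ls_bom = "EF BB BF: UTF-8 with BOM"
end if

MessageBox("Encoding check", ls_bom)
```

A "no BOM" result, together with the â€œ characters in the imported data, points to UTF-8 without BOM.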

If that is the case, then you would have to read and import the file using something like the following:

//Open file and convert
ll_FileHandle = FileOpen(ls_inFile, StreamMode!, FileAccessMode, FileLockMode, FileWriteMode, EncodingANSI!)
ll_rc = FileReadEx(ll_FileHandle ,lb_blob)
ls_input = String(lb_blob, EncodingUTF8!)

//Then do your ImportString()
ll_rc = dw_1.ImportString(ls_input)

I have not specified the FileAccessMode, FileLockMode or the FileWriteMode parameters (you can choose what you want as they really aren't relevant to your problem per se) but you cannot exclude them when you include the Encoding parameter.
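For reference, here is a sketch of the same snippet with the three placeholder arguments filled in and basic error handling added. Read!, Shared! and Replace! are illustrative choices only, not necessarily what Mark had in mind, and the variable declarations and message texts are assumptions as well.

```powerscript
// Sketch: read a BOM-less UTF-8 file as raw bytes, decode it, then import.
// Assumptions: ls_inFile holds the path, dw_1 is the target DataWindow,
// and the script runs in an event or function with no return value.
integer li_file
long    ll_rc
blob    lb_blob
string  ls_input

// Read!, Shared! and Replace! are placeholder choices; Replace! has no
// effect here because the file is only read.
li_file = FileOpen(ls_inFile, StreamMode!, Read!, Shared!, Replace!, EncodingANSI!)
if li_file = -1 then
	MessageBox("Import", "Could not open " + ls_inFile)
	return
end if

// In stream mode FileReadEx returns the bytes unconverted.
ll_rc = FileReadEx(li_file, lb_blob)
FileClose(li_file)
if ll_rc < 0 then return

// Decode the bytes as UTF-8, then import the resulting string.
ls_input = String(lb_blob, EncodingUTF8!)
ll_rc = dw_1.ImportString(ls_input)
if ll_rc < 0 then
	MessageBox("Import", "ImportString failed with code " + String(ll_rc))
end if
```

Closing the file right after the read keeps the handle from leaking if the import fails later.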

HTH...regards,

Mark

Comment
  1. Mark Goldsmith
  2. Monday, 13 February 2023 02:18 AM UTC
Thanks Miguel :) As for your thoughts on how this works, I would agree. It would be nice if PowerBuilder had the ability to automatically handle files that are UTF-8 encoded without BOM, like most text editors can, but in the absence of that the options are slim. If I'm creating the text file or have control over how it's created I can ensure it's UTF-8 with BOM but if it's an outside source, like the scenario for one of my customers, they're at the mercy of the provider...cheers.
  1. Andreas Mykonios
  2. Monday, 13 February 2023 08:43 AM UTC
Mark's answer is really great. The main things to pay attention to are that it opens the file in stream mode and reads the data into a blob. That way, conversion to another encoding is possible.

My respect.

Andreas.
  1. Mark Goldsmith
  2. Monday, 13 February 2023 14:18 PM UTC
Thanks Andreas and yes, it's good to highlight those two points.
Nhan Nguyen
  1. Saturday, 11 February 2023 02:54 AM UTC
  2. PowerBuilder
  3. # 3

Hi Chris,

Thank you very much for the information. 

Actually, the input file was created using the DataWindow function dw_1.saveas(ls_filename), which by default uses ANSI character encoding.

We noticed this and changed the code to use UTF-16 as follows:

dw_1.saveas(ls_file, Text!, false, EncodingUTF16!) 

However, the old files still use ANSI character encoding, and importing them causes the problem mentioned above.

My question is: is it possible to change the character encoding of the old files from ANSI to UTF?

I tried reading the file into a blob variable and then using the FromAnsi function to convert it to Unicode, but it did not work.

ll_FileHandle = FileOpen(ls_inFile, TextMode!)
ll_rc = FileReadEx(ll_FileHandle, lb_blob)
ls_input = FromAnsi (lb_blob)

ll_rc = dw_1.ImportString(ls_input)

FileClose(ll_FileHandle)

Thanks,

Nhan Nguyen

    

            

Comment
  1. Benjamin Gaesslein
  2. Monday, 13 February 2023 07:34 AM UTC
That doesn't make sense to me, Chris. Why would overwriting a file force it to keep the old encoding? That's not my experience. You can absolutely open an ANSI file in a program like Notepad++, convert it to UTF-8 and just save it.
  1. Andreas Mykonios
  2. Monday, 13 February 2023 08:37 AM UTC
I think that's because FileOpen will detect the encoding as ANSI. You cannot change that directly. Even if you try to open the file with another encoding, you will see strange characters instead of the real content. A solution would be to open those files, read their contents into a string, change the encoding of that string and replace each file with a new one that has the correct encoding. Notepad++ does that in one step and gives you the feeling it's a simple task, but in reality you don't see what's done in the background!

Andreas.
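The read, re-encode and rewrite steps described in the comment above could be sketched as follows. This is an assumption-laden outline rather than tested code: ls_oldFile and ls_newFile are hypothetical variables holding the two paths, and whether PowerBuilder writes a BOM to the new file should be checked in a hex editor for your version.

```powerscript
// Sketch: rewrite an old file under a new name with an explicit encoding.
// Assumptions: ls_oldFile and ls_newFile are hypothetical path variables,
// and the script runs in an event or function with no return value.
integer li_in, li_out
long    ll_rc
blob    lb_raw
string  ls_text

// 1. Read the raw bytes without any conversion.
li_in = FileOpen(ls_oldFile, StreamMode!, Read!, Shared!)
if li_in = -1 then return
ll_rc = FileReadEx(li_in, lb_raw)
FileClose(li_in)
if ll_rc < 0 then return

// 2. Decode with the encoding the bytes are really in: EncodingANSI! for
//    a true ANSI file, EncodingUTF8! for UTF-8 without BOM.
ls_text = String(lb_raw, EncodingANSI!)

// 3. Write the text to a new file, requesting UTF-16LE in text mode.
li_out = FileOpen(ls_newFile, TextMode!, Write!, LockWrite!, Replace!, EncodingUTF16LE!)
if li_out = -1 then return
ll_rc = FileWriteEx(li_out, ls_text)
FileClose(li_out)
```

Writing to a new file name first, and replacing the old file only after checking the result, avoids losing data if the source encoding was guessed wrong.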
  1. Chris Pollach @Appeon
  2. Monday, 13 February 2023 13:47 PM UTC
Correct.
Chris Pollach @Appeon
  1. Friday, 10 February 2023 22:10 PM UTC
  2. PowerBuilder
  3. # 4

Hi Nhan;

  Did you create the input text file as Unicode?

For example ....

Regards ... Chris
