1. Don Yang
  2. PowerBuilder
  3. Monday, 11 April 2022 17:55 PM UTC

Hi, All

I have a question about encoded HTML text stored in a database table. Some encoded HTML text (with HTML tags) was inserted into a text column of our database table from an external source. When I load the text into a DataWindow, how can I decode the HTML text (get rid of all the HTML tags) but still keep all the line breaks?

Thanks,

 

Don

Andreas Mykonios
  1. Tuesday, 12 April 2022 14:36 PM UTC
  2. PowerBuilder
  3. # 1

If what you need is to "transform" HTML into plain text, this may work using the new rich text control that is available in PB 2019 R3.

You will have to modify properties in your application object like this:

You will then be able to use it. You can load your HTML using the InsertDocument function, and then save it as text using the SaveDocument function.

InsertDocument - PowerScript Reference (appeon.com)

SaveDocument - PowerScript Reference (appeon.com)

You can also build your own method to remove HTML tags. Depending on how familiar you are with HTML syntax, that can be anywhere from easy to quite complex.
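As a rough illustration of the "build your own method" idea, here is a minimal sketch of the logic. It is written in Python only because the thread shows no code; it is not PowerScript, and the helper name `strip_tags_keep_breaks` is hypothetical. It assumes the markup is simple (as in the sample posted in this thread) and that every closing `</p>` or `<br>` should become a line break.

```python
import html
import re


def strip_tags_keep_breaks(markup: str) -> str:
    """Hypothetical helper: drop HTML tags but keep one line per paragraph."""
    # Turn paragraph ends and <br> tags into newlines before removing tags.
    text = re.sub(r"(?i)</p\s*>|<br\s*/?>", "\n", markup)
    # Remove every remaining tag (assumes no '>' inside attribute values).
    text = re.sub(r"<[^>]+>", "", text)
    # Decode entities such as &amp; into their plain characters.
    text = html.unescape(text)
    # Trim trailing whitespace on each line and drop trailing empty lines.
    return "\n".join(line.rstrip() for line in text.split("\n")).strip("\n")


sample = (
    '<p><span style="font-size:10pt;">description of change</span></p>'
    '<p><span style="font-size:10pt;">for multi line </span></p>'
)
print(strip_tags_keep_breaks(sample))
```

A pattern-based approach like this is fragile on irregular HTML (comments, scripts, unclosed tags), so it is only reasonable when the incoming markup is known to be this regular.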

Andreas.

Chris Pollach @Appeon
  1. Monday, 11 April 2022 19:09 PM UTC
  2. PowerBuilder
  3. # 2

Hi Don;

You can decode it, for example, through a DWO global function or a computed column, but the key is to know how the datum was encoded in the first place so that you can reverse it back to its original form. Your DBMS might also have this capability via a built-in function, stored function (SF), or stored procedure (SP), in which case you could send it to the DWO already decoded within the normal result set.

Regards ... Chris

Comment
  1. Andreas Mykonios
  2. Tuesday, 12 April 2022 06:23 AM UTC
The example you provide seems to be plain HTML... Are you sure it is encoded? In this example the only thing that I see is the backslash just before each quote. This can be removed. I guess you have to escape quotes when storing in the DB.

Andreas.
  1. Don Yang
  2. Tuesday, 12 April 2022 13:42 PM UTC
Sorry, this was actually inserted into the database by a third-party client:



<p><span style="font-family:Verdana, Geneva, sans-serif;font-size:10pt;">description of change</span></p><p><span style="font-family:Verdana, Geneva, sans-serif;font-size:10pt;">this is the test</span></p><p><span style="font-family:Verdana, Geneva, sans-serif;font-size:10pt;">for multi line </span></p><p><span style="font-family:Verdana, Geneva, sans-serif;font-size:10pt;">& </span></p>

Is there a way in PB to strip off these HTML tags?
  1. Brad Mettee
  2. Tuesday, 12 April 2022 14:47 PM UTC
It sounds more like you're looking for an HTML parser, not a decoder. Each paragraph (p tag) represents one line, and each span represents one element on that line. Is that about right? If so, your easiest way will be to parse for each p first, then each span within it, and dump the data into your DataWindow. I don't know of any DLLs offhand that can do HTML parsing like you need, but doing it in PB isn't that hard (I've done something similar).
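The parse-by-paragraph idea can be sketched as follows. This is Python rather than PB, purely to show the logic, and the names `ParagraphLines` and `paragraphs_to_lines` are hypothetical. It assumes each `<p>` is one output line and that the text of all spans inside it is simply concatenated.

```python
from html.parser import HTMLParser


class ParagraphLines(HTMLParser):
    """Hypothetical parser: collects the text of each <p> as one line."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []        # one entry per completed paragraph
        self._current = None   # text pieces of the paragraph being read

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._current = []

    def handle_data(self, data):
        # Text inside spans arrives here; keep it only while inside a <p>.
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._current is not None:
            self.lines.append("".join(self._current).strip())
            self._current = None


def paragraphs_to_lines(markup):
    parser = ParagraphLines()
    parser.feed(markup)
    parser.close()
    return parser.lines


sample = (
    '<p><span style="font-size:10pt;">description of change</span></p>'
    '<p><span style="font-size:10pt;">this is the test</span></p>'
)
for line in paragraphs_to_lines(sample):
    print(line)
```

Each resulting line could then be joined with a line break or written to a separate row, depending on how the text should appear in the DataWindow.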