1. Olan Knight
  2. PowerBuilder
  3. Wednesday, 21 July 2021 17:53 PM UTC

PB2019R3, b2703
Windows 10 64 bit platform


We have a utility that performs an ImportFile(). If the file is too large, it instead opens the source file and uses FileReadEx() to read the data line by line into the DW.

These source files are LARGE, and this process takes forever - and sometimes crashes the utility.

I want to speed this up and thought about STREAMING the data into the DW.

Is that possible? Anyone know how to do this?

 

FileOpen(streammode)
FileReadEx(file#) into a blob
FileWriteEx () <--- ???

Each line of the source file is 210 bytes of text data with no embedded NULLs.
The end result should be the text data from the source file in the DW.
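For reference, here is a minimal sketch of the stream-mode read being described, with hypothetical file, DataWindow, and variable names. Note that FileWriteEx() only writes a blob back out to a file; getting the text into the DW still needs an import step such as ImportString(), assuming the data is in a format ImportFile()/ImportString() already accepts:

integer li_file
long    ll_bytes
blob    lblb_data
string  ls_data

// Open the source file in stream mode and pull it into a blob in one
// call (FileReadEx is not limited to 32K the way FileRead is).
li_file = FileOpen("C:\data\source.txt", StreamMode!, Read!)
IF li_file = -1 THEN RETURN
ll_bytes = FileReadEx(li_file, lblb_data)
FileClose(li_file)

// FileWriteEx() would only write the blob to another file; to land the
// text in the DataWindow, convert and import it (assuming ANSI text).
ls_data = String(lblb_data, EncodingANSI!)
dw_1.ImportString(ls_data)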


Thank You.

Accepted Answer
Roland Smith
  1. Wednesday, 21 July 2021 19:12 PM UTC
  2. PowerBuilder
  3. # Permalink

Try using Oracle SQL*Loader to bulk load it into a work table designed specifically for this purpose. Truncate the existing rows first, then use my RunAndWait object to run the loader. When it completes, you can move the data from the work table into the destination tables.
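A rough sketch of how that might be driven from PowerScript, with hypothetical table, path, and credential names. Roland's RunAndWait object is not shown here, so the built-in Run() function stands in for it; unlike RunAndWait, Run() returns immediately rather than waiting for sqlldr to finish:

string ls_sql, ls_cmd

// Clear out the previous load (hypothetical work table name)
ls_sql = "TRUNCATE TABLE load_work"
EXECUTE IMMEDIATE :ls_sql USING SQLCA;
IF SQLCA.SQLCode < 0 THEN RETURN

// Launch SQL*Loader against a control file that maps the 210-byte
// records to the work table's columns (paths/credentials are placeholders)
ls_cmd = "sqlldr userid=scott/tiger@orcl control=C:\load\load_work.ctl " + &
         "data=C:\load\source.dat log=C:\load\load_work.log"
Run(ls_cmd, Minimized!)

// Once the loader has finished (this is where RunAndWait earns its keep),
// move the rows from LOAD_WORK into the destination tables with ordinary
// INSERT ... SELECT / UPDATE statements.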

Comment
  1. Chris Pollach @Appeon
  2. Thursday, 22 July 2021 12:24 PM UTC
Even faster than that, Roland, is Oracle's "External Table" feature. You basically define a table that matches the layout of the external file and then point the external table at that file. No importing. You then access the file data with standard DML verbs. Oracle's built-in file handling does it all transparently for you. I've used this feature before and it's super fast on gigabytes of file data!
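A rough sketch of the one-time setup Chris describes, with hypothetical directory, table, and column names; the DDL would normally be run once by a DBA rather than from the application, but it is shown through dynamic SQL here to keep the example in PowerScript:

string ls_ddl

// Map the 210-byte-per-line text file to an Oracle external table.
// LOAD_DIR is an Oracle DIRECTORY object pointing at the folder that
// holds source.dat (both names are assumptions for this sketch).
ls_ddl = "CREATE TABLE src_ext ( rec_data VARCHAR2(210) ) " + &
         "ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER " + &
         "DEFAULT DIRECTORY load_dir " + &
         "ACCESS PARAMETERS ( RECORDS DELIMITED BY NEWLINE " + &
         "FIELDS ( rec_data CHAR(210) ) ) " + &
         "LOCATION ('source.dat') )"
EXECUTE IMMEDIATE :ls_ddl USING SQLCA;

// From then on, SELECT * FROM src_ext reads the file in place, so a
// DataWindow or an INSERT ... SELECT can treat it like any other table.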
John Fauss
  1. Thursday, 22 July 2021 01:43 AM UTC
  2. PowerBuilder
  3. # 1

Olan - 

Instead of 32K "chunks", have you considered using FileReadEx to read, say, 100,000 rows' worth of data with one FileReadEx call, writing that larger chunk to a temp file with one FileWriteEx, importing and processing those 100,000 rows, then wash, rinse, repeat?

Best regards,
John
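A minimal PowerScript sketch of the chunked approach John describes, assuming fixed 212-byte lines (210 bytes of data plus CR/LF), ANSI text, and hypothetical file and DataWindow names:

integer li_src, li_tmp
long    ll_chunk, ll_read
blob    lblb_chunk

ll_chunk = (210 + 2) * 100000   // 100,000 fixed-length rows per read

li_src = FileOpen("C:\data\source.txt", StreamMode!, Read!)
IF li_src = -1 THEN RETURN

DO
   // The optional third argument limits how many bytes FileReadEx reads
   ll_read = FileReadEx(li_src, lblb_chunk, ll_chunk)
   IF ll_read <= 0 THEN EXIT

   // Write the chunk to a temp file, then import it in one shot
   li_tmp = FileOpen("C:\data\chunk.txt", StreamMode!, Write!, LockWrite!, Replace!)
   FileWriteEx(li_tmp, BlobMid(lblb_chunk, 1, ll_read))
   FileClose(li_tmp)

   dw_1.ImportFile("C:\data\chunk.txt")
   // ... process the imported rows here, then loop for the next chunk
LOOP WHILE ll_read = ll_chunk   // a short read means end of file

FileClose(li_src)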

Comment
  1. Roland Smith
  2. Thursday, 22 July 2021 12:42 PM UTC
The problem with 'chunks' is that it will probably end in the middle of the last row.
  1. John Fauss
  2. Thursday, 22 July 2021 14:41 PM UTC
ImportString is a good suggestion, Rene! Roland, if the lines in the source file can be of varying length, then yes, certainly, the logic that reads the file would have to take that complication into account. However, if each line contains exactly 210 bytes of data (as Olan states in his question) and each line is delimited by a CR/LF (that would need to be verified, of course), then each "chunk" of 100,000 rows (for example) occupies (210 + 2) * 100,000 = 21,200,000 bytes.

It's been quite some time since I've had to do this, but one may also have to take the Byte Order Mark (BOM) at the start of the file into account when reading the file in stream mode.

If importing the entire file is not workable due to its size, if reading/importing line by line is not workable due to the sheer quantity of records/lines, and no native "bulk" data operation in the database is available, then what alternatives are there?
  1. Miguel Leeuwe
  2. Thursday, 22 July 2021 15:22 PM UTC
Good points all,

To avoid ending in the middle of the last line, your 'chunk' would have to be a multiple of the line length. I'm not sure that would always work, though, as with a different encoding a row containing characters like "£" might end up somewhat longer in bytes.

I guess the most efficient way would be to use the DB utilities; maybe even a Data Pipeline object could be a solution here (if speed is not the most important factor).
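For variable-length rows, one common way around the chunk-boundary issue Roland and Miguel describe is to cut each chunk at its last line break and carry the leftover bytes into the next read. A rough sketch, assuming ANSI text delimited by CR/LF and hypothetical names:

integer li_src
long    ll_read, ll_pos
blob    lblb_chunk
string  ls_text, ls_carry

li_src = FileOpen("C:\data\source.txt", StreamMode!, Read!)
IF li_src = -1 THEN RETURN

DO
   ll_read = FileReadEx(li_src, lblb_chunk, 21200000)   // roughly 21 MB per read
   IF ll_read <= 0 THEN EXIT

   // Prepend whatever was carried over from the previous chunk
   ls_text = ls_carry + String(BlobMid(lblb_chunk, 1, ll_read), EncodingANSI!)

   // Split at the last complete line; keep the tail for the next pass
   ll_pos = LastPos(ls_text, "~r~n")
   IF ll_pos > 0 THEN
      ls_carry = Mid(ls_text, ll_pos + 2)
      ls_text  = Left(ls_text, ll_pos + 1)
   ELSE
      ls_carry = ls_text
      ls_text  = ""
   END IF

   IF Len(ls_text) > 0 THEN dw_1.ImportString(ls_text)
LOOP WHILE ll_read > 0

// Import any final line that had no trailing CR/LF
IF Len(ls_carry) > 0 THEN dw_1.ImportString(ls_carry)
FileClose(li_src)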
Olan Knight
  1. Wednesday, 21 July 2021 19:49 PM UTC
  2. PowerBuilder
  3. # 2

Chris & Roland, thank you for your quick responses!


Roland -
   I'm currently using Oracle 12c but will be migrating to PostgreSQL. I cannot use any tool other than PowerBuilder or Java, and I'd rather use PB. Can I create a LOAD table in Oracle to accept the blob? How should the table be structured? My target DWO structure is listed below.


Chris -
That I understand, but I'm not sure about writing that 32K Chunk O' Data into the DWO. Can I write it into a DWO made up of various columns? If so, will the data be converted from blob back into the correct formats? In other words, can I just FileWriteEx(the_blob) into a DW control that uses the DWO below?


Here's the DWO structure:

Comment
  1. Chris Pollach @Appeon
  2. Wednesday, 21 July 2021 21:51 PM UTC
Roland's DBMS utility LOAD suggestion is the best. I would hope that PostgreSQL has something like that as well. In my experience, the DBMS utilities are normally the best option for loading large result sets of data.
  1. Roland Smith
  2. Thursday, 22 July 2021 03:53 AM UTC
PostgreSQL has a COPY SQL command that can be used to bulk load data into a table or export data from a table.

https://www.postgresql.org/docs/10/sql-copy.html

Example:

COPY tablename FROM 'filename';

The file must be on the server where the database is running.
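A rough sketch of issuing that from PowerScript once the PostgreSQL migration happens, with hypothetical table and server-side path names; because COPY ... FROM reads a file on the database server, the connected role also needs the corresponding file-read privilege:

string ls_sql

// Bulk load the server-side file into a work table via dynamic SQL
ls_sql = "COPY load_work FROM '/var/lib/pgdata/source.dat'"
EXECUTE IMMEDIATE :ls_sql USING SQLCA;

IF SQLCA.SQLCode < 0 THEN
   MessageBox("COPY failed", SQLCA.SQLErrText)
END IF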
Chris Pollach @Appeon
  1. Wednesday, 21 July 2021 18:13 PM UTC
  2. PowerBuilder
  3. # 3

Hi Olan;

  FWIW:  If you use the FileRead() command you can read large files in 32K "chunks". I am not sure if that helps you in this situation. Just an FYI.

Regards ... Chris
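For completeness, a minimal sketch of that FileRead() loop in stream mode, with hypothetical names; each call returns at most 32,765 bytes, and a return value of 0 or the -100 end-of-file code ends the loop:

integer li_file
long    ll_read
blob    lblb_piece, lblb_all

li_file = FileOpen("C:\data\source.txt", StreamMode!, Read!)
IF li_file = -1 THEN RETURN

ll_read = FileRead(li_file, lblb_piece)
DO WHILE ll_read > 0
   lblb_all = lblb_all + lblb_piece   // accumulate the 32K pieces
   ll_read = FileRead(li_file, lblb_piece)
LOOP
FileClose(li_file)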
