1. Markus Schröder
  2. PowerBuilder
  3. Monday, 26 April 2021 14:41 PM UTC

Hello,

I need to find and highlight a user entered value in a large datawindow (~280.000 rows).

Therefore I created a findstring (upper(column1) = value OR upper(column2) = value OR ...).
The search takes about 15 seconds, especially when the value is at the end of the datawindow.

Is there a faster way to search for a value?

John Fauss Accepted Answer Pending Moderation
  1. Wednesday, 6 April 2022 02:16 AM UTC
  2. PowerBuilder
  3. # 1

Hi, Markus - 

If the current search process is able to wade its way through 280,000 rows with multiple OR conditions in only 15 seconds, you should be thrilled with that level of performance.

What's to keep the user from retrieving, then searching say, 500,000 or 1,000,000 rows (aside from running out of process memory, as Miguel mentioned)?

With this much data and a user that has no feel for their data, I think a different user interface design would be better. What if you provided a "search criteria" data input DataWindow in the upper portion of a window that is used to collect the user's search values (a value for each potential result set data column), then below that, a read-only grid that will display the results? The SELECT statement for the grid doesn't create a WHERE clause, so if the user retrieves without supplying any search criteria, they get all of the data (more on that soon). If the user supplies search criteria value(s), it's relatively easy to dynamically modify the grid DW's SELECT statement to append the appropriate WHERE clause conditions and then... the DBMS performs the search and only the rows that satisfy the search criteria are retrieved/displayed! Well-designed indexes in the database can significantly boost search/retrieval performance.

We've done this throughout our app and it works really well. The user can search for the desired value in column "A"... if no hits, they can try column "B", and so on, or any AND'ed combination of columns.

When there is the possibility that a large result set (10,000 rows or more, for example) will be retrieved, the window first builds a SELECT statement that does a "SELECT COUNT(*) FROM table1... WHERE search-criteria..." to see how many rows will be retrieved. This should be VERY efficient for the database to query. If that count comes back and it exceeds the "too large" threshold, the user gets prompted: "280,000 rows will be retrieved. Do you wish to continue? Yes/No/Cancel".

I realize this may sound overwhelming, but the code to implement all of this is pretty easy.

Nearly all databases can search efficiently, so this design allows it to do the searching. Large result sets can be aborted before the actual retrieve takes place, and only the data that matches the user's search criteria gets transmitted from the database to the app. The user can search to his/her/its heart's content with decent performance while avoiding the client-side searching of large result sets.

Maybe something to consider?

Regards, John

Comment
There are no comments made yet.
Tracy Lamb Accepted Answer Pending Moderation
  1. Tuesday, 5 April 2022 22:05 PM UTC
  2. PowerBuilder
  3. # 2

Do you use the pfc libraries in your project?  If you do, you can add the Query service to the datawindow.  Then, you can put the dw in query mode, let the user put whatever he/she wants to query in any of the columns, and retrieve.  I've enhanced mine a little bit to respond to the F9 key for query parameters... the user can enter his query parameter, press F9, then click LIKE% , LIKE %%, etc.  

I also put a "stop" in the dw's RetrieveRow event to stop after every 250 rows and ask the user if he want more rows...

I know this is a very late response, I was searching for help on something else. Maybe this is helpful.

~~~Tracy

 

Comment
There are no comments made yet.
Markus Schröder Accepted Answer Pending Moderation
  1. Wednesday, 28 April 2021 06:03 AM UTC
  2. PowerBuilder
  3. # 3

Thanks again for all the suggestions.

I know that it would be the best not to retrieve so many rows. However, the user often does not know what to look for and uses the result set like an Excel spreadsheet in which he can find values via CTRL-F.

I tried different strings on the datawindow find function (concatination of all columns and pos function, pos function on every single column, like on every single column), but no real speedup.

Since the data in the datawindow is in memory, I thought this would be the quickest way to search here.
But maybe it's better to run a select on a temporary table in the background (although that adds a bit of overhead).

Comment
There are no comments made yet.
Miguel Leeuwe Accepted Answer Pending Moderation
  1. Tuesday, 27 April 2021 15:04 PM UTC
  2. PowerBuilder
  3. # 4

I'm not sure if your application is 32 or 64 bits. But if 32 bits, your 280.000 rows probably use up too much memory and file memory paging might kick in.

 

Comment
There are no comments made yet.
Roland Smith Accepted Answer Pending Moderation
  1. Tuesday, 27 April 2021 12:39 PM UTC
  2. PowerBuilder
  3. # 5

Why retrieve 280,000 rows and then search the resultset? It would be much faster to query the database with the selection criteria and only retrieve the row you are looking for.

Comment
There are no comments made yet.
Marc Wietscher Accepted Answer Pending Moderation
  1. Tuesday, 27 April 2021 06:12 AM UTC
  2. PowerBuilder
  3. # 6

Hi Samuel,

try this one:

create a computed field in your dw in which you concatenate all your columns, in which you want to find your searchstring, seperated by a delimiter and converted to upper case. Embrace the computed field with your delimiter.

Example: '|'+upper(column1)+'|'+upper(column2)+...+'|'

Make sure you take care of null values to prevent your computed field from becoming null.

Then make use of the datawindow expression pos(...,...) in your search. Embrace your searchstring with your delimiter if you want to search for the whole phrase or leave out the delimiter if your phrase may be part of a column.

You could also use binary search, like mentioned by others before. But only if you search for a value in one column only, as you need to sort by this column first. So in your case you need to sort by column1, apply binary search on column1, when no result then sort by column2, apply binary search on column2 and so on.

When using binary search on string columns/computes that contain special characters other than letters and numbers (eg. underscores) be aware that ORDER BY in your SQL may give you different results than dw.sort(), depending on the DBMS you are using, of course.

So always use dw.sort() before applying binary search on a dw/ds or your string comparisons may fail (or rather return unexpected results).

 

Good luck and best regards,

Marc

Comment
There are no comments made yet.
Markus Schröder Accepted Answer Pending Moderation
  1. Tuesday, 27 April 2021 05:18 AM UTC
  2. PowerBuilder
  3. # 7

Hello,

thanks for all the responses and hints.

The search should be a general function for any datawindow.
The user retrieves any datawindow, then searches for a value in all columns/rows by pressing CTRL-F and typing the value in an input box (instant search like in chrome browser, found row and column is highlighted).

I think i try to speed up my function with a binary search in sorted datastore.

Comment
  1. Chris Pollach @Appeon
  2. Tuesday, 27 April 2021 17:33 PM UTC
Hi Samuel;

FWIW: The other alternative that I have used is to write the DWO's result set to a Temporary table where the key searchable columns are all indexed. Then when the App user performs a search, I turn that into a Retrieve() on the DC/DS but redirect the SQL in the SQLPreview event to the TEMP table with the appropriate search / filter column's values in an amended Where clause. Super fast. Just another idea.

Regards ... Chris
  1. Helpful
There are no comments made yet.
Olan Knight Accepted Answer Pending Moderation
  1. Monday, 26 April 2021 20:07 PM UTC
  2. PowerBuilder
  3. # 8

I'm curious about how long it takes to populate that DW with 280,000 rows. My experience telss me that it would take quite a bit of time.

If you absolutely MUST access all that data at one time, then I'd find another way to display the data, and like Andreas said, use SQL to do the specific FIND.

Options:
1)  Display the block of data in a parent dw, and the individual rows in a child DW, the child DW is populated when the user clicks on the appropriate row of the parent DW.

2) Display a search box and only bring up rows that relate to the user-specified SEARCH data. Store the last used SEARCH in the Registry and use it as the next default value in that search box.


Later -

Olan

Comment
There are no comments made yet.
Chris Pollach @Appeon Accepted Answer Pending Moderation
  1. Monday, 26 April 2021 18:40 PM UTC
  2. PowerBuilder
  3. # 9

Hi Samuel;

  FWIW:  For these type of fast searching on super large datum, I have typically written my own "Binary" (aka BTREE) search in PowerScript.

Regards ... Chris

Comment
There are no comments made yet.
Andreas Mykonios Accepted Answer Pending Moderation
  1. Monday, 26 April 2021 15:19 PM UTC
  2. PowerBuilder
  3. # 10

Hi.

You are using or. So it is pretty possible that more than one criteria could match. Except if your columns have distinct values. But you want only the first one?

Have you tried to search using a datastore? Of course the best solution would be doing that using sql... 280.000 rows is a big number, but this may not be the only reason you have to wait so much.

Anyway, if you want more precise answers you should provide more information, ex: number of columns in you datawindows, datatypes of columns you will use in find e.t.c.

Andreas.

Comment
There are no comments made yet.
  • Page :
  • 1


There are no replies made for this question yet.
However, you are not allowed to reply to this question.