1. Dan Cooperstock
  2. PowerBuilder
  3. Sunday, 28 March 2021 13:53 PM UTC

Some years ago I saw the value in having an object that implements dictionary / associative array / map type functionality, mapping string names to arbitrary values, including strings, numbers, dates, and objects.

That could be used for passing sets of values into windows via OpenWithParm(), without having to create a different structure for each different set of values, or use some more generic structure that doesn't have named values. Sometimes I also used it for passing sets of values directly into functions, without having to have very long lists of arguments.

I came up with an implementation using a DataStore with a String key column and a Long index column, where the index pointed into an array of Any that held the actual values. There would have been no way to hold arbitrary values directly in a DataStore column. And I needed the index column so that I could sort the DS on the names, and the indexes into the Any array would still be correct.
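To make the layout concrete, here is a minimal sketch in Python rather than PowerScript. The class and method names are hypothetical, a list of (key, index) pairs stands in for the DataStore rows, and a plain list stands in for the array of Any. It only illustrates why sorting the key rows does not disturb the stored values.

```python
class IndexedDictionary:
    """Keys live in sortable (key, index) rows; values live in a separate list."""

    def __init__(self):
        self.rows = []    # stands in for the DataStore: (key, index) pairs
        self.values = []  # stands in for the array of Any

    def set(self, key, value):
        # Overwrite the value in place if the key already exists.
        for name, idx in self.rows:
            if name == key:
                self.values[idx] = value
                return
        # Otherwise append the value and record where it landed.
        self.values.append(value)
        self.rows.append((key, len(self.values) - 1))

    def get(self, key):
        # Plain linear scan over the key rows.
        for name, idx in self.rows:
            if name == key:
                return self.values[idx]
        raise KeyError(key)

    def sort(self):
        # Reorders the key rows only; each row carries its index with it.
        self.rows.sort(key=lambda row: row[0])


params = IndexedDictionary()
params.set("zeta", 3.5)
params.set("alpha", "hello")
params.sort()  # rows are reordered, values stay where they were
```

After the sort, the "alpha" row comes first but still carries index 1, so both lookups return the values originally stored.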

Although I don't entirely recall my line of thinking when I wrote this years ago, I must have been worried that a DW Find() call to find the keys in the DataStore could be too slow, since I assumed it was basically a linear search. (One reason that was dumb is that I was almost always using a very small number of keys in the object.)

So to work around that worry, I wrote my own binary search on the keys in the DS, with a fallback linear search (using GetItem calls) in case somehow I had gotten to that lookup function without sorting the DS (which I didn't do after every add, for efficiency, but only at certain points).
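The lookup described above could be sketched roughly like this, again in Python and not as the original PowerScript. The function name and the `is_sorted` flag are assumptions made for illustration. The binary search is only valid when the rows were sorted with the same comparison that the search itself uses.

```python
def find_index(rows, key, is_sorted):
    """Return the value index stored for key, or -1 if the key is absent.

    rows is a list of (key, index) pairs. If is_sorted is False, fall back
    to a linear scan, because a binary search on unsorted rows is unreliable.
    """
    if is_sorted:
        lo, hi = 0, len(rows) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            name, idx = rows[mid]
            if name == key:
                return idx
            if name < key:
                lo = mid + 1   # target can only be in the upper half
            else:
                hi = mid - 1   # target can only be in the lower half
        return -1

    # Fallback: check every row in turn.
    for name, idx in rows:
        if name == key:
            return idx
    return -1


sorted_rows = [("a", 2), ("b", 0), ("c", 1)]
unsorted_rows = [("c", 1), ("a", 2), ("b", 0)]
```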

The other day, it occurred to me to find out whether using Find() instead was really that much slower. I wrote some test code that inserted a large number of keys and values (1200), then timed, in milliseconds, looking up each of the values in turn in the index. The binary search method took 27 milliseconds, while Find() took 38. Obviously acceptable!

In addition, I thought I should check that they were actually getting the same results, which I did by adding up the indexes returned in all of those lookups, using the two methods. Uh oh - the results were different! So I re-routed the code to use my hand-coded linear search using GetItem, and it got the same total as the Find() method. That made me suspect errors in the binary search.
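A test harness along those lines might look like the following Python sketch. The helper names and the generated key format are made up for illustration, and the timings it produces say nothing about PowerBuilder. The point is the pattern: time each lookup method over the same keys, and sum the returned indexes so that two methods can be compared for agreement as well as speed.

```python
import bisect
import time


def time_lookups(lookup, keys):
    """Run lookup(key) for every key; return (elapsed ms, sum of indexes)."""
    start = time.perf_counter()
    checksum = 0
    for key in keys:
        checksum += lookup(key)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, checksum


# 1200 zero-padded keys, so they are already in ascending string order.
keys = [f"key{n:04d}" for n in range(1200)]
rows = [(key, position) for position, key in enumerate(keys)]
names = [name for name, _ in rows]


def linear_lookup(key):
    for name, idx in rows:
        if name == key:
            return idx
    return -1


def binary_lookup(key):
    pos = bisect.bisect_left(names, key)
    if pos < len(names) and names[pos] == key:
        return rows[pos][1]
    return -1


linear_ms, linear_sum = time_lookups(linear_lookup, keys)
binary_ms, binary_sum = time_lookups(binary_lookup, keys)
print(f"linear: {linear_ms:.1f} ms, binary: {binary_ms:.1f} ms")
print("checksums match:", linear_sum == binary_sum)
```

A mismatch between the two checksums is exactly the kind of signal that exposed the problem here.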

Some debugging into the binary search code found that some key values that were in the DS were not being found! And then some re-reading of the DW Sort() method's Help reminded me that String sorts on DW columns are not in ASCII order, but rather "lexical order". That means that the comparisons that Sort() uses are NOT the same as the results from PB's String comparisons, < and >, which I was using in the binary search code! Bad bug, present for years, but I think we had been fortunate to not use key values in which the bug would show up!
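The failure mode is easy to reproduce outside PB. In this Python sketch, sorting by lowercased key is only a stand-in for the DataWindow's lexical order (an assumption, not a claim about the exact collation Sort() uses), and the plain `<` on strings compares by character code, similar in spirit to the comparisons in the binary search code. The function and variable names are hypothetical.

```python
def binary_search(rows, key, order=lambda s: s):
    """Binary search over (key, index) rows, comparing via order(...).

    Assumes no two keys differ only by letter case.
    """
    lo, hi = 0, len(rows) - 1
    target = order(key)
    while lo <= hi:
        mid = (lo + hi) // 2
        name, idx = rows[mid]
        if name == key:
            return idx
        if order(name) < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


fold = str.lower  # stand-in for the sort's "lexical" comparison
keys = ["apple", "Banana", "cherry", "Date"]

# Rows sorted the way the sort sees them: apple, Banana, cherry, Date.
rows = sorted(((k, i) for i, k in enumerate(keys)), key=lambda r: fold(r[0]))

# Searching with raw character-code comparisons misses keys that exist.
missed = [k for k in keys if binary_search(rows, k) == -1]
print(missed)  # ['apple', 'Date']

# Searching with the same ordering used for the sort finds every key.
found_all = all(binary_search(rows, k, fold) == i for i, k in enumerate(keys))
print(found_all)  # True
```

Keys that are all lowercase (or all uppercase) sort the same way under both orderings, which may be why a bug like this can stay hidden for a long time.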

My next thought was about the Chilkat bundle we use for a lot of other functions. It's a fabulous bundle of almost 100 well-designed and powerful ActiveX controls that we first adopted years ago for its strong SMTP support, then found more and more uses for. It's only 10MB, and very affordable, with royalty-free distribution. (No, I'm not an employee or paid to say this, just an extremely happy customer!) A list of the controls, with links to the docs for each, is at https://www.chilkatsoft.com/refdoc/activex.asp

One of the controls I had never used in that bundle was a HashTable, which of course is a traditional implementation for dictionary-type objects in programming languages. It occurred to me to use it to replace the DataStore, since it would be doing something more like a clean binary search rather than the linear search that the DS Find() has to do. So I coded that, and guess what - 15 milliseconds for the same 1200-item test, almost twice as fast as my broken binary search and more than twice as fast as Find().
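For readers who haven't looked inside one, here is a toy hash table in Python. It is not Chilkat's API or implementation, and all names in it are hypothetical. It just shows the general idea: the key's hash picks a bucket directly, so a lookup needs neither a sorted key list nor any particular string ordering.

```python
class TinyHashTable:
    """Toy hash table using separate chaining (a list of buckets)."""

    def __init__(self, bucket_count=64):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # The hash of the key selects the bucket; no sorting is involved.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (name, _) in enumerate(bucket):
            if name == key:
                bucket[i] = (key, value)  # replace the existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Only the entries in one bucket are compared, using equality only.
        for name, value in self._bucket(key):
            if name == key:
                return value
        return default


table = TinyHashTable()
for position, name in enumerate(["apple", "Banana", "cherry", "Date"]):
    table.put(name, position)
```

Because lookups use only hashing and equality, the mixed-case keys that tripped up a binary search on a differently sorted list are all found here.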

Admittedly I don't believe we ever have that many values in this object, and speed really wasn't a concern. But still it was a fascinating adventure through this code, and a reminder of the saying "When all you have is a hammer, everything looks like a nail." In other words, a DataStore / DataWindow is not always the best solution for every single problem in PB!

Chris Pollach @Appeon
  1. Monday, 29 March 2021 18:32 PM UTC
  2. PowerBuilder

Hi Dan;

  I have written my own DC/DS BTREE Search (find), but that would make a great new DC/DS PB feature IMHO. So much more efficient to locate random values when your key data is sorted.  ;-)

Regards ... Chris

Comment
dan, thanks for publishing the benchmark results (1200 lookups in 1200 rows = 37 milliseconds). i've wondered whether it is worth doing a sort/binary search after a certain number of rows are cached.
  1. mike S
  2. Monday, 29 March 2021 20:51 PM UTC
Hmm, I once read that the PowerBuilder Find() function was blazing fast!
  1. Miguel Leeuwe
  2. Wednesday, 31 March 2021 15:46 PM UTC
Lots of times, I've found code that loops through all rows of a dw/ds and then checks on several field values to do some operation. It's way faster to enter a loop of find()s and do the operation.

just my 2cts
  1. Miguel Leeuwe
  2. Wednesday, 31 March 2021 16:15 PM UTC