This is the mail archive of the guile@cygnus.com mailing list for the guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: hash table subtleties


Tel <telford@eng.uts.edu.au> writes:

 > > The magic ratio is chosen so that the expected mean nonempty bucket size
 > > after upsize is the same as it is after downsize.  (and then, e.g. if a
 > > user inserts just enough entries for the table to grow to 4096 buckets, he
 > > must delete half of them before it will shrink back down to 2048). 
 > 
 > This sounds like the right idea, it will prevent overly frequent resize.
 > I don't follow the reasoning behind your magic ratio but if it works then
 > it works.
 > 
 > > I look at the number of nonempty buckets rather than the total number of
 > > buckets since empty buckets have no effect on lookup, re-insertion, and
 > > deletion (assuming you're using keys that already exist in the table).
 > 
 > Hmmm, since you resize by powers of two, each single bucket in the small
 > table splits into exactly two buckets in the large table so a bucket that is
 > empty in the small table is guaranteed to stay empty after a grow.

Good points (including the one's I've deleted).  But, I'd consider
this a good reason why one should use prime sized hash tables - then
each bucket is more likely to be randomly distributed throughout the
hashtable when resizing instead of just getting split in 2.  OTOH,
it's not clear whether doing this & losing the bitvectors is better or
worse than using powers of 2 & not losing the bitvectors.

-- 
Harvey J. Stein
BFM Financial Research
hjstein@bfr.co.il