This is the mail archive of the guile@cygnus.com mailing list for the guile project.
> Leaving a mark has the problem that a lot of insertions and deletions
> which don't substantially change the number of keys (i.e. which don't
> force a rehash) cause all the empty slots to change over to marks.

I'd be curious to know the results of a mathematical analysis of this.
First, deleted slots are of course equivalent to empty slots for
insertion (once we have verified that the key is not already in the
table).  Secondly, in practice, I think most tables have a relatively
low number of deletions.  Finally, you only need to leave a mark when
the "next" element in a scan is non-empty.  That is easy to determine
when doing a linear scan, but of course linear scanning is bad because
of clustering.  If scans are semi-random (i.e. use a secondary hash),
then figuring out whether a deletion mark is needed may be impractical.

> Generally the non-linked hash table is more memory efficient
> but slower (considering garbage collection overhead, maybe faster).

But note that because the non-linked hash table saves space, you can
(and in fact should) allocate more slots/buckets, and hence collisions
are less likely.

In any case, the various tradeoffs are complicated, and beyond my
knowledge and ability to analyze.  I'd be interested in an overview of
the various results, written by somebody familiar with the literature,
but written for a practitioner needing recommendations, not theorems.

Another complication: there are hash algorithms that are guaranteed
*always* (i.e. not amortized) O(1) (assuming the hashing is
appropriately random).  They do this by resizing incrementally.  I
vaguely remember one of these algorithms is called additive hashing.

A Computing Surveys-type article would be great.

	--Per Bothner
Cygnus Solutions     bothner@cygnus.com     http://www.cygnus.com/~bothner
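[To make the deletion-mark discussion above concrete, here is a minimal
sketch of an open-addressed table with tombstones, in Python.  This is a
hypothetical illustration, not Guile's implementation; names like
`OpenTable` and the linear probe sequence are my own choices.  It shows
the two points made above: a deleted slot is reusable for insertion once
the scan has verified the key is absent, and a lookup must scan *past*
marks rather than stopping at them.]

```python
# Open-addressed hash table with deletion marks ("tombstones").
# Hypothetical sketch only; fixed size, linear probing for simplicity.

EMPTY = object()    # slot never used: a scan may stop here
DELETED = object()  # slot freed by a deletion: a scan must continue past it

class OpenTable:
    def __init__(self, nslots=8):
        self.slots = [EMPTY] * nslots  # live slots hold (key, value) pairs

    def _probe(self, key):
        """Linear probe sequence starting at the key's home slot."""
        i = hash(key) % len(self.slots)
        while True:
            yield i
            i = (i + 1) % len(self.slots)

    def lookup(self, key, default=None):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is EMPTY:
                return default          # key cannot be further along
            if slot is not DELETED and slot[0] == key:
                return slot[1]          # a mark does NOT stop the scan

    def insert(self, key, value):
        # Remember the first reusable slot, but keep scanning until EMPTY
        # to verify the key is not already in the table.
        first_free = None
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is EMPTY:
                target = first_free if first_free is not None else i
                self.slots[target] = (key, value)
                return
            if slot is DELETED:
                if first_free is None:
                    first_free = i      # deleted slot is as good as empty
            elif slot[0] == key:
                self.slots[i] = (key, value)  # key present: overwrite
                return

    def delete(self, key):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is EMPTY:
                return
            if slot is not DELETED and slot[0] == key:
                self.slots[i] = DELETED  # leave a mark so later scans continue
                return
```

The cost the quoted text worries about shows up in `lookup`: once many
slots hold `DELETED`, unsuccessful lookups scan further before hitting
`EMPTY`, even though the table holds few live keys.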