[rfa/dwarf] Support for attributes pointing to a different CU

Jim Blandy jimb@redhat.com
Tue Oct 5 05:07:00 GMT 2004


Daniel Jacobowitz <drow@false.org> writes:
> On Mon, Oct 04, 2004 at 04:14:24PM -0500, Jim Blandy wrote:
> > 
> > Daniel Jacobowitz <drow@false.org> writes:
> > > On Wed, Sep 29, 2004 at 12:49:35PM -0500, Jim Blandy wrote:
> > > > Since we never toss types anyway, would it make sense to move
> > > > type_hash to dwarf2_per_objfile?
> > > 
> > > I don't think so.  type_hash is used in two ways: individual items are
> > > set, when we know which CU we ought to have, and a whole CU is
> > > restored, when we know which CU we're restoring.  It's always more
> > > efficient to have a lot of small hash tables if you know precisely
> > > which one you'll need; fewer collisions.
> > 
> > Really?  libiberty/hashtab.c is a resizing hash table; I thought hash
> > table resizing was supposed to keep the collision rate roughly
> > constant (modulo hysteresis) regardless of the number of elements.  If
> > that's not so, doesn't that mean your hash function isn't doing its
> > job spreading the elements across the (adequately sized) table?
> 
> Poor job of thinking on my part, there.  The rest of the paragraph
> still makes sense to me, though.  For instance, in a resizing hash
> table, I suspect that there is more copying to have one large
> expandable hash table than several small ones.
> 
> I haven't done the math for that, of course.  Maybe I've got it
> backwards.  Do you think there would be any advantage to doing it the
> other way round?

The only advantage I had in mind was simplicity, and it didn't seem
like it'd be a performance hit.

The libiberty hash table expands by a given ratio each time, which
means that, overall, the number of rehashings per element is constant
no matter how large the table gets.  It's similar to the analysis that
shows that ye olde buffer doubling trick ends up being linear time.
(I'm thinking on my own here, not quoting anyone, so be critical...)
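
To make that amortized argument concrete, here is a small sketch that
only simulates table sizes; it is not libiberty's code.  The name
count_rehashes, the initial size of 4, and the exact doubling are all
assumptions for illustration (hashtab.c picks its own sizes).  It
counts how many element moves occur while n elements are inserted
into a table that doubles whenever it fills up.

```c
#include <stddef.h>

/* Count how many element moves (rehashings) happen while N elements
   are inserted into a table that doubles each time it becomes full.
   Only the sizes are simulated; no real table is allocated.
   Hypothetical helper, not part of libiberty or GDB.  */
unsigned long
count_rehashes (unsigned long n)
{
  unsigned long capacity = 4;   /* assumed initial size */
  unsigned long used = 0;
  unsigned long rehashes = 0;
  unsigned long i;

  for (i = 0; i < n; i++)
    {
      if (used == capacity)
        {
          /* Growing the table moves every live element once.  */
          rehashes += used;
          capacity *= 2;        /* assumed growth ratio */
        }
      used++;
    }

  return rehashes;
}
```

Under these assumptions the moves form the series 4 + 8 + 16 + ...,
whose largest term is below n, so count_rehashes (n) stays below
2 * n: fewer than two moves per element, however large n gets.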

There could be a locality disadvantage to doing it all in one big hash
table.  When the time comes to restore a given CU's types, their table
entries will be sharing cache blocks with those of other, irrelevant
CUs.  That doesn't happen if we use per-CU hash tables: table
entries will never share cache blocks with table entries for other
CUs (assuming the tail of one table doesn't share a block with the
head of another, blah blah blah...).

I'm concerned about the legacy of complexity we'll leave.  Wrinkles
should prove they can pay their own way.  :)


