[WIP] New dwarf2 reader - updated 07-02-2001

Daniel Berlin dan@cgsoftware.com
Mon Jul 30 23:06:00 GMT 2001


Jim Blandy <jimb@zwingli.cygnus.com> writes:

> (I moved last week; finally have time to look at the new reader.)
> 
> Here are some questions that came to mind on first reading.  I haven't
> read everything thoroughly yet, so I could well be just
> misunderstanding what's going on.
> 
> - MD5 generates a 128-bit checksum.  Any reason you only use the first
>   32 bits in your hash?  It looks like checksum_die and
>   checksum_partial_die both cut it off at eight digits.
I use the first 64 bits, just like GCC does for checksumming DIEs and
duplicate elimination.  Though I'll grant you that GCC also includes the
name attached to the DIE.  I'll happily make the two algorithms
*exactly* the same if you like.


> 
> - Why are you using a splay tree keyed by MD5 hash values, instead of
>   a hash table?  The only advantage of a splay over a hash table is
>   ordered retrieval, but your keys don't have any meaningful order.
I hate libiberty's hash table interface; it's annoying to use,
IMHO.
Other than that, no reason.  I was meaning to make a nice generic
dictionary interface, and talked with DJ about whether I could just
use libdict, but since it's not GPL, and I don't have time to
duplicate its nice interface right now, I just went with what was
easiest to program.
> 
> - By using the MD5 checksum as your search key, you're guaranteed to
>   lose if you get an MD5 collision.  I agree that a collision is very
>   unlikely (or would be, if you were using all 128 bits), but it seems
>   icky.

Well, if it makes you feel less uneasy, I'll move to the full 128 bits or
make the two algorithms (GCC's and GDB's) exactly the same.
Your chances of colliding are still so amazingly small it's not even
funny.

Considering it was thought to be good enough for global duplicate
elimination by the DWARF2.1 committee, I think it's good enough for
us.
Besides, you'd need more than one collision in a single file to have
any real effect on the debug info.
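
For a rough sense of scale, the usual birthday-bound approximation puts
the chance of any collision among n keys of b bits at about
n(n-1)/2^(b+1).  A minimal sketch, assuming the truncated checksums are
uniformly distributed; the function name is illustrative and not part
of the patch:

```c
#include <assert.h>

/* Birthday-bound approximation: probability that at least two of N
   uniformly distributed keys of BITS bits collide, taken as
   N * (N - 1) / 2^(BITS + 1).  Only meaningful while the result is
   well below 1.  */
static double
collision_probability (double n, int bits)
{
  double space = 1.0;
  int i;

  assert (bits >= 0);

  /* Compute 2^BITS by repeated doubling, so no libm is needed.  */
  for (i = 0; i < bits; i++)
    space *= 2.0;

  return n * (n - 1.0) / (2.0 * space);
}
```

For a million DIEs this comes to roughly 2.7e-8 with 64-bit keys and on
the order of 1e-27 with all 128 bits.  With only 32 bits the formula
exceeds 1, which just means a collision is all but certain at that size.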

> 
> - Assuming MD5 is perfect, you don't checksum all attribute forms
>   (DW_FORM_flag, for example).  This means that you can get false
>   duplicates, if two dies differ only in (say) a flag's value.  How is
>   read_comp_unit_dies set up to tolerate that?
I only checksummed the stuff GCC generates (since I based the
checksumming code on GCC's), and I forgot that GCC
doesn't generate some forms.


> 
> - You never clear dwarf2_symbol_splay.  This might be okay if you
>   checksummed the complete contents of the die (I'm not sure about
>   that), but you don't include block contents in the checksum ---
>   process_attribute includes the pointer to the `struct dwarf2_block'
>   in the checksum, not the block's actual contents.  So if we have two
>   dies which differ only in their block contents, and the `struct
>   dwarf2_block' objects which hold those contents happen to get
>   allocated at the same address, by different calls to
>   dwarf2_psymtab_to_symtab, you'll get a false match again.
Bug.
On a side note, though, I don't think I've ever seen DIEs that differ only in
block content, as the things that have attributes for which we can have blocks
(values, locations, bounds) make this *highly* unlikely to occur when you
factor in the offsets.
That's probably why I never noticed.
I should probably make up a gas file that generates a few such DIEs, to
make sure we do the right thing.
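
The pointer-versus-contents problem can be sketched as below.  This is
not the code from the patch: `struct block` is a simplified stand-in
for `struct dwarf2_block', FNV-1a stands in for MD5 to keep the example
self-contained, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for gdb's struct dwarf2_block (assumption).  */
struct block
{
  size_t size;
  const unsigned char *data;
};

/* FNV-1a offset basis; FNV-1a is used here only in place of MD5.  */
#define HASH_INIT 0xcbf29ce484222325ULL

/* Fold LEN bytes at P into the running hash H.  */
static uint64_t
hash_bytes (uint64_t h, const void *p, size_t len)
{
  const unsigned char *s = p;
  size_t i;

  for (i = 0; i < len; i++)
    {
      h ^= s[i];
      h *= 0x100000001b3ULL;
    }
  return h;
}

/* The bug: only the address of the block is hashed, so the result
   depends on where the block was allocated, not on what it holds.  */
static uint64_t
checksum_block_pointer (uint64_t h, const struct block *blk)
{
  return hash_bytes (h, &blk, sizeof blk);
}

/* The fix: hash the length and then the bytes themselves.  */
static uint64_t
checksum_block_contents (uint64_t h, const struct block *blk)
{
  assert (blk != NULL);
  h = hash_bytes (h, &blk->size, sizeof blk->size);
  return hash_bytes (h, blk->data, blk->size);
}
```

With the pointer version, a block whose storage is reused for different
contents hashes the same as before; the contents version tracks the
bytes and ignores the address.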


> 
> - In read_comp_unit_dies, when you find a duplicate die, you skip to
>   its sibling.  What if the parent die is identical, but the children
>   dies differ?

I don't believe this is possible in any language.
It would require something to have the same everything, but be
different, i.e. same name, same location in memory, same tag, same
declaration line number, same type, same offsets for everything, etc.,
yet be a different thing.
If you like, I'll recursively process the child DIEs and include them
in the checksum, but I think it's pointless.
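
For reference, folding the children into the parent's checksum would
look roughly like this.  It is only a sketch under assumptions: the
`struct die` here carries a bare tag in place of the real attribute
list, FNV-1a stands in for MD5, and `checksum_die_tree` is a
hypothetical name, not a function in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy DIE: TAG stands in for all of the DIE's own attributes.  */
struct die
{
  unsigned tag;
  struct die *child;    /* first child, or NULL */
  struct die *sibling;  /* next sibling, or NULL */
};

/* FNV-1a offset basis; FNV-1a is used here only in place of MD5.  */
#define HASH_INIT 0xcbf29ce484222325ULL

/* Fold LEN bytes at P into the running hash H.  */
static uint64_t
hash_bytes (uint64_t h, const void *p, size_t len)
{
  const unsigned char *s = p;
  size_t i;

  for (i = 0; i < len; i++)
    {
      h ^= s[i];
      h *= 0x100000001b3ULL;
    }
  return h;
}

/* Checksum DIE together with its whole subtree.  A marker byte is
   hashed before each child and after the last one, so the shape of
   the tree is part of the checksum as well as its contents.  */
static uint64_t
checksum_die_tree (uint64_t h, const struct die *die)
{
  const struct die *c;
  unsigned char marker;

  assert (die != NULL);
  h = hash_bytes (h, &die->tag, sizeof die->tag);
  for (c = die->child; c != NULL; c = c->sibling)
    {
      marker = 1;               /* a child follows */
      h = hash_bytes (h, &marker, 1);
      h = checksum_die_tree (h, c);
    }
  marker = 0;                   /* end of children */
  return hash_bytes (h, &marker, 1);
}
```

Two parents with identical attributes then get different checksums as
soon as any descendant differs, at the cost of walking every subtree
once per checksum.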
> 
> Can you set me straight on this stuff?


-- 
"Do you think that when they asked George Washington for ID that
he just whipped out a quarter?"
-Steven Wright


