Re: [PATCH 3/3] Add a simple array for CU abbreviations with low codes

On 12/12/2013 09:07 AM, Petr Machata wrote:
> Josh Stone writes:
>>  struct Dwarf_CU
>>  {
>> @@ -283,7 +285,9 @@ struct Dwarf_CU
>>    size_t type_offset;
>>    uint64_t type_sig8;
>> -  /* Hash table for the abbreviations.  */
>> +  /* Simple array for the abbreviations with low codes.  */
>> +  Dwarf_Abbrev *abbrev_array[CU_ABBREV_ARRAY_SIZE];
>
> This blows up Dwarf_CU from 100-odd bytes to something like half a
> page, and I'm not entirely fond of that.  Especially since the hash
> table that follows is an on-demand-growing structure, having half a
> page reserved (possibly on the stack) just in case seems wasteful.
>
> I'm looking into some debuginfo files.  libc has an average of 28
> abbreviations per CU (with 108 being the most), libstdc++ an average of
> 77 (with 108 the most), gcc 88 (142), libboost_python 206 (290), vmlinux
> 112 (185).  So reserving space for 256 seems overly generous.  I'd be
> fine with a growable heap-allocated array capped at 256.  Hopefully that
> would still be helpful performance-wise.

OK, I'll experiment with less generous static sizes, on the heap, as
well as dynamic/realloced size (still capped though).
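
A minimal sketch of the growable, capped variant might look like the
following.  All names here (struct abbrev, struct abbrev_cache,
cache_find, cache_insert, ABBREV_ARRAY_CAP) are hypothetical and not
taken from the patch; the starting size of 16 and the doubling growth
are assumptions chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap, matching the 256 discussed in this thread.  */
#define ABBREV_ARRAY_CAP 256

/* Stand-in for the real Dwarf_Abbrev; only the code matters here.  */
struct abbrev { unsigned int code; };

struct abbrev_cache
{
  struct abbrev **slots;  /* NULL until the first insert.  */
  size_t nslots;          /* Allocated slots, never above the cap.  */
};

/* Return the cached entry, or NULL if it is absent or out of range.  */
static struct abbrev *
cache_find (const struct abbrev_cache *c, unsigned int code)
{
  return code < c->nslots ? c->slots[code] : NULL;
}

/* Store AB.  Return 0 on success, -1 if the code is at or above the
   cap or allocation fails; a caller would then use the hash table.  */
static int
cache_insert (struct abbrev_cache *c, struct abbrev *ab)
{
  if (ab->code >= ABBREV_ARRAY_CAP)
    return -1;

  if (ab->code >= c->nslots)
    {
      /* Grow by doubling from an assumed initial size of 16.  */
      size_t n = c->nslots ? c->nslots : 16;
      while (n <= ab->code)
        n *= 2;
      assert (n <= ABBREV_ARRAY_CAP);

      struct abbrev **p = realloc (c->slots, n * sizeof *p);
      if (p == NULL)
        return -1;

      /* Zero only the newly added tail so old entries survive.  */
      memset (p + c->nslots, 0, (n - c->nslots) * sizeof *p);
      c->slots = p;
      c->nslots = n;
    }

  c->slots[ab->code] = ab;
  return 0;
}
```

With this shape a CU that only ever sees small codes pays for 16
pointers rather than 256, and codes at or above the cap never touch the
array at all.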

I also had the idea that maybe lookup can sometimes skip the modulus,
which was by far the hottest instruction in lookup(), at ~72%.  That
basically means moving my "is-it-a-small-code" branch into the hash
lookup.  So everything would stay in the hash table, hopefully just a
bit faster.
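
As a rough illustration of the modulus-skipping idea, assuming a table
indexed by hval % size: whenever hval is already below size, the
remainder equals hval itself, so the division can be replaced by a
compare.  The helper name slot_for is hypothetical, and this is only a
sketch, not the actual lookup() code.

```c
#include <assert.h>
#include <stddef.h>

/* Map HVAL to a slot index in a table of SIZE entries.  For
   hval < size the result is identical to hval % size, so the
   (expensive) division only runs for large values.  */
static inline size_t
slot_for (size_t hval, size_t size)
{
  assert (size != 0);
  return hval < size ? hval : hval % size;
}
```

Since abbreviation codes are mostly small, the cheap branch would be
expected to be taken for most lookups, but that would need measuring.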

More things to try, anyway...

