[patchv2 2/2] Accelerate lookup_symbol_aux_objfile 14.5x [Re: [patch 0/2] Accelerate symbol lookups 15x]

Doug Evans xdje42@gmail.com
Fri Oct 24 07:16:00 GMT 2014


On Thu, Oct 23, 2014 at 11:24 AM, Jan Kratochvil
<jan.kratochvil@redhat.com> wrote:
> On Wed, 22 Oct 2014 10:55:18 +0200, Doug Evans wrote:
>> For example, the count of calls to dict_hash before/after goes from 13.8M to 31.
>> I'm guessing one thing we're doing here is coping with an artifact of
>> dwz:
>
> During my simple test on non-DWZ file (./gdb itself) it went 3684->3484.
>
> The problem is that dict_hash(val) is called for the same val for each block
> (== symtab).
>
> On DWZ the saving is probably much larger as there are many more symtabs due
> to DW_TAG_partial_unit ones.

Much larger indeed.
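
To make the per-block hashing cost concrete, here is a toy sketch.  It is
not GDB code: every toy_* name is made up for illustration, and the hash
function is an arbitrary stand-in.  It only contrasts hashing the name once
per block with hashing it once per lookup.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for illustration only; none of this is GDB's API.  */

#define TOY_NBUCKETS 8

unsigned long toy_hash_calls;   /* Counts calls to toy_hash.  */

struct toy_sym
{
  const char *name;
  struct toy_sym *next;         /* Next symbol in the same bucket.  */
};

struct toy_dict
{
  struct toy_sym *buckets[TOY_NBUCKETS];
};

/* Arbitrary string hash; the real hash is irrelevant to the point.  */
unsigned int
toy_hash (const char *name)
{
  unsigned int h = 0;

  toy_hash_calls++;
  while (*name != '\0')
    h = h * 31 + (unsigned char) *name++;
  return h;
}

void
toy_dict_add (struct toy_dict *d, struct toy_sym *sym)
{
  unsigned int b = toy_hash (sym->name) % TOY_NBUCKETS;

  sym->next = d->buckets[b];
  d->buckets[b] = sym;
}

/* Look NAME up in D using a hash the caller has already computed.  */
struct toy_sym *
toy_dict_find_hashed (const struct toy_dict *d, const char *name,
                      unsigned int hash)
{
  struct toy_sym *s;

  for (s = d->buckets[hash % TOY_NBUCKETS]; s != NULL; s = s->next)
    if (strcmp (s->name, name) == 0)
      return s;
  return NULL;
}

/* Rehashes NAME for every dictionary visited.  */
struct toy_sym *
toy_lookup_rehash (const struct toy_dict *dicts, size_t n, const char *name)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      struct toy_sym *s
        = toy_dict_find_hashed (&dicts[i], name, toy_hash (name));

      if (s != NULL)
        return s;
    }
  return NULL;
}

/* Hashes NAME once and reuses the value for every dictionary.  */
struct toy_sym *
toy_lookup_hash_once (const struct toy_dict *dicts, size_t n,
                      const char *name)
{
  unsigned int hash = toy_hash (name);
  size_t i;

  for (i = 0; i < n; i++)
    {
      struct toy_sym *s = toy_dict_find_hashed (&dicts[i], name, hash);

      if (s != NULL)
        return s;
    }
  return NULL;
}
```

With N dictionaries and a miss in all but the last, the first variant hashes
N times and the second once, which is why more symtabs means a larger saving.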

One worry I have is whether this, while helping dwz, harms something else.
Seems unlikely, but some simple measurements I took make me want to take more.

One thought I have is that significant changes at a higher level will
minimize the impact of this patch.  One change I'm thinking of making
is replacing iterating over every symbol table and then if that fails
going to the index with instead just going straight to the index: the
index knows where the symbols are (you mentioned this as well).
Perhaps not what we want to do for partial syms (though maybe partial
syms could work similarly, haven't gotten that far).  But with an
index it seems clumsy to iterate over all symtabs and then go to the
index.
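
A toy sketch of the difference, under heavy assumptions: the toy_* names
are hypothetical, the "index" is a plain array scanned linearly, and symtab
expansion is ignored entirely.  It only shows that consulting the index
first narrows the search to the one symtab that defines the name.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for illustration only; none of this is GDB's API.  */

struct toy_symtab
{
  const char *const *names;     /* Symbols defined in this symtab.  */
  size_t n_names;
};

struct toy_index_entry
{
  const char *name;
  size_t symtab;                /* Number of the symtab defining NAME.  */
};

size_t toy_symtabs_searched;    /* Counts symtabs actually searched.  */

/* Return nonzero if ST defines NAME.  */
int
toy_symtab_has (const struct toy_symtab *st, const char *name)
{
  size_t i;

  toy_symtabs_searched++;
  for (i = 0; i < st->n_names; i++)
    if (strcmp (st->names[i], name) == 0)
      return 1;
  return 0;
}

/* Walk every symtab in turn.  Return the symtab number, or -1.  */
long
toy_lookup_by_iteration (const struct toy_symtab *tabs, size_t n_tabs,
                         const char *name)
{
  size_t i;

  for (i = 0; i < n_tabs; i++)
    if (toy_symtab_has (&tabs[i], name))
      return (long) i;
  return -1;
}

/* Ask the index which symtab defines NAME and search only that one.
   Return the symtab number, or -1.  */
long
toy_lookup_via_index (const struct toy_symtab *tabs,
                      const struct toy_index_entry *index, size_t n_index,
                      const char *name)
{
  size_t i;

  for (i = 0; i < n_index; i++)
    if (strcmp (index[i].name, name) == 0
        && toy_symtab_has (&tabs[index[i].symtab], name))
      return (long) index[i].symtab;
  return -1;
}
```

A real index would be hashed rather than scanned; the point here is only the
number of symtabs touched per lookup.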

It should be easy enough to hack up a quick experiment to collect some
data.  I'll try to get to it this weekend.

>> what was once one global block to represent the entire objfile is
>> now N.
>
> Without DWZ there are X global blocks for X primary symtabs for X CUs of
> objfile.  With DWZ there are X+Y global blocks for X+Y primary symtabs for
> X+Y CUs where Y are 'DW_TAG_partial_unit's.

Yep.

> For 'DW_TAG_partial_unit's (Ys) their blockvector is usually empty.  But not
> always: I have found typedef symbols there, and IMO there can also be
> optimized-out static variables etc.
>
>
>> [I'm sure the patches help in the non-dwz case, but I suspect it's less.
>> Which isn't to say the patches aren't useful.
>> I just need to play with a few more examples.]
>
> I agree.
>
> [patch 2/2] could needlessly performance-regress non-DWZ cases, therefore
> I have put back original ALL_OBJFILE_PRIMARY_SYMTABS (instead of my
> ALL_OBJFILE_SYMTABS) as it is perfectly sufficient.  For the performance
> testcase of mine:
>
> Benchmark on non-trivial application with    'p <tab><tab>':
> Command execution time:   4.215000 (cpu),   4.241466 (wall) --- both fixes with new [patch 2/2]
> Command execution time:   7.373000 (cpu),   7.395095 (wall) --- both fixes
> Command execution time:  13.572000 (cpu),  13.592689 (wall) --- just lookup_symbol_aux_objfile fix
> Command execution time: 113.036000 (cpu), 113.067995 (wall) --- FSF GDB HEAD
>
> That is additional 1.75x improvement, making the total improvement 26.8x.
>
>
> No regressions on {x86_64,x86_64-m32,i686}-fedora21pre-linux-gnu in standard
> and .gdb_index-enabled runs.  Neither of the patches should cause any visible
> behavior change.
>
>
> Thanks,
> Jan
>
> gdb/
> 2014-10-23  Jan Kratochvil  <jan.kratochvil@redhat.com>
>
>         * symtab.c (lookup_symbol_aux_objfile): Use ALL_OBJFILE_SYMTABS, inline
>         lookup_block_symbol.
>
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index c530d50..da13861 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -1657,15 +1657,25 @@ lookup_symbol_aux_objfile (struct objfile *objfile, int block_index,
>    const struct block *block;
>    struct symtab *s;
>
> +  gdb_assert (block_index == GLOBAL_BLOCK || block_index == STATIC_BLOCK);
> +
>    ALL_OBJFILE_PRIMARY_SYMTABS (objfile, s)
>      {
> +      struct dict_iterator dict_iter;
> +
>        bv = BLOCKVECTOR (s);
>        block = BLOCKVECTOR_BLOCK (bv, block_index);
> -      sym = lookup_block_symbol (block, name, domain);
> -      if (sym)
> +
> +      for (sym = dict_iter_name_first (block->dict, name, &dict_iter);
> +          sym != NULL;
> +          sym = dict_iter_name_next (name, &dict_iter))
>         {
> -         block_found = block;
> -         return fixup_symbol_section (sym, objfile);
> +         if (symbol_matches_domain (SYMBOL_LANGUAGE (sym),
> +                                    SYMBOL_DOMAIN (sym), domain))
> +           {
> +             block_found = block;
> +             return fixup_symbol_section (sym, objfile);
> +           }
>         }
>      }
>
>

This breaks an abstraction boundary; IWBN to preserve it.
[IOW, I look at dict_* as being an implementation detail of blocks.]

If we were to go this route (and apologies for the delay), can you
write a routine like lookup_block_symbol which does the above and call
that here instead?

lookup_block_symbol should live in block.c, not symtab.c.
That's where this new routine should go too.
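
A toy sketch of the layering, not a proposed implementation: all toy_*
names are hypothetical, the dictionary is a flat array, and the domain check
is plain equality rather than symbol_matches_domain.  The point is that the
dict iteration lives in a block-level routine, and the symtab-level caller
sees only blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for illustration only; none of this is GDB's API.  */

enum toy_domain { TOY_VAR_DOMAIN, TOY_STRUCT_DOMAIN };

struct toy_symbol
{
  const char *name;
  enum toy_domain domain;
};

/* "dict" layer: an implementation detail of blocks.  */

struct toy_dict
{
  const struct toy_symbol *syms;
  size_t n_syms;
};

struct toy_dict_iterator
{
  const struct toy_dict *dict;
  size_t next;                  /* Position of the next candidate.  */
};

const struct toy_symbol *
toy_dict_iter_name_next (const char *name, struct toy_dict_iterator *iter)
{
  while (iter->next < iter->dict->n_syms)
    {
      const struct toy_symbol *sym = &iter->dict->syms[iter->next++];

      if (strcmp (sym->name, name) == 0)
        return sym;
    }
  return NULL;
}

const struct toy_symbol *
toy_dict_iter_name_first (const struct toy_dict *dict, const char *name,
                          struct toy_dict_iterator *iter)
{
  iter->dict = dict;
  iter->next = 0;
  return toy_dict_iter_name_next (name, iter);
}

/* "block" layer: the only code that touches the dict.  */

struct toy_block
{
  const struct toy_dict *dict;
};

/* Return the first symbol in BLOCK named NAME in DOMAIN, or NULL.  */
const struct toy_symbol *
toy_block_lookup_symbol (const struct toy_block *block, const char *name,
                         enum toy_domain domain)
{
  struct toy_dict_iterator iter;
  const struct toy_symbol *sym;

  assert (block != NULL && block->dict != NULL);

  for (sym = toy_dict_iter_name_first (block->dict, name, &iter);
       sym != NULL;
       sym = toy_dict_iter_name_next (name, &iter))
    if (sym->domain == domain)
      return sym;
  return NULL;
}

/* "symtab" layer: iterates blocks and never mentions the dict.  */
const struct toy_symbol *
toy_lookup_in_blocks (const struct toy_block *blocks, size_t n_blocks,
                      const char *name, enum toy_domain domain)
{
  size_t i;

  for (i = 0; i < n_blocks; i++)
    {
      const struct toy_symbol *sym
        = toy_block_lookup_symbol (&blocks[i], name, domain);

      if (sym != NULL)
        return sym;
    }
  return NULL;
}
```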


