[RFA] massively speed up "info var foo" on large programs

Doug Evans dje@google.com
Mon Jun 4 04:06:00 GMT 2012


On Fri, Jun 1, 2012 at 12:38 PM, Pedro Alves <palves@redhat.com> wrote:
> On 05/28/2012 05:49 AM, Doug Evans wrote:
>
>> On Fri, May 25, 2012 at 1:50 AM, Pedro Alves <palves@redhat.com> wrote:
>>> On 05/25/2012 09:21 AM, Doug Evans wrote:
>>>
>>>>
>>>> The output is different from the previous code, I didn't take into
>>>> account the symbols that gdb creates for @plt entries.  I think if we
>>>> want to continue to provide the current output, we should add an
>>>> option to "info var|fun|type" to produce it: the normal case shouldn't
>>>> be that slow.
>>>
>>>
>>> Different how?  The patch has no testsuite updates, so the email reader is
>>> left wondering.  ;-)
>>
>> In the "Non-debugging symbols" section of the output, when a symbol
>> would have been found in another objfile, the code would have not
>> printed the non-@plt form of the function name.
>> With this patch we have a decision to make.  Searching all the other
>> objfiles is not reasonable (IMO) so what to do?  I can think of two
>> possibilities: always print it or never print it.  Since the symbol in
>> question is an artificial symbol created by gdb I have opted for never
>> printing it.
>>
>> Thus instead of seeing this in the "Non-debugging symbols" section of
>> the output:
>>
>> 0x1234 foo@plt
>> 0x1234 foo
>>
>> the output will contain:
>>
>> 0x1234 foo@plt
>>
>> Here is v3 of the patch.  I added a testcase.
>> Regression tested on amd64-linux.
>>
>
>
> Took me a while, and a gdb.log diff of the new testcase between
> patched and unpatched gdb to grok this.  I think this output change
> is okay.
>
>> @@ -3463,21 +3505,13 @@ search_symbols (char *regexp, enum searc
>>               || regexec (&datum.preg, SYMBOL_NATURAL_NAME (msymbol), 0,
>>                           NULL, 0) == 0)
>>             {
>> -             if (0 == find_pc_symtab (SYMBOL_VALUE_ADDRESS (msymbol)))
>> -               {
>> -                 /* FIXME: carlton/2003-02-04: Given that the
>> -                    semantics of lookup_symbol keeps on changing
>> -                    slightly, it would be a nice idea if we had a
>> -                    function lookup_symbol_minsym that found the
>> -                    symbol associated to a given minimal symbol (if
>> -                    any).  */
>> -                 if (kind == FUNCTIONS_DOMAIN
>> -                     || lookup_symbol (SYMBOL_LINKAGE_NAME (msymbol),
>> -                                       (struct block *) NULL,
>> -                                       VAR_DOMAIN, 0)
>
> There's a comment above all this that talks about "lookup_symbol" that needs
> updating.

Yeah, already updated in my sandbox.

>> -                     == NULL)
>> -                   found_misc = 1;
>> -               }
>> +             /* Note: An important side-effect of these lookup functions
>> +                is to expand the symbol table if msymbol is found.  */
>
> Okay, it's an important side-effect, but why do we want the side effect
> should IMO be mentioned. It's not obvious to me (a casual reader).

Most of gdb's symbol handling is arguably non-obvious.   1/2 :-)
I've expanded the comment.

>> +             if (kind == FUNCTIONS_DOMAIN
>> +                 ? find_pc_symtab (SYMBOL_VALUE_ADDRESS (msymbol)) == NULL
>> +                 : lookup_msymbol_in_objfile (objfile, msymbol,
>> +                                              VAR_DOMAIN) == NULL)
>
> The "lookup_msymbol_in_objfile" function name threw me off for a few
> minutes, as in, "if we already know the msymbol exists in objfile, wth
> are we looking it up again?  For those important side-effects perhaps?"
> More below.

The comment for the function is pretty unambiguous.

>
>> +               found_misc = 1;
>>             }
>>
>> +/* Wrapper around lookup_symbol_aux_objfile for search_symbols.
>> + Look up MSYMBOL in DOMAIN in the global and static blocks of OBJFILE
>> + and all related objfiles. */
>> +
>> +static struct symbol *
>> +lookup_msymbol_in_objfile (struct objfile *objfile,
>> + struct minimal_symbol *msymbol,
>> + domain_enum domain)
>> +{
>> + const char *name = SYMBOL_LINKAGE_NAME (msymbol);
>
> Ah, MSYMBOL is the input, not the output!  The lookup_symbol_xxx
> functions return a symbol, so I was naively expecting this to return
> an msymbol.  IMO it'd be clearer if it'd be called something like
> lookup_symbol_of_msymbol_in_objfile then.  But, the only usage of MSYMBOL
> in the entire function is extracting its name. This would then be much
> simpler, clearer and reusable IMO if the prototype and name of
> the function was:
>
> static struct symbol *
> lookup_symbol_in_objfile (struct objfile *objfile,
>                          const char *name,
>                          domain_enum domain);
>
> And the caller did the SYMBOL_LINKAGE_NAME (msymbol) thing.

Actually, I had a similarly named function and prototype early on, but
I didn't like it.  GDB's symbol support is a mess (IMO), and I wanted
a name and usage to restrict it to the task at hand.  Anything more
general and I was pretty sure this patch would get bogged down.
Later on I want to revamp the symbol API, but I don't want this patch
tied down by that.
(For example, I don't want to bubble up the semantics of
demangle_for_lookup to the caller of this function.   Here we have an
msymbol, we know we have a mangled name.  If you want, I can go with
lookup_symbol_in_objfile but rename it to
lookup_symbol_in_objfile_from_linkage_name or some such.)



More information about the Gdb-patches mailing list