Bug 12707 - physname regression: set print demangle off
Summary: physname regression: set print demangle off
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: symtab (show other bugs)
Version: HEAD
: P2 normal
Target Milestone: 10.1
Assignee: Tom Tromey
URL:
Keywords:
: 16066 (view as bug list)
Depends on:
Blocks:
 
Reported: 2011-04-27 14:57 UTC by Jan Kratochvil
Modified: 2020-04-24 21:39 UTC (History)
6 users (show)

See Also:
Host:
Target: x86_64-unknown-linux-gnu
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Jan Kratochvil 2011-04-27 14:57:46 UTC
void f () { } int main () { f (); }

gcc -g

(gdb) break f
Breakpoint 1 at 0x4004d8: file .../demangle.C, line 1.
(gdb) run

pre-physname:
(gdb) set print demangle off 
(gdb) frame
#0  _Z1fv () at .../demangle.C:1
    ^^^^^^^^

post-physname:
(gdb) set print demangle off 
(gdb) frame
#0  f () at .../demangle.C:1
    ^^^^

there is a regression by the patch (not verified this specific commit has
regressed it but I guess so):
        42284fdf9d8cdb20c8e833bdbdb2b56977fea525
        http://sourceware.org/ml/gdb-cvs/2010-03/msg00082.html
        dwarf2_physname patchset:
        [RFA] dwarf2_physname FINAL
        http://sourceware.org/ml/gdb-patches/2010-03/msg00220.html
Comment 1 Tom Tromey 2013-01-14 14:31:11 UTC
I looked into this and I'm working on a patch.
The first problem is that the mangled name is no longer stored.
This is pretty easy to fix but then there is some
unrelated fallout.
Comment 2 Tom Tromey 2013-02-01 21:35:11 UTC
The thread starts here:

http://sourceware.org/ml/gdb-patches/2013-01/msg00287.html

Jan is probably correct that we should store 3 forms
for the template symbols in question:

http://sourceware.org/ml/gdb-patches/2013-01/msg00381.html


A more complicated approach is possible, based on
checking the name-without-return-type, and then filtering
the results.  However this seems more difficult to get right.

A third idea would be to introduce a kind of alias symbol
that points to the "canonical" one.  This might help avoid
some data structures issues -- e.g., minsyms assume right
now there can only be 2 names for a symbol, as minsyms
only have 2 "next" pointers.
Comment 3 Tom Tromey 2013-02-08 13:35:01 UTC
Another idea occurred to me - always store just the
mangled form in the symbol, and have a table mapping demangled form(s) to the
mangled form.
Then when printing a symbol, demangle on demand, if needed.
Comment 4 Jan Kratochvil 2013-10-19 18:38:12 UTC
*** Bug 16066 has been marked as a duplicate of this bug. ***
Comment 5 dje 2013-10-19 19:40:04 UTC
Bug 16066 is more about the python API issue.
Let's not drop that part.
Comment 6 Tom Tromey 2013-11-21 19:03:45 UTC
I looked into my idea from comment #3 a bit.

It occurred to me that we could hack the demangler a bit to return
both the full demangled name and a pointer to the part just after the
return type.

Then we could enter both of these strings into a hash table mapping
from the demangled forms back to the mangled name.

Symbol lookup would proceed by first looking up a name in this map,
and, if found, using the mangled form as the search key.

This would let us remove the demangled hash entirely from minimal
symbols.

To get the reverse direction to be efficient we could make a second
hash table, mapping the mangled form to a canonical demangled form.


It all seems doable, maybe not even too hard.  And it has some
benefits for users.  However I hesitated to follow through because I
am concerned it might be taking the symbol tables in the wrong
direction.
Comment 7 Tom Tromey 2014-06-13 19:34:57 UTC
(In reply to Tom Tromey from comment #6)
> I looked into my idea from comment #3 a bit.

> Symbol lookup would proceed by first looking up a name in this map,
> and, if found, using the mangled form as the search key.

For completion we can iterate over the names in the hash table.

> This would let us remove the demangled hash entirely from minimal
> symbols.
> 
> To get the reverse direction to be efficient we could make a second
> hash table, mapping the mangled form to a canonical demangled form.

If we needed this direction (not clear) then we could just re-run
the demangler anyway.  Or we can do like the current code, where the
name in the symtab actually points to an object in the hash.

> It all seems doable, maybe not even too hard.  And it has some
> benefits for users.  However I hesitated to follow through because I
> am concerned it might be taking the symbol tables in the wrong
> direction.

I'm not as concerned about this any more.

One wrinkle seems to be that we'd need a name canonicalizer for every
language.  Of course we ought to have this anyway ... the whole idea
is really about changing the representation of the objects, not really
about changing anything fundamental.  Any problems exposed are problems
already, I think.
Comment 8 dje 2014-11-15 00:07:51 UTC
Data point, from PR 17604.
In my monster benchmark, 12 of 13 seconds of gdb startup time (the time to get to the prompt) is due to reading ELF symbols.
If I comment out the generation and storage of demangled copies of ELF symbols, then startup time reduces to 2.5 seconds.
This is enough to make me seriously want to implement something along these lines.
Comment 10 Tom Tromey 2020-03-25 19:24:42 UTC
I have another try at this, much simpler than what's outlined above.
Comment 11 Sourceware Commits 2020-04-24 21:36:03 UTC
The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=bcfe6157ca288efed127c5efe21ad7924e0d98cf

commit bcfe6157ca288efed127c5efe21ad7924e0d98cf
Author: Tom Tromey <tom@tromey.com>
Date:   Fri Apr 24 15:35:01 2020 -0600

    Use the linkage name if it exists
    
    The DWARF reader has had some odd code since the "physname" patches landed.
    
    In particular, these patches caused PR symtab/12707; namely, they made
    it so "set print demangle off" no longer works.
    
    This patch attempts to fix the problem.  It arranges to store the
    linkage name on the symbol if it exists, and it changes the DWARF
    reader so that the demangled name is no longer (usually) stored in the
    symbol's "linkage name" field.
    
    c-linkage-name.exp needed a tweak, because it started working
    correctly.  This conforms to what I think ought to happen, so this
    seems like an improvement here.
    
    compile-object-load.c needed a small change to use
    symbol_matches_search_name rather than directly examining the linkage
    name.  Looking directly at the name does the wrong thing for C++.
    
    There is still some name-related confusion in the DWARF reader:
    
    * "physname" often refers to the logical name and not what I would
      consider to be the "physical" name;
    
    * dwarf2_full_name, dwarf2_name, and dwarf2_physname all exist and
      return different strings -- but this seems like at least one name
      too many.  For example, Fortran requires dwarf2_full_name, but other
      languages do not.
    
    * To my surprise, dwarf2_physname prefers the form emitted by the
      demangler over the one that it computes.  This seems backward to me,
      given that the partial symbol reader prefers the opposite, and it
      seems to me that this choice may perform better as well.
    
    I didn't attempt to clean up these things.  It would be good to do,
    but whenever I contemplate it I get caught up in dreams of truly
    rewriting the DWARF reader instead.
    
    gdb/ChangeLog
    2020-04-24  Tom Tromey  <tom@tromey.com>
    
            PR symtab/12707:
            * dwarf2/read.c (add_partial_symbol): Use the linkage name if it
            exists.
            (new_symbol): Likewise.
            * compile/compile-object-load.c (get_out_value_type): Use
            symbol_matches_search_name.
    
    gdb/testsuite/ChangeLog
    2020-04-24  Tom Tromey  <tom@tromey.com>
    
            PR symtab/12707:
            * gdb.python/py-symbol.exp: Update expected results for
            linkage_name test.
            * gdb.cp/print-demangle.exp: New file.
            * gdb.base/c-linkage-name.exp: Fix test.
            * gdb.guile/scm-symbol.exp: Update expected results for
            linkage_name test.
Comment 12 Tom Tromey 2020-04-24 21:39:50 UTC
Fixed.