Bug 12707 - physname regression: set print demangle off
Summary: physname regression: set print demangle off
Status: NEW
Alias: None
Product: gdb
Classification: Unclassified
Component: symtab (show other bugs)
Version: HEAD
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
: 16066 (view as bug list)
Depends on:
Blocks:
 
Reported: 2011-04-27 14:57 IST by Jan Kratochvil
Modified: 2014-11-15 00:07 IST (History)
5 users (show)

See Also:
Host:
Target: x86_64-unknown-linux-gnu
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Jan Kratochvil 2011-04-27 14:57:46 IST
void f () { } int main () { f (); }

gcc -g

(gdb) break f
Breakpoint 1 at 0x4004d8: file .../demangle.C, line 1.
(gdb) run

pre-physname:
(gdb) set print demangle off 
(gdb) frame
#0  _Z1fv () at .../demangle.C:1
    ^^^^^^^^

post-physname:
(gdb) set print demangle off 
(gdb) frame
#0  f () at .../demangle.C:1
    ^^^^

there is a regression by the patch (not verified this specific commit has
regressed it but I guess so):
        42284fdf9d8cdb20c8e833bdbdb2b56977fea525
        http://sourceware.org/ml/gdb-cvs/2010-03/msg00082.html
        dwarf2_physname patchset:
        [RFA] dwarf2_physname FINAL
        http://sourceware.org/ml/gdb-patches/2010-03/msg00220.html
Comment 1 Tom Tromey 2013-01-14 14:31:11 IST
I looked into this and I'm working on a patch.
The first problem is that the mangled name is no longer stored.
This is pretty easy to fix but then there is some
unrelated fallout.
Comment 2 Tom Tromey 2013-02-01 21:35:11 IST
The thread starts here:

http://sourceware.org/ml/gdb-patches/2013-01/msg00287.html

Jan is probably correct that we should store 3 forms
for the template symbols in question:

http://sourceware.org/ml/gdb-patches/2013-01/msg00381.html


A more complicated approach is possible, based on
checking the name-without-return-type, and then filtering
the results.  However this seems more difficult to get right.

A third idea would be to introduce a kind of alias symbol
that points to the "canonical" one.  This might help avoid
some data structures issues -- e.g., minsyms assume right
now there can only be 2 names for a symbol, as minsyms
only have 2 "next" pointers.
Comment 3 Tom Tromey 2013-02-08 13:35:01 IST
Another idea occurred to me - always store just the
mangled form in the symbol, and have a table mapping demangled form(s) to the
mangled form.
Then when printing a symbol, demangle on demand, if needed.
Comment 4 Jan Kratochvil 2013-10-19 18:38:12 IST
*** Bug 16066 has been marked as a duplicate of this bug. ***
Comment 5 dje 2013-10-19 19:40:04 IST
Bug 16066 is more about the python API issue.
Let's not drop that part.
Comment 6 Tom Tromey 2013-11-21 19:03:45 IST
I looked into my idea from comment #3 a bit.

It occurred to me that we could hack the demangler a bit to return
both the full demangled name and a pointer to the part just after the
return type.

Then we could enter both of these strings into a hash table mapping
from the demangled forms back to the mangled name.

Symbol lookup would proceed by first looking up a name in this map,
and, if found, using the mangled form as the search key.

This would let us remove the demangled hash entirely from minimal
symbols.

To get the reverse direction to be efficient we could make a second
hash table, mapping the mangled form to a canonical demangled form.


It all seems doable, maybe not even too hard.  And it has some
benefits for users.  However I hesitated to follow through because I
am concerned it might be taking the symbol tables in the wrong
direction.
Comment 7 Tom Tromey 2014-06-13 19:34:57 IST
(In reply to Tom Tromey from comment #6)
> I looked into my idea from comment #3 a bit.

> Symbol lookup would proceed by first looking up a name in this map,
> and, if found, using the mangled form as the search key.

For completion we can iterate over the names in the hash table.

> This would let us remove the demangled hash entirely from minimal
> symbols.
> 
> To get the reverse direction to be efficient we could make a second
> hash table, mapping the mangled form to a canonical demangled form.

If we needed this direction (not clear) then we could just re-run
the demangler anyway.  Or we can do like the current code, where the
name in the symtab actually points to an object in the hash.

> It all seems doable, maybe not even too hard.  And it has some
> benefits for users.  However I hesitated to follow through because I
> am concerned it might be taking the symbol tables in the wrong
> direction.

I'm not as concerned about this any more.

One wrinkle seems to be that we'd need a name canonicalizer for every
language.  Of course we ought to have this anyway ... the whole idea
is really about changing the representation of the objects, not really
about changing anything fundamental.  Any problems exposed are problems
already, I think.
Comment 8 dje 2014-11-15 00:07:51 IST
Data point, from PR 17604.
In my monster benchmark, 12 of 13 seconds of gdb startup time (the time to get to the prompt) is due to reading ELF symbols.
If I comment out the generation and storage of demangled copies of ELF symbols, then startup time reduces to 2.5 seconds.
This is enough to make me seriously want to implement something along these lines.