This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Regression handling linkage vs source name


On 02/22/2012 08:40 AM, Doug Evans wrote:
On Tue, Feb 21, 2012 at 4:41 PM, Michael Eager<eager@eagercon.com> wrote:
There were changes to dwarf2_physname() around July 1, 2011, which
cause it to return a different value.  Arguably, the changes make
the function work better, returning the linkage name for a symbol where
it previously returned the source name.

Tangential data point: Outside of dwarf2read.c, gdb generally uses "physname" to mean linkage name.

It doesn't appear to be the case for the value returned by dwarf2_compute_name(). I haven't looked at other places.

I'm not sure dwarf2_physname ever returned the linkage name (for c/c++).
I know there's been some changes in how it computes the source name.

Before the patch, it didn't for C. I didn't look at C++.


But since the source name
is overwritten by the linkage name in the symbol entry, gdb's
behavior is different, and, IMO, incorrect.

Can you elaborate on what you mean by overwritten?

In new_symbol_full(): /* Cache this symbol's name and the name's demangled form (if any). */ SYMBOL_SET_LANGUAGE (sym, cu->language); linkagename = dwarf2_physname (name, die, cu); SYMBOL_SET_NAMES (sym, linkagename, strlen (linkagename), 0, objfile);

Clearly the comment is incorrect or misleading:  only one name is stored in
the symbol.

Before the patch, dwarf2_physname() would call dwarf2_compute_name() which
would return 'name' unmodified.

After the patch, dwarf2_physname() returns the linkage name or a mangled
name.

(I initially misread the code to save the source name in the symbol,
then overwrite it with the linkage name.)

[It's not what I see.]

Here is the test case:

$ cat t.c
extern int foo(void) asm ("xfoo");

int foo(void) {}

int main (void)
{
  foo ();
}

$ gdb-before a.out
...
(gdb) b foo
Breakpoint 1 at ...

$ gdb-after a.out
...
(gdb) b foo
Make breakpoint pending on future shared library load? (y or [n])? n
(gdb) b xfoo
Breakpoint 1 at ...


The symbol "foo" is no longer present in gdb symbol table so trying to use the name specified in the source fails. Before the change, breakpoint, backtrace, and other commands display the symbol name as in the source (and in the DWARF DIE). After the change, the linkage name is displayed instead of the source name.

Even pre-dwarf2_physname gdb's have this problem. Maybe you tried a 7.x variant that I didn't (I think I tried 6.8, 7.0, 7.1, and 7.4 - at the time I didn't have easy access to 7.2,7.3 and didn't want to go to trouble of building them).

This works correctly in 7.2. Maybe this was an "accident". :-)


The problem (or at least one of the problems), as I see it, is that
the API for creating symbols is broken.

I agree.


gdb does (or at least can) record both the linkage and source names
(it does this for "minsyms", e.g. ELF symbols).  But the way to set a
symbol's name is via symbol_set_names and it only takes a linkage name
(though the dwarf2read.c further compounds the problem by passing it
the source name - or more accurately the "search" name).
symbol_set_names then tries to demangle the name and will record that
name as well if the demangling succeeds.

Well, the minisym only stores a single name, not both linkage and source names. Despite what the comments preceding symbol_set_names() says, there's only one name saved.

The code in symbol_set_names() seems to be doing way more than it needs
to.  If both the source and linkage names are available, then there
is no need to mangle, demangle, add a prefix or do anything but save
the name.

(symbol_set_demangled_name() does save a demangled name for C++.)

Before the change, dwarf2_physname() calls dwarf2_compute_name()
which returns the symbol name if the language is not C++, Java,
or Fortran, even if the DIE has a DW_AT_linkage_name attribute.
After the change, dwarf2_physname() returns the DW_AT_linkage_name.

Since gdb doesn't keep both the source name and the linkage
name in the symbol table (a design flaw, IMO) what is the
right way to get the previous behavior, where gdb recognizes
the symbol names from the source file?

We need to have dwarf2read.c store both names (linkage name and dwarf name). [More changes may be required beyond that, but I think that's a start.] Your test program shows that we can't assume we can simply demangle the linkage name to get the source name.

Yes, this was my conclusion as well.


When debug info is not available, mangling a source name or demangling
a linkage name may be necessary.  I don't know that this should be done
when reading the DWARF info.  When the debug info contains both names,
or the linkage name can be inferred from the DIE, doing anything
other than saving their values should not be necessary.

Understanding what the symbol routines are trying to do with all of
the mangling and demangling is a challenge.   One thing which is not
clear is whether there is agreement whether the name stored for a symbol
is supposed to be the source name or the linkage name.  symbol_set_names()
and symbol_set_demangled_name() suggest that symtab.c thinks it is the
linkage name.  Commands which look up a symbol assume that it is the
source name.

Should there be two symbol entries, one for "foo" and one for "xfoo"?
A single entry which is searched for a match on either name or linkage
name?  Something else?

--
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]