The future of dwarf2_physname

Jan Kratochvil jan.kratochvil@redhat.com
Mon May 23 13:17:00 GMT 2011


On Thu, 19 May 2011 00:34:43 +0200, Keith Seitz wrote:
> I guess there are basically two options:
> 1) Keep physname & apply the 12266/12506 patchset (when approved)
> 2) Adopt Jan's patch and fix the fallout now
> 
> My take on this today is that I think the least risky/most timely
> solution is #1.

#1 is still IMO a lot of work (see "Computed physname" errors at the end of
this mail) and #1 should always be backed by #2 anyway and as #2 with current
GCCs has the same user experience as #1 I do not see a reason why to wait on #1
for gdb-7.3.

I have concerns about regressions for gdb-7.3, I would like to fix regressions
gdb-7.1 (pre-physname) -> gdb-7.2 (physname) but possibly not introduce
regressions gdb-7.2 (physname) -> gdb-7.3 (?).  Trying to propose a simple
DW_AT_linkage_name patch of mine but I do not yet have it finished.


problem #1 of post-physname GDB

(gdb) p 'f<char>()' 
$1 = {long (void)} 0x4004e4 <f<char>()>
(gdb) p 'long f<char>()' 
$2 = {<text variable, no debug info>} 0x4004e4 <f<char>()>
(gdb) p 'f<short>()' 
No symbol "f<short>()" in current context.
(gdb) p 'long f<short>()' 
$3 = {<text variable, no debug info>} 0x4004fb <long f<short>()>

==> dis.h <==
template <typename T>
long f () { return 1; }
==> dis1.C <==
#include "dis.h"
int main () { f<char> (); }
==> dis2.C <==
#include "dis.h"
void g () { f<short> (); }

The two different aliases for 0x4004e4 where the $2 one does not have any type
is a bug, at least both should have the same type.  It can be seen also on the
f<short>() case ($3) where the return type missing alias is not present because
only ELF (and not DWARF) symbol is present.


I haven't found how to reproduce the regression
"Cannot take address of method m." for function templates (which use the
return type in their linkage) as I haven't found how to reference function
template externally from a class - this field is missing there in such case:
class C { public:
  template <typename T>
  static long f ();
};
And I get
	error: explicit specialization in non-namespace scope ‘class C’
for
class C { public:
  template <typename T>
  static long f ();
  template <>
  static long f<char> ();
};
And in other cases there is alread the template function definition with
associated DWARF which gets processed by GDB-physname and the return type gets
stripped.


problem #2 of post-physname GDB

(gdb) p 'int f<long>()' 
$1 = {<text variable, no debug info>} 0x40050f <f<long>()>
(gdb) p 'short f<long>()' 
$2 = {<text variable, no debug info>} 0x400531 <f<long>()>
(gdb) p 'f<long>()' 
$3 = {int (void)} 0x40050f <f<long>()>

==> manya.C <==
template <typename T>
int f () { return 4; }
int a () { return f<long> () == 4; }
extern int b ();
int main () { return a () && b () ? 0 : 1; }
==> manyb.C <==
template <typename T>
short f () { return 2; }
int b () { return f<long> () == 2; }

As the return type is stripped the name can become ambiguous.  Different
template functions with same name, same parameter types but different return
type can be linked in a single program.  It is true one cannot declare both
such template functions in a single CU (Compilation Unit) so I do not expect
such case is found in real world.


I was trying to propose DMGL_RET_POSTFIX but it has other drawbacks and it is
in fact a new feature (and not a regression fixup) because the trailing type
was not present in gdb-7.2 anyway.  So it can be considered after some tidy up
for gdb-7.4 but it makes no sense for gdb-7.3.
	Re: [rfc] physname cross-check #2
	http://sourceware.org/ml/gdb-patches/2011-05/msg00466.html
Therefore it would apply for both ELF and both DWARF symbols producing the same
form.  For ELF symbols implemented by (patched) DMGL_RET_POSTFIX and for DWARF
symbols missing DW_AT_linkage_name implemented by GDB-physname code.

That is the 'int f<long>()' and 'short f<long>()' would be uniquely searchable
as `f<long>()int' and `f<long>()short'.  Still `nm -C', `objdump -C' normally
do not (and even cannot) request DMGL_RET_POSTFIX therefore to be able to
`break symbol_that_I_see_in_objdump_C' these DMGL_RET_POSTFIX-generated
symbols should be only as aliases to also available non-DMGL_RET_POSTFIX
symbols.


Then there is the code for NAME_KIND_PRINT (with typedefs etc. preserved,
different from NAME_KIND_PHYS with typedefs etc. resolved).  That may be a nice
new feature for gdb-7.4 but as such I do not find it relevant for gdb-7.3.
cp_canonicalize_string_no_typedefs in decode_variable is also a nice feature
(`break func(typedef)') but it is a new feature not present anytime before,
therefore for gdb-7.4.

Also the current NAME_KIND_PRINT behavior does not seem clearly the right one:
	typedef int t; void f(t p) {} int main () {}
(gdb) p f
$1 = {void (t)} 0x4004d4 <f(t)>
 - correct, in this CU `f' has the NAME_KIND_PRINT `t'.
(gdb) info sym 0x4004d4
f(int) in section .text
 - correct - `info symbol' uses only minimal=ELF symbols
(gdb) complete p 'f(
p 'f(int)
p 'f(t)
 - I do not find it clear this is right.  There should be definitely the
   `f(int)' - demangled linkage name - variant.  But whether to show the
   "convenient" aliases as SYMBOL_SEARCH_NAME I do not find clear, it may be
   better but it also complicates the list of all the available overloads for
   a function name, one should ask practical C++ programmer for it IMO.
   After all there is still only a single function `f':
(gdb) p f(t)
$2 = {void (t)} 0x4004d4 <f(t)>
(gdb) p f(int)
$3 = {void (t)} 0x4004d4 <f(t)>


Regarding the proposals to fix dwarf2_physname computation instead of
preferring DW_AT_linkage_name.  ISTM we have never found the agreement with
Keith, when DW_AT_linkage_name is present it is guaranteed to be correct and
I do not know why not to use it.  It is a good verification way to fix all the
existing dwarf2_physname computation issues so that in the future one may stop
shipping the redundant size-costly (3-10% of .debug files) DW_AT_linkage_name
strings.  But still even in future I do not see why GDB should not prefer
DW_AT_linkage_name over the dwarf2_physname computated name, when it is
present.

Preferring dwarf2_physname computation over DW_AT_linkage_name - when they
differ - means a failure to match the ELF demangled symbols leading to the
error "Cannot take address of method m." and/or missing function types etc.
So that way I find as always wrong.

Keith proposes to fix dwarf2_physname instead of using DW_AT_linkage_name.
If it really always matches sure there is no difference although (a) it still
may take some time for gdb-7.3 and (b) even in the future I do not see where
is any advantage over preferring DW_AT_linkage_name when it is available.

Bugs to fix for /usr/lib/debug/usr/lib64/libwebkitgtk-1.0.so.0.7.0.debug
from webkitgtk-debuginfo-1.4.0-1.fc15.x86_64 - it is definitely not
a complete bug list, raw messages list has 56490 lines:
http://people.redhat.com/jkratoch/webkitgtk-debuginfo-1.4.0-1.fc15.x86_64-gdb.err.xz
Computed physname <JSC::JSCell::markChildren(struct JSC::MarkStack &)> does not match demangled <JSC::JSCell::markChildren(JSC::MarkStack&)> (from linkage <_ZN3JSC6JSCell12markChildrenERNS_9MarkStackE>)
 - you reported the `struct' bug is already fixed in a patch being prepared
Computed physname <WebCore::lineCapName(enum WebCore::LineCap)> does not match demangled <WebCore::lineCapName(WebCore::LineCap)> (from linkage <_ZN7WebCore11lineCapNameENS_7LineCapE>)
 - IIRC you reported the `enum' bug is also already fixed
Computed physname <WTF::FixedArray<double, 8u>::data()> does not match demangled <WTF::FixedArray<double, 8ul>::data()> (from linkage <_ZN3WTF10FixedArrayIdLm8EE4dataEv>)
 - FYI its DW_TAG_class_type DW_AT_name is: FixedArray<double, 8u>
Computed physname <WTF::PtrHash<void const*>::hash(void*)> does not match demangled <WTF::PtrHash<void const*>::hash(void const*)> (from linkage <_ZN3WTF7PtrHashIPKvE4hashES2_>)
 - const issue
Computed physname <WebCore::RangeBoundaryPoint::offset()> does not match demangled <WebCore::RangeBoundaryPoint::offset() const> (from linkage <_ZNK7WebCore18RangeBoundaryPoint6offsetEv>)
 - maybe different const issue
Computed physname <WebCore::IntPoint::operator GdkPoint() const> does not match demangled <WebCore::IntPoint::operator _GdkPoint() const> (from linkage <_ZNK7WebCore8IntPointcv9_GdkPointEv>)
 - some leading _
Computed physname <JSC::JSParser::parseBreakStatement<JSC::ASTBuilder>(JSC::ASTBuilder&)> does not match demangled <JSC::ASTBuilder::Statement JSC::JSParser::parseBreakStatement<JSC::ASTBuilder>(JSC::ASTBuilder&)> (from linkage <_ZN3JSC8JSParser19parseBreakStatementINS_10ASTBuilderEEENT_9StatementERS3_>)
 - missing return type
etc.


After all I still do not have a patch at hand but I gave some thoughts of mine
on it.


Thanks,
Jan



More information about the Gdb-patches mailing list