This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.



[rfc] handle (de)mangled names consistently


Right now, GDB's handling of mangled and demangled names is
inconsistent, and it's causing problems.

Example: Some symbol-manipulation functions take only a demangled
name, some take either a demangled name or a mangled name, and some
only take a mangled name.  It's very hard to tell which is which.

Example: Currently, GDB has a bug involving C++ name lookup that
arose when a function that previously accepted either demangled names
or mangled names was changed to only accept demangled names.  The bug
has been around for a while, but I certainly don't blame whoever
introduced it: the flow of control that gives rise to that bug is
buried within a series of function calls that gradually shift from
accepting mangled names to not accepting mangled names, with no clear
indication as to what's appropriate.  (This bit of code has also
caused HP problems in the past.)

Example: It's quite frequent for code to want to get at the demangled
name of a symbol, if there is one; this takes a little bit of work,
yet GDB has no macro or function to perform this work for you!

Example: Just last week, Daniel wanted to fix lookup_minimal_symbol to
handle one important case much more quickly.  Elena rightly complained
that his patch made the flow of control of that function still more
confusing; if the function didn't try to handle both mangled and
demangled names at once, then the whole stupid

        for (pass = 1; pass <= 2 && found_symbol == NULL; pass++)

complication could go away, making the flow of control at least a
little bit simpler.

Example: testsuite/gdb.c++/psmang.exp.  Read it and weep.  Or boggle.
Or something.


Basically, right now it's not clear what assumptions functions are
making.  This is a maintenance nightmare: it's impossible to clean up
functions that take names of symbols, because you can't be sure if
their callers depend on the current behavior of those functions when
passed demangled names, mangled names, or both.

For purposes of this message, 'symbol' means 'symbol or minimal
symbol'.  If partial symbols get demangled (which may or may not
happen soon: Daniel's working on a patch that does it quite
efficiently), then it also means '... or partial symbol'; otherwise,
part of this plan will include making it clear that callers to partial
symbol functions don't use demangled names, only mangled names.


Here is what I propose to do about this mess:

1) Clean up the accessors for the various sorts of symbol names.

2) Make each lookup function accept either mangled names or demangled
   names, but not both.  Make it explicit which kind it accepts.

3) Write symbol-initialization functions which make it as easy as
   possible to initialize the mangled name and demangled name
   simultaneously.
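
To make points 2 and 3 concrete, here is a rough sketch of the shape I
have in mind.  Everything in it is hypothetical: 'struct toy_symbol',
the 'toy_*' functions, and the stand-in demangler are illustration
only, not GDB's real data structures or API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy symbol; NOT GDB's real struct symbol.  */
struct toy_symbol
{
  const char *mangled;    /* Name as the linker sees it.  */
  const char *demangled;  /* Source-level name, or NULL if there is none.  */
};

/* Hypothetical demangler callback: returns the demangled form of
   MANGLED, or NULL if MANGLED is not a mangled name.  */
typedef const char *(*toy_demangler) (const char *mangled);

/* Stand-in demangler that knows exactly one name.  A real one would
   call into an actual demangling library.  */
const char *
toy_fake_demangler (const char *mangled)
{
  assert (mangled != NULL);
  return strcmp (mangled, "_Z3foov") == 0 ? "foo()" : NULL;
}

/* Point 3: set both names in one call, so they can't get out of sync.  */
void
toy_symbol_init (struct toy_symbol *sym, const char *mangled,
                 toy_demangler demangle_fn)
{
  assert (sym != NULL && mangled != NULL);
  sym->mangled = mangled;
  sym->demangled = demangle_fn != NULL ? demangle_fn (mangled) : NULL;
}

/* Point 2: this function matches mangled names only.  */
const struct toy_symbol *
toy_lookup_mangled (const struct toy_symbol *table, size_t n,
                    const char *mangled)
{
  assert (table != NULL && mangled != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (table[i].mangled, mangled) == 0)
      return &table[i];
  return NULL;
}

/* Point 2: this function matches demangled names only.  A symbol with
   no demangled form is matched by its one and only name.  */
const struct toy_symbol *
toy_lookup_demangled (const struct toy_symbol *table, size_t n,
                      const char *demangled)
{
  assert (table != NULL && demangled != NULL);
  for (size_t i = 0; i < n; i++)
    {
      const char *name = (table[i].demangled != NULL
                          ? table[i].demangled : table[i].mangled);
      if (strcmp (name, demangled) == 0)
        return &table[i];
    }
  return NULL;
}
```

The point is that each lookup function makes a single pass and says in
its name which kind of name it expects; a caller holding the wrong
kind of name gets NULL rather than an accidental match.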


Here are a few more random thoughts:

I want to avoid the terms "mangled name" and "demangled name", because
they don't make a lot of sense when applied to symbols in languages
that don't mangle names.  So, for now, let me refer to "natural names"
and "linkage names": for C++, the natural name is the demangled name
and the linkage name is the mangled name; for other languages, the
natural name and the linkage name are equal.  Probably the norm will be
for functions to accept natural names rather than linkage names;
functions that accept linkage names should feel free to only look for
symbols that correspond to actual objects.  (Global/static variables
and functions, basically.)

One first question is: what should the accessors be called?  Right
now, there are the following macros in symtab.h:

SYMBOL_NAME: returns the linkage name, more or less.

SYMBOL_DEMANGLED_NAME: returns the demangled name, if there is
one (i.e. if the natural name isn't the linkage name); otherwise,
returns NULL.  There are also variants of this that only work for
specific languages.

SYMBOL_SOURCE_NAME: returns the natural name, unless 'demangle' isn't
set, in which case it returns the linkage name.  This should only be
used in output routines, though currently it is occasionally used in
other circumstances.

SYMBOL_LINKAGE_NAME: same as SYMBOL_SOURCE_NAME, except that it tests
both 'demangle' and 'asm_demangle'.  Only used in two places in one
function.
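
For reference, the behavior of those four macros as described above
can be modeled roughly as follows.  This is a sketch written as plain
functions over a made-up 'struct old_symbol'; it is not the text of
symtab.h.  The 'old_demangle' and 'old_asm_demangle' variables are
stand-ins for the 'demangle' and 'asm_demangle' settings, and I'm
assuming that "tests both" means both must be set.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the 'demangle' and 'asm_demangle' settings.  */
int old_demangle = 1;
int old_asm_demangle = 0;

/* Hypothetical toy symbol; NOT GDB's real struct symbol.  */
struct old_symbol
{
  const char *linkage_name;    /* Always present.  */
  const char *demangled_name;  /* NULL if natural name == linkage name.  */
};

/* Models SYMBOL_NAME: the linkage name, more or less.  */
const char *
old_symbol_name (const struct old_symbol *sym)
{
  assert (sym != NULL);
  return sym->linkage_name;
}

/* Models SYMBOL_DEMANGLED_NAME: the demangled name, or NULL.  */
const char *
old_symbol_demangled_name (const struct old_symbol *sym)
{
  assert (sym != NULL);
  return sym->demangled_name;
}

/* Models SYMBOL_SOURCE_NAME: the natural name if 'demangle' is set,
   otherwise the linkage name.  */
const char *
old_symbol_source_name (const struct old_symbol *sym)
{
  const char *d = old_symbol_demangled_name (sym);
  return (old_demangle && d != NULL) ? d : old_symbol_name (sym);
}

/* Models the current SYMBOL_LINKAGE_NAME: like the above, but
   (by assumption) requires both settings to be on.  */
const char *
old_symbol_linkage_name (const struct old_symbol *sym)
{
  const char *d = old_symbol_demangled_name (sym);
  return (old_demangle && old_asm_demangle && d != NULL)
         ? d : old_symbol_name (sym);
}
```

Notice that none of these gives you "the natural name, regardless of
settings" in one step, which is exactly the accessor that lookup code
wants.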


What makes sense to me is:

1) Get rid of the current SYMBOL_LINKAGE_NAME entirely.  It's not used
   enough to justify existing as a macro.

2) Rename SYMBOL_SOURCE_NAME to SYMBOL_PRINT_NAME, to make it clear
   that it's only appropriate to use for output.

3) Add a new macro SYMBOL_LINKAGE_NAME that has the same definition as
   the current SYMBOL_NAME.  If you use SYMBOL_LINKAGE_NAME, you
   really mean "I want the linkage name" as opposed to "I haven't
   really thought about whether I want the linkage name or the natural
   name".  Given how little the current SYMBOL_LINKAGE_NAME is used, I
   don't think that anybody will be confused by adding a macro with
   the same name but with a different definition.

4) Create a new macro that gives the natural name.  The first options
   that come to mind are SYMBOL_NAME and SYMBOL_SOURCE_NAME, but both
   of those would be gratuitously confusing.  Other possibilities:
   SYMBOL_NATURAL_NAME (that's what I've been calling it in this
   e-mail, after all), or SYMBOL_SEARCH_NAME.

5) Examine all uses of SYMBOL_PRINT_NAME to see if they should be
   replaced by SYMBOL_NATURAL_NAME.

6) Eventually, try to go through uses of SYMBOL_NAME and replace them
   by SYMBOL_LINKAGE_NAME or SYMBOL_NATURAL_NAME.  

Once that's done, the accessors will be cleaned up.
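
As a rough model of where that leaves us, here are the three proposed
accessors written as plain functions.  Again, this is only a sketch:
'struct new_symbol' and 'new_demangle' are made up for illustration,
the real things would be macros in symtab.h, and SYMBOL_NATURAL_NAME
is just one of the candidate names from step 4.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the 'demangle' setting.  */
int new_demangle = 1;

/* Hypothetical toy symbol; NOT GDB's real struct symbol.  */
struct new_symbol
{
  const char *linkage_name;    /* Always present.  */
  const char *demangled_name;  /* NULL if natural name == linkage name.  */
};

/* Models the proposed SYMBOL_LINKAGE_NAME: always the linkage name.  */
const char *
new_symbol_linkage_name (const struct new_symbol *sym)
{
  assert (sym != NULL);
  return sym->linkage_name;
}

/* Models the proposed SYMBOL_NATURAL_NAME: the demangled name if
   there is one, otherwise the linkage name.  Ignores user settings,
   so it is safe to use for lookups.  */
const char *
new_symbol_natural_name (const struct new_symbol *sym)
{
  assert (sym != NULL);
  return sym->demangled_name != NULL
         ? sym->demangled_name : sym->linkage_name;
}

/* Models the proposed SYMBOL_PRINT_NAME: what to show the user.
   Depends on the 'demangle' setting, so use it for output only.  */
const char *
new_symbol_print_name (const struct new_symbol *sym)
{
  return new_demangle
         ? new_symbol_natural_name (sym)
         : new_symbol_linkage_name (sym);
}
```

The design point is that only the print accessor looks at user
settings; the other two always return the same answer for a given
symbol.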

I have thoughts on how to go about the other issues, too, but that's
enough for now.  I've implemented most of this on a branch, so I'm
sure that it's feasible.

Comments?  Suggestions for better names for the macros?

David Carlton
carlton@math.stanford.edu

