Bug 16253 - Cannot print an enum var with the same name as tag
Summary: Cannot print an enum var with the same name as tag
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: c++ (show other bugs)
Version: HEAD
: P2 normal
Target Milestone: 7.8
Assignee: Keith Seitz
URL:
Keywords:
: 7737 (view as bug list)
Depends on:
Blocks: 16106
  Show dependency treegraph
 
Reported: 2013-11-26 14:24 IST by Marek Polacek
Modified: 2016-05-16 18:57 IST (History)
9 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Marek Polacek 2013-11-26 14:24:30 IST
On:
enum e { A, B } e;
int main (void) { }

I can't print the value of 'e':
$ g++ -g z.C; gdb -q -batch -ex 'sta' -ex 'p e' ./a.out

I have GNU gdb (GDB) 7.6.50.20131125-cvs and gcc version 4.9.0 20131126 (experimental) (GCC).

Note that with cc1 rather than cc1plus this works fine:
2	int main (void) { }
$1 = A
Comment 1 Marek Polacek 2013-11-26 14:27:17 IST
Forgot to mention that with cc1plus I get 'Attempt to use a type name as an expression' rather than '$1 = A'.
Comment 2 Keith Seitz 2013-11-27 18:52:40 IST
This is happening because of the function symbol_matches_domain, which was introduced by:

commit 5eeb2539423ce5e7241bce403f48b8babb3670b0
Author: Aleksandar Ristovski <aristovski@qnx.com>
Date:   Mon May 5 14:37:32 2008 +0000

        * ada-lang.c: Update throughout to use symbol_matches_domain
        instead of matching the symbol domain explictly.
        * dwarf2read.c (add_partial_symbol): Do not add new psym for
        STRUCT_DOMAIN. Make sure you recognize c++ struct and java and ada
        class as typedefs. See lookup_partial_symbol function.
        (new_symbol): Similar to add_partial_symbol, do not create
        symbol for the typedef. See lookup_block_symbol.
        * symtab.c (symbol_matches_domain): New function, takes care
        of dual meaning of STRUCT_DOMAIN symbol for c++, ada and java.
        (lookup_partial_symbol): Use symbol_matches_domain to see if the
        found psym domain matches the given domain.
        (lookup_block_symbol): Likewise.

The parsers all lookup names from the input using searches in VAR_DOMAIN. However, because of the above patch, any symbol in STRUCT_DOMAIN will also match. As a result, lookup_block_symbol returns the type named "e" instead of the variable named "e".

From the comments in this patch (there is no public discussion that I could find), I believe I understand why it was introduced, but I don't think this solution was appropriate. I would have thought it would be easier (and more correct) to either add an explicit typedef during symbol reading or search VAR_DOMAIN and STRUCT_DOMAIN where appropriate. The only real solution is the later, since the former will cause name collisions in VAR_DOMAIN.

As it is, I do not think this non-determinant behavior is correct. This will randomly "hide" variables and types of the same name (depending on where they are in the symbol table).

There are 183 places where VAR_DOMAIN is mentioned in the source. All need to be checked. This may take a fair amount of time. If you despearately need a workaround for c/c++/java (all DWARF), I can send you the work-in-progress patch for c/c++/java (no test suite regressions). Send me a private email.
Comment 3 Marek Polacek 2013-11-27 23:48:00 IST
Thanks for looking into it.  I don't need a workaround "desperately", though it'd be nice to get it fixed sometime -- I hit that from time to time when debugging GCC, which is C++ for some time now.
Comment 4 Keith Seitz 2014-03-21 18:13:59 IST
Patch submitted:
https://sourceware.org/ml/gdb-patches/2014-03/msg00540.html
Comment 5 cvs-commit@gcc.gnu.org 2014-04-14 22:56:50 IST
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gdb and binutils".

The branch, master has been updated
       via  b50c861487bb7d71185777193a9246bac81e4f26 (commit)
      from  3d567982aca11c85a7fa31f13046de3271d3afc8 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=b50c861487bb7d71185777193a9246bac81e4f26

commit b50c861487bb7d71185777193a9246bac81e4f26
Author: Keith Seitz <keiths@redhat.com>
Date:   Mon Apr 14 15:47:15 2014 -0700

    Remove symbol_matches_domain. This fixes
    PR c++/16253.
    
    symbol_matches_domain was permitting searches for a VAR_DOMAIN
    symbol to also match STRUCT_DOMAIN symbols for languages like C++
    where STRUCT_DOMAIN symbols also define a typedef of the same name,
    e.g., "struct foo {}" introduces a typedef of the name "foo".
    
    Problems occur if there exists both a VAR_DOMAIN and STRUCT_DOMAIN
    symbol of the same name. Then it is essentially a race between which
    symbol is found first. The other symbol is obscurred.
    [This is a relatively common idiom: enum e { ... } e;]
    
    This patchset moves this "language defines a typedef" logic to
    lookup_symbol[_in_language], looking first for a symbol in the given
    domain and falling back to searching STRUCT_DOMAIN when/if appropriate.
    
    2014-04-14  Keith Seitz  <keiths@redhat.com>
    
    	PR c++/16253
    	* ada-lang.c (ada_symbol_matches_domain): Moved here and renamed
    	from symbol_matches_domain in symtab.c. All local callers
    	of symbol_matches_domain updated.
    	(standard_lookup): If DOMAIN is VAR_DOMAIN and no symbol is found,
    	search STRUCT_DOMAIN.
    	(ada_find_any_type_symbol): Do not search STRUCT_DOMAIN
    	independently.  standard_lookup will do that automatically.
    	* cp-namespace.c (cp_lookup_symbol_nonlocal): Explain when/why
    	VAR_DOMAIN searches may return a STRUCT_DOMAIN match.
    	(cp_lookup_symbol_in_namespace): Likewise.
    	If no VAR_DOMAIN symbol is found, search STRUCT_DOMAIN.
    	(cp_lookup_symbol_exports): Explain when/why VAR_DOMAIN searches
    	may return a STRUCT_DOMAIN match.
    	(lookup_symbol_file): Search for the class name in STRUCT_DOMAIN.
    	* cp-support.c: Include language.h.
    	(inspect_type): Explicitly search STRUCT_DOMAIN before searching
    	VAR_DOMAIN.
    	* psymtab.c (match_partial_symbol): Compare the requested
    	domain with the symbol's domain directly.
    	(lookup_partial_symbol): Likewise.
    	* symtab.c (lookup_symbol_in_language): Explain when/why
    	VAR_DOMAIN searches may return a STRUCT_DOMAIN match.
    	If no VAR_DOMAIN symbol is found, search STRUCT_DOMAIN for
    	appropriate languages.
    	(symbol_matches_domain): Renamed `ada_symbol_matches_domain'
    	and moved to ada-lang.c
    	(lookup_block_symbol): Explain that this function only returns
    	symbol matching the requested DOMAIN.
    	Compare the requested domain with the symbol's domain directly.
    	(iterate_over_symbols): Compare the requested domain with the
    	symbol's domain directly.
    	* symtab.h (symbol_matches_domain): Remove.
    
    2014-04-14  Keith Seitz  <keiths@redhat.com>
    
    	PR c++/16253
    	* gdb.cp/var-tag.cc: New file.
    	* gdb.cp/var-tag.exp: New file.
    	* gdb.dwarf2/dw2-ada-ffffffff.exp: Set the language to C++.
    	* gdb.dwarf2/dw2-anon-mptr.exp: Likewise.
    	* gdb.dwarf2/dw2-double-set-die-type.exp: Likewise.
    	* gdb.dwarf2/dw2-inheritance.exp: Likewise.

-----------------------------------------------------------------------

Summary of changes:
 gdb/ChangeLog                                      |   36 +++++++
 gdb/ada-lang.c                                     |   65 ++++++++++---
 gdb/cp-namespace.c                                 |   39 +++++++--
 gdb/cp-support.c                                   |   25 +++++-
 gdb/psymtab.c                                      |   15 +--
 gdb/symtab.c                                       |   60 ++++++------
 gdb/symtab.h                                       |    4 -
 gdb/testsuite/ChangeLog                            |   10 ++
 gdb/testsuite/gdb.cp/var-tag.cc                    |   44 +++++++++
 gdb/testsuite/gdb.cp/var-tag.exp                   |   99 ++++++++++++++++++++
 gdb/testsuite/gdb.dwarf2/dw2-ada-ffffffff.exp      |    4 +
 gdb/testsuite/gdb.dwarf2/dw2-anon-mptr.exp         |    2 +
 .../gdb.dwarf2/dw2-double-set-die-type.exp         |    1 +
 gdb/testsuite/gdb.dwarf2/dw2-inheritance.exp       |    1 +
 14 files changed, 339 insertions(+), 66 deletions(-)
 create mode 100644 gdb/testsuite/gdb.cp/var-tag.cc
 create mode 100644 gdb/testsuite/gdb.cp/var-tag.exp
Comment 6 Keith Seitz 2014-04-14 22:57:38 IST
Patch committed.
Comment 7 dje 2014-05-28 01:50:55 IST
Alas the patch introduces a massive perf regression in one of my perf tests.

Using my standard monster testcase, doing "info fun ^foo::(anonymous namespace)" goes from about a minute to an untold number of minutes (expected to be in the thousands :-)).

.gdb_index has this for int64:

[1476052] int64:
        2 [static, type]
        4 [static, type]
        5 [static, type]
        ...

There are 10K such entries.

The output of "info fun ..." includes "int64" so cp_canonicalize_string_full ("int64") is called.

cp-support.c:inspect_type first tries to look up the symbol in STRUCT_DOMAIN, and then in VAR_DOMAIN.  The problem is that there is insufficient info in the index to know a lookup in STRUCT_DOMAIN will fail, so dw2_symtab_iter_next will blindly return every entry in the index, which will cause every CU to be expanded.  Later, when we do the lookup in the symtab, the domains won't match (int64 is in VAR_DOMAIN) so the lookup will fail and we'll try the next entry in the index ...

Later, inspect_type will try VAR_DOMAIN:

    if (current_language->la_language == language_cplus)
      sym = lookup_symbol (name, 0, STRUCT_DOMAIN, 0);
    if (sym == NULL)
      sym = lookup_symbol (name, 0, VAR_DOMAIN, NULL);

but that's only after gdb has expanded the debug info for 10K CUs.
Ugh.

One solution is to encode the STRUCT_DOMAIN/VAR_DOMAIN attribute in the index.
We should first see what's involved in generating this, and whether gold can.

Another solution would be for symbol lookup to stop iterating over every CU looking for static symbols (or at least not do that until the very VERY end).
If I'm in CU foo.o I should expect to not get the definition of "static" symbol baz in bar.o.  This also ties in with the currently broken lookup of "int": we should search the current CU, then the arch defaults, then maybe as a last resort every CU.  The performance cost of these "last resort" lookups on big programs has soured me on doing that by default though.

In the case of "info fun ..." where cp_canonicalize_string_full isn't passed a CU (so how can it know which CU to look in for static symbols), we need to start passing it one (and similarly throughout gdb).

I'm going to check if there's an existing PR for this perf problem, and open a new one if not.

I think this is a blocker for 7.8.
Comment 8 dje 2014-05-28 01:53:59 IST
Oh, for clarity's sake, I'm not saying the entire perf issue has to be fixed for 7.8.  I'm not suggesting doing sweeping changes to gdb's symbol lookup before we branch.  Rather, I think this particular perf issue needs to be fixed.
Comment 9 dje 2014-05-28 17:25:27 IST
I filed pr 16994 to handle the general issue of searching in STATIC_BLOCK problems.
Comment 10 Doug Evans 2014-06-04 03:32:09 IST
I've spent some time looking at this, I couldn't think of anything I'm happy with.  Our STRUCT_DOMAIN vs VAR_DOMAIN approach has issues, and I'm not sure how best to resolve them.

I propose reverting this patch for 7.8, and then revisit after 7.8 has branched.
Comment 11 Luis Machado 2014-08-29 13:37:43 IST
FYI, i've hit this when debugging cc1plus. "lang_hooks" has exactly the same problem and GDB happily picks up a typedef symbol when it should have picked a variable. Very annoying.
Comment 12 Frank Ch. Eigler 2015-06-03 15:36:39 IST
(In reply to dje from comment #7)
> Alas the patch introduces a massive perf regression in one of my perf tests.
> 
> Using my standard monster testcase, doing "info fun ^foo::(anonymous
> namespace)" goes from about a minute to an untold number of minutes
> (expected to be in the thousands :-)).

While performance on a synthetic monster test case is of some importance, user experience with everyday normal programs is of at least that much.
Comment 13 Jason Merrill 2015-06-03 16:09:07 IST
*** Bug 7737 has been marked as a duplicate of this bug. ***
Comment 14 dje 2015-06-03 16:15:09 IST
(In reply to Frank Ch. Eigler from comment #12)
> (In reply to dje from comment #7)
> > Alas the patch introduces a massive perf regression in one of my perf tests.
> > 
> > Using my standard monster testcase, doing "info fun ^foo::(anonymous
> > namespace)" goes from about a minute to an untold number of minutes
> > (expected to be in the thousands :-)).
> 
> While performance on a synthetic monster test case is of some importance,
> user experience with everyday normal programs is of at least that much.

You should first check whether my "standard monster testcase" is all that artificial/synthetic. :-)
It's as real a program as it gets here (and way cool).
Comment 15 dje 2015-06-03 17:32:44 IST
IWBN to look into whether a STRUCT/VAR_DOMAIN indicator can be added to the index. The problem, I fear, will be getting gcc+gold to generate it.
[Which is an important problem to solve as the official DWARF index (in DWARFv6?) may have the same problem ... and it's getting late in the day ... do we need to make changes to the official index?]
Comment 16 dje 2015-06-04 17:29:26 IST
(In reply to dje from comment #15)
> IWBN to look into whether a STRUCT/VAR_DOMAIN indicator can be added to the
> index. The problem, I fear, will be getting gcc+gold to generate it.
> [Which is an important problem to solve as the official DWARF index (in
> DWARFv6?) may have the same problem ... and it's getting late in the day ...
> do we need to make changes to the official index?]

For reference sake,
The index going into DWARF includes the symbol's tag (DW_TAG_typedef, etc.).
Presumably this is enough to avoid the perf issue.
It would still be nice to see a gdb implementation of this proposal sooner than later.
Comment 17 Keith Seitz 2015-06-12 15:53:02 IST
FWIW, I've proposed (for discussion) an alternate approach to the problem. This is a *much* less invasive/safer patch than the original, but it should get most people going without impacting performance.

See: https://sourceware.org/ml/gdb-patches/2015-06/msg00230.html .
Comment 18 cvs-commit@gcc.gnu.org 2015-06-26 18:43:12 IST
The master branch has been updated by Keith Seitz <kseitz@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=ee93cd5e1e61e5739a1a44e0d1d166ae09d04dc2

commit ee93cd5e1e61e5739a1a44e0d1d166ae09d04dc2
Author: Keith Seitz <keiths@redhat.com>
Date:   Fri Jun 26 10:27:45 2015 -0700

    PR 16253 revisited
    
    Last year a patch was submitted/approved/commited to eliminate
    symbol_matches_domain which was causing this problem.  It was later reverted
    because it introduced a (severe) performance regression.
    
    Recap:
    
    (gdb) list
    1	enum e {A,B,C} e;
    2	int main (void) { return 0; }
    3
    (gdb) p e
    Attempt to use a type name as an expression
    
    The parser attempts to find a symbol named "e" of VAR_DOMAIN.
    This gets passed down through lookup_symbol and (eventually) into
    block_lookup_symbol_primary, which iterates over the block's dictionary
    of symbols:
    
      for (sym = dict_iter_name_first (block->dict, name, &dict_iter);
           sym != NULL;
           sym = dict_iter_name_next (name, &dict_iter))
        {
          if (symbol_matches_domain (SYMBOL_LANGUAGE (sym),
                                     SYMBOL_DOMAIN (sym), domain))
            return sym;
        }
    
    The problem here is that we have a symbol named "e" in both STRUCT_DOMAIN
    and VAR_DOMAIN, and for languages like C++, Java, and Ada, where a tag name
    may be used as an implicit typedef of the type, symbol_matches_domain ignores
    the difference between VAR_DOMAIN and STRUCT_DOMAIN.  As it happens, the
    STRUCT_DOMAIN symbol is found first, considered a match, and that symbol is
    returned to the parser, eliciting the (now dreaded) error message.
    
    Since this bug exists specifically because we have both STRUCT and VAR_DOMAIN
    symbols in a given block/CU, this patch rather simply/naively changes
    block_lookup_symbol_primary so that it continues to search for an exact
    domain match on the symbol if symbol_matches_domain returns a symbol
    which does not exactly match the requested domain.
    
    This "fixes" the immediate problem, but admittedly might uncover other,
    related bugs.  [Paranoia?] However, it causes no regressions (functional
    or performance) in the test suite.  A similar change has been made
    to block_lookup_symbol for other cases in which this bug might appear.
    
    The tests from the previous submission have been resurrected and updated.
    However since we can still be given a matching symbol with a different domain
    than requested, we cannot say that a symbol "was not found."  The error
    messages today will still be the (dreaded) "Attempt to use a type name..."
    
    ChangeLog
    
    	PR 16253
    	* block.c (block_lookup_symbol): For non-function blocks,
    	continue to search for a symbol with an exact domain match
    	Otherwise, return any previously found "best domain" symbol.
    	(block_lookup_symbol_primary): Likewise.
    
    testsuite/ChangeLog
    
    	PR 16253
    	* gdb.cp/var-tag-2.cc: New file.
    	* gdb.cp/var-tag-3.cc: New file.
    	* gdb.cp/var-tag-4.cc: New file.
    	* gdb.cp/var-tag.cc: New file.
    	* gdb.cp/var-tag.exp: New file.
Comment 19 Keith Seitz 2016-05-16 18:57:49 IST
Correct rogue reopening of (closed) bug.