[RFC] partial symbol name matching vs regexp

Tom Tromey tromey@adacore.com
Thu Aug 8 19:33:00 GMT 2019

>>>>> "Joel" == Joel Brobecker <brobecker@adacore.com> writes:

Replying to email from October of 2018:


Joel> One of our users reported that info types was sometimes not working
Joel> for Ada types. Consider for instance the following code:
Joel>     package Alpha is
Joel>        type Beta is record
Joel>           B : Integer;
Joel>        end record;
Joel>     end Alpha;

Joel> With that assumption in mind, I think the way to make it work
Joel> is to change the symbol_matcher's signature to receive more
Joel> information - so that each caller of expand_symtabs_matching can
Joel> then decide which symbol name it needs to look at; in most cases,
Joel> it will be the symbol_search_name, but in the particular case of
Joel> search_symbols where we're comparing against a regexp that we assume
Joel> comes from the user, we can decide to use the symbol_natural_name
Joel> instead.

This makes sense to me, especially given what you found:

Joel> This would actually be consistent with the rest of search_symbols's
Joel> implementation; as you can see, once the partial symbol expansion
Joel> is performed, it iterates over minimal symbols and full symbols,
Joel> and selects them based on the symbol's natural name as well:
Joel>     | ALL_COMPUNITS (objfile, cust)
Joel>     | [...]
Joel>     |      && ((!preg
Joel>     |           || preg->exec (SYMBOL_NATURAL_NAME (sym), 0,
Joel>     |                          NULL, 0) == 0)

Having the expansion step use one symbol name and the actual search
use another seems like a bug.
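
To make the mismatch concrete, here is a minimal standalone sketch.  It
is not GDB code: toy_symbol, toy_search and the "alpha__beta" encoding
are assumptions made up for illustration, and std::regex stands in for
the compiled regexp.  Phase 1 plays the role of the partial symbol
expansion, phase 2 the later filter on the natural name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for a symbol; not GDB's general_symbol_info.
struct toy_symbol
{
  std::string search_name;   // encoded form, assumed "alpha__beta"
  std::string natural_name;  // user-visible form, "alpha.beta"
};

// Two-phase lookup.  EXPANSION_NAME selects which name phase 1 matches
// against; phase 2 always matches against the natural name.
std::vector<std::string>
toy_search (const std::vector<toy_symbol> &syms, const std::regex &re,
            std::string toy_symbol::*expansion_name)
{
  std::vector<std::string> result;
  for (const toy_symbol &sym : syms)
    {
      // Phase 1: decide whether this symbol's CU would be expanded.
      if (!std::regex_search (sym.*expansion_name, re))
        continue;
      // Phase 2: filter on the natural name.
      if (std::regex_search (sym.natural_name, re))
        result.push_back (sym.natural_name);
    }
  return result;
}
```

With a user regexp such as "alpha\.beta", passing
&toy_symbol::search_name as the expansion name rejects the symbol in
phase 1, so phase 2 never sees it; passing &toy_symbol::natural_name
lets both phases agree.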

Joel> It allows me to get identical results for any language but Ada,
Joel> knowing that it doesn't interfere with my Ada testing, since we
Joel> do not support gdb_index with Ada yet.

We'll have to revisit this now that we have patches to support
.debug_names with Ada.

In this situation, we'll have "quasi encoded" names in the index -- that
is, Ada names that were decoded and then re-encoded, to drop the various

I am not totally sure what to do here.  Reading the CU DIEs to find the
language might be acceptable, but it might be ok to just try decoding
the name as well.

I think some kind of hack is appropriate given that .debug_names doesn't
even hold the correct names currently -- because when it does, we are
going to need some entirely different approach here, like reconstructing
full names as we search.

Joel> It's not perfect in the case of gdb_index handling,
Joel> but I think that the consequences of that peculiarity would be
Joel> contained within the gdb_index handling in dwarf2read.c. So,
Joel> at least, it wouldn't be a caller in dwarf2read.c passing an
Joel> unsuspecting symbol_matcher function defined elsewhere an incomplete
Joel> general_symbol_info.

I didn't understand this, because the symbol_matcher callback is passed
in to dw2_expand_symtabs_matching_symbol.

So, I think every symbol matcher needs to be prepared to handle this

I think it's fine to just document that the symbol might be "fake".
However, if we really want to be robust here, I guess we could supply some
wrapper object that only reveals the bits we want revealed.
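
A rough sketch of what such a wrapper might look like, assuming the
matcher only ever needs the name.  All of the names here
(toy_general_symbol_info, toy_name_view, toy_matcher) are hypothetical
and are not taken from the patch or from GDB:

```cpp
#include <string>

// Hypothetical stand-in; the real general_symbol_info has other fields.
struct toy_general_symbol_info
{
  std::string name;
  const void *section = nullptr;  // left unset in a "fake" symbol
};

// Read-only view that reveals only the name, so a matcher cannot
// accidentally rely on fields an index reader never filled in.
class toy_name_view
{
public:
  explicit toy_name_view (const toy_general_symbol_info &gsi)
    : m_gsi (gsi)
  {}

  const std::string &name () const { return m_gsi.name; }

private:
  const toy_general_symbol_info &m_gsi;
};

// A matcher written against the view rather than the full structure.
bool
toy_matcher (const toy_name_view &view, const std::string &wanted)
{
  return view.name () == wanted;
}
```

The trade-off is an extra type at every matcher call site, versus a
comment that callers have to remember to read.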

Joel> Attached is my prototype patch, with a small test that has one
Joel> fail because the patch is applied. The test is very simple,
Joel> and I intend to make a more complex one, but it should make it
Joel> easier for you to try this example should you like to.

What would the complex test do that this one does not do?

FWIW I've updated the patch to the latest source (actually I rebased it
on top of the .debug_names patch, to see how hard that would be -- not
very as it turns out).

Anyway, I'd like to move forward with this.

