This is the mail archive of the
mailing list for the GDB project.
partial-symtab symbol sorting (was: "Re: GDB 7.4 branching status? (2011-11-23)")
This is an issue that is only tangential to this patch, and should
not be considered part of the series.
I just realized that partial symbols are sorted using strcmp_iw_ordered.
This works great for C++, for instance, but only works OK for Ada.
I think that this is related to the fact that we might be using
the linkage name, rather than the natural name (we compute the natural
name only on-demand, due to memory pressure in large apps). As a result,
the strcmp_iw_ordered routine can return non-zero for two names that
ada-lang.c:compare_names would consider equal.
For instance: `pck__hello' and `pck__hello__2'.
So when doing a symbol lookup for pck__hello, for instance, we pass
our own comparison routine, which is "compatible" with
strcmp_iw_ordered, to the psymtab map_matching_symbols routine.
This allows us to perform a binary search rather than a linear one.
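
To make the "compatible" idea concrete, here is a rough Python sketch (not GDB code). It assumes, purely for illustration, that the Ada comparison ignores a trailing `__<digits>` overload suffix; the names strip_overload_suffix and lookup_exact are made up, and the real ada-lang.c:compare_names may well differ. The point is only that all names sharing a prefix are contiguous in plain byte-wise order, so a binary search can find the start of the candidate run and the language-specific test then filters it.

```python
import bisect
import re

# Hypothetical rule, for illustration only: a trailing "__<digits>" is an
# overload suffix and is ignored when comparing Ada linkage names.
_OVERLOAD_SUFFIX = re.compile(r"__\d+$")


def strip_overload_suffix(name):
    """Drop a trailing "__<digits>" suffix, if there is one."""
    return _OVERLOAD_SUFFIX.sub("", name)


def lookup_exact(sorted_names, wanted):
    """Return the names equal to `wanted` once the suffix is ignored.

    `sorted_names` must be sorted in plain byte-wise order.
    """
    # Every name starting with `wanted` sits in one contiguous run, so a
    # binary search locates the first candidate.
    i = bisect.bisect_left(sorted_names, wanted)
    matches = []
    while i < len(sorted_names) and sorted_names[i].startswith(wanted):
        # The run can contain unrelated names (e.g. "pck__hello2"), so the
        # language-specific test still has to filter each candidate.
        if strip_overload_suffix(sorted_names[i]) == wanted:
            matches.append(sorted_names[i])
        i += 1
    return matches


names = sorted([
    "pck__hello__2",
    "pck__goodbye",
    "pck__hello",
    "pck__hello2",
    "other__hello",
])
print(lookup_exact(names, "pck__hello"))
```

Note that "pck__hello2" sorts between "pck__hello" and "pck__hello__2" here, so the two names the Ada rule considers equal are not even adjacent in the sorted table; the filter inside the run is what keeps the lookup correct.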
I am wondering if we shouldn't be sorting the partial symbols
using a language-specific sorting routine instead. As it turns
out, there was a bug in ada-lang.c:compare_names, and that could
have caused the two search orders to diverge. The thing is, when
I looked at it, it is not easy to tell what language a partial symtab
is for just by looking at it. The language seems to be embedded in the
symbols themselves. And then, we'd still have to specify whether
we'd want to perform a binary search or not anyway. That's because
we permit "wild" matches:
(gdb) break hello
In the case above, we must break on "pck__hello" and "pck__hello__2".
In that case, binary searches based on string comparison cannot work
because we're missing the start of the symbol linkage name.
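
A small Python sketch (again, not GDB code) of why the wild case defeats binary search. It assumes the same made-up suffix rule as an illustration, and lookup_wild is a hypothetical name. The matches for "hello" are scattered through the sorted table, so every entry has to be visited.

```python
import re

# Hypothetical rule, for illustration only: ignore a trailing "__<digits>".
_OVERLOAD_SUFFIX = re.compile(r"__\d+$")


def lookup_wild(sorted_names, simple_name):
    """Return (index, name) pairs whose last component is `simple_name`.

    This is a linear scan: the package prefix is unknown, so there is no
    key to binary-search on.
    """
    tail = "__" + simple_name
    hits = []
    for index, name in enumerate(sorted_names):
        base = _OVERLOAD_SUFFIX.sub("", name)
        if base == simple_name or base.endswith(tail):
            hits.append((index, name))
    return hits


names = sorted([
    "pck__hello__2",
    "zzz__other",
    "aaa__hello",
    "pck__hello",
    "mmm__world",
])
for index, name in lookup_wild(names, "hello"):
    print(index, name)
```

In this table the hits land at positions 0, 2 and 3, with an unrelated symbol in between, which is the reason a search keyed on the start of the linkage name cannot narrow things down.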
As an aside: One of the ideas I had in the past was to store the natural
name inverted - for instance "hello.pck" instead of "pck.hello". That
way, searches for hello could be done using binary searches as well.
It might actually allow us to reconcile "wild" vs "non-wild" searches,
and even allow "semi-wild" lookups as in:
(gdb) break subpackage.hello
... would now manage to find matches such as package.subpackage.hello.
This is not the case today. You either fully qualify your symbol name,
or you don't qualify it at all.
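
The inverted-name idea can be sketched in a few lines of Python. This is only an illustration of the lookup scheme, not a proposal for the actual data structure; invert, build_index and lookup are made-up names, and the sketch ignores the memory cost of keeping the inverted strings around.

```python
import bisect


def invert(natural_name):
    """Reverse the dotted components: "pck.hello" -> "hello.pck"."""
    return ".".join(reversed(natural_name.split(".")))


def build_index(natural_names):
    """Return (inverted_keys, originals), both sorted by inverted key."""
    pairs = sorted((invert(name), name) for name in natural_names)
    return [key for key, _ in pairs], [name for _, name in pairs]


def lookup(index, query):
    """Find every name whose trailing components equal `query`."""
    keys, originals = index
    key = invert(query)
    # All keys starting with `key` form one contiguous run.
    i = bisect.bisect_left(keys, key)
    matches = []
    while i < len(keys) and keys[i].startswith(key):
        # Only accept whole components: "hello" must not match
        # "hello_world.pck".
        if keys[i] == key or keys[i].startswith(key + "."):
            matches.append(originals[i])
        i += 1
    return matches


index = build_index([
    "pck.hello",
    "package.subpackage.hello",
    "other.subpackage.goodbye",
    "pck.hello_world",
])
print(lookup(index, "hello"))
print(lookup(index, "subpackage.hello"))
print(lookup(index, "pck.hello"))
```

With this layout the unqualified, partially qualified and fully qualified queries all go through the same binary search, which is what would let the "wild" and "non-wild" paths be reconciled.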
But, without even thinking about performance issues at startup, this
approach suffers from the same problem as storing the natural name does:
For certain large applications, we would exceed the maximum amount of
memory a process can hold. This is not necessarily the case on GNU/Linux,
but the problem is there. I think we'll have better luck if we are capable
of merging some of the massive duplication in the debug info.