This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.



Re: C++ nested classes, namespaces, structs, and compound statements


Much to say, much to say...

On Fri, Apr 05, 2002 at 11:42:04PM -0500, Jim Blandy wrote:
> 
> At the moment, GDB doesn't handle C++ namespaces or nested classes
> very well.  I have a general idea of how we could address these
> limitations, which I'd like to put up for shredding M-DEL discussion.
> 
> Let me admit up front that I don't really know C++, so I may be saying
> stupid things.  Please set me straight if you notice something.

I know C++ fairly well, but my grasp of the technical terminology of
the language is lacking.  So don't go looking up any phrases I use in
here; I'm probably making them up :)  I'm sure Daniel Berlin can
correct any egregious errors.

> You can also declare "static" struct members --- you can access them
> with the `->' and `.' operators, just like ordinary members, but
> they're actually variables at fixed addresses in the .data segment ---
> much like a "static" variable in a C compound statement.  But this
> means that a simple offset from a base address is no longer sufficient
> to describe a struct's member's location --- you actually start
> needing something like GDB's enum address_class.  Multiple inheritance
> and virtual base classes introduce further complexity here.

I believe you're generalizing too much here.  Statics are a special
case; they're essentially global variables whose names are declared in
the class's scope.  You've also got constant data members, which are
not necessarily backed by real symbols in C++ (I believe they always
are, in C...).  Everything else is an ordinary member.
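
To make the distinction concrete, here is a minimal sketch.  The names
(Counter, live, limit, id, static_member_is_shared) are invented for
illustration and have nothing to do with GDB's sources.  It only shows
that a static member is one global object reachable through `.' and
`->', while an ordinary member lives inside each object.

```cpp
#include <cassert>

struct Counter {
  static int live;             // one object for the whole program, named in Counter's scope
  static const int limit = 8;  // constant member; needs no storage if only its value is used
  int id;                      // ordinary member: lives at an offset inside each object
};

int Counter::live = 0;         // the single out-of-class definition

// The constant's value is usable at compile time without taking its address.
static_assert(Counter::limit == 8, "constant member is a compile-time value");

// Returns true if every access path names the same global object.
bool static_member_is_shared() {
  Counter a{1}, b{2};
  Counter *p = &b;
  a.live = 5;                      // `.' syntax, but no offset from &a is involved
  return &a.live == &p->live       // same address through either object
      && &Counter::live == &a.live // and through the qualified name
      && p->live == 5
      && &a.id != &b.id;           // ordinary members are per-object
}
```

Calling static_member_is_shared() from any test driver should return
true on a conforming compiler.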

The complexity from multiple inheritance and virtual base classes is
essentially orthogonal.  It just affects the scopes you search.

> There's another difference between compound statements and structs
> goes away.  In C, you can only reference a struct's members using the
> `.' and `->' operators, whereas you refer to a compound statement's
> variables by simply naming them.  But in C++, a struct's member
> functions can refer to the struct's members by simply naming them.
> The struct's bindings become another rib in the search path for
> identifier bindings.
> 
> In summary, the data structure GDB needs to represent C++ structs
> (classes, unions, whatever) has a lot of similarities to the structure
> GDB needs to represent the local variables of a compound statement.
> They both need to carry bindings for several namespaces (ordinary
> identifiers and structure tags).  The names can refer to any manner of
> things: variables, functions, namespaces, base classes, and so on.
> For variables, there are a variety of locations they might occupy.

GDB already does a great deal of this by the very simple method of
using fully qualified names.  It's served us remarkably well, although
of course we're hitting its limits now.  But let's not be too quick to
discard that approach, for the present at least.

Also, while they're often both searched, don't confuse the structure
inheritance search path with the enclosing structure/namespace search
path.  For instance, foo->x() searches only the inheritance paths.
foo->A::x is even worse (and gdb handles it badly or not at all at
present, as Michael mentioned in his mail a moment ago).
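
A small sketch of the two search paths in plain C++, with made-up
names (A, B, C, and the helper functions are purely illustrative).
Unqualified names inside a member function search the class, its
bases, and then the enclosing scopes; member access through `->'
searches only the class and its bases.

```cpp
#include <cassert>

int x() { return 0; }            // lives in the enclosing (global) scope

struct A { int x() { return 1; } };

struct B : A {
  int x() { return 2; }              // hides A::x
  int unqualified() { return x(); }  // searches B, then A, then enclosing scopes
};

struct C {                           // no member named x, no bases
  int unqualified() { return x(); }  // falls through to ::x
};

int arrow_lookup()       { B b; B *foo = &b; return foo->x(); }     // finds B::x
int qualified_lookup()   { B b; B *foo = &b; return foo->A::x(); }  // finds A::x
int member_fallthrough() { C c; return c.unqualified(); }           // finds ::x

// C c; c.x();  // would not compile: member access never looks at enclosing scopes
```

Here arrow_lookup() returns 2, qualified_lookup() returns 1, and
member_fallthrough() returns 0.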

> So I would like to introduce to GDB a new type, `struct environment'
> (or is `struct env' better?) which does about the same thing that the
> `nsyms' and `sym' members of `struct block', and the `nfields' and
> `fields' members of `struct type', do now: it's just a bunch of
> bindings for names.  We would use `struct environment':
> 
> - in `struct block', to represent the block's local variables, replacing
>   `nsyms' and `sym';
> - in `struct type', to represent a struct's members, instead of
>   `struct fields'; and
> - in our representation for C++ namespaces, which seem pretty much
>   like structs that can only contain static members and member
>   functions (i.e., you can't ever create an instance of one).
> 
> There'd be a single set of functions for building `struct environment'
> objects, and looking up bindings in them; you'd use it for variable
> lookup, and in the `.' and `->' operators.  It could handle hashing,
> when appropriate.
> 
> Basically, we would take two distinct areas of GDB (and a third,
> namespaces, which we haven't implemented yet but will need to), and
> support them all with a single structure and a single bunch of
> support functions.  GDB would become easier to read.

How about -containing- `struct fields', instead of replacing?  i.e. let
the name search happen in the `struct environment', as before, but the
data items would be fields (could be indicated in a flag in the
environment, with a pointer to the type or symbol for the enclosing
structure).  I don't think turning members into symbols is a good idea.
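
One way the "containing" arrangement could look, as a rough sketch
only.  Every type and function here (Field, Symbol, Binding,
Environment, env_for_fields) is a hypothetical stand-in written in C++
for brevity; none of it is GDB's actual `struct field' or
`struct symbol'.  The point is that the environment does the name
lookup while the data items stay fields.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Field  { std::string name; int bit_offset; };  // stand-in for a struct member
struct Symbol { std::string name; long address; };    // stand-in for a symbol

// A binding points at a field or a symbol; it never turns one into the other.
struct Binding {
  enum Kind { FIELD, SYMBOL } kind;
  const Field *field;    // set when kind == FIELD
  const Symbol *symbol;  // set when kind == SYMBOL
};

struct Environment {
  bool holds_fields;     // the flag saying the data items are fields
  const void *owner;     // enclosing type or symbol (opaque in this sketch)
  std::map<std::string, Binding> names;

  // Name search happens here; returns nullptr when the name is unbound.
  const Binding *lookup(const std::string &n) const {
    auto it = names.find(n);
    return it == names.end() ? nullptr : &it->second;
  }
};

// Build an environment over an existing field array without copying the fields.
Environment env_for_fields(const Field *fields, int nfields, const void *owner) {
  Environment env{true, owner, {}};
  for (int i = 0; i < nfields; ++i)
    env.names.emplace(fields[i].name, Binding{Binding::FIELD, &fields[i], nullptr});
  return env;
}
```

The field array must outlive the environment, since the bindings only
point into it.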

As a side note, at the same time we should generalize our overloading
support to functions in addition to methods.  This would give the
framework to make that painless.  The environment could describe an
overloaded name...
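
As a sketch of an environment describing an overloaded name: bind each
name to a list of candidates rather than to a single entry, and treat
free functions and methods the same way.  Candidate and OverloadEnv
are hypothetical names for this example, not existing GDB structures,
and actual overload resolution is left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Candidate {
  std::string signature;  // e.g. "f(int)"
  bool is_method;         // methods and free functions share one representation
};

struct OverloadEnv {
  std::map<std::string, std::vector<Candidate>> names;

  // Binding the same name again adds a candidate instead of replacing one.
  void bind(const std::string &n, Candidate c) {
    names[n].push_back(std::move(c));
  }

  // Returns every candidate for the name, or nullptr if it is unbound.
  const std::vector<Candidate> *lookup(const std::string &n) const {
    auto it = names.find(n);
    return it == names.end() ? nullptr : &it->second;
  }
};
```

A caller would pick among the returned candidates using the argument
types, exactly as it would for methods today.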

> As a half-baked idea, perhaps a `struct environment' object would have
> a list of other `struct environment' objects you should search, too,
> if you didn't find what you were looking for.  We could use this to
> search a compound statement's enclosing scopes to find bindings
> further out, find members inherited from base classes, and resolve C++
> `using' declarations.

As I said above, I think that going this route is a bad idea.  It
should have a pointer to the enclosing object and to that object's
environment, probably, but that's the extent of it.

> How does this strike people?
> 
> Open issues:
> 
> - This "list of other places to search" thing may be ill-formed.  I
>   mean, sure, there are a set of similar behaviors going on there, but
>   are they similar enough?  For example:

We're thinking along the same paths here... I suspect that it is in
fact ill-formed.

> - What really happens when you start using `struct symbol' objects for
>   structure members?  Do we need new address classes now for `offset
>   from object base address'?  Does the LOC_COMPUTED idea I've been
>   pushing still work?

Why do we want members to be symbols?  A `struct field' expresses all
the properties of members; symbols have other properties.  I think we
use symbols in too many places already.

> - How do member functions work in this arrangement?  Virtual member
>   functions?  Virtual base classes?

If we leave searching out of it, we're fine on this front.

> - How would we introduce this incrementally?

Do we want to?

No, I'm serious.  Incremental solutions are more practical to
implement, but they will come with more baggage.  Baggage will haunt us
for a very long time.  If we can completely handle non-minimal-symbol
lookup in this way, that's a big win.




Some other thoughts:

This is all a question of scope.  As I said, right now we handle this
mostly by searching a small set of 'namespaces' (type, struct,
variable...) for a given term.  We try to look up fully qualified
names, and qualify them as necessary.  That scales fairly badly.  We
want to search for names in the appropriate scope, breaking up
qualification as necessary, as opposed to searching for fully
qualified names.  What we want is essentially:
  - The concept of an enclosing scope
  - Language-dependent hooks to specify which scopes to search for a
    name.

We should be able to get a very nice behavior here, which we sort of
have now but not cleanly, and which illustrates well why the search
order can be language dependent.  Consider:

struct X {
  int foo();
  int bar();
};
int baz();

int X::foo()
{
  ...
  int z = bar();
}

We're debugging in X::foo() and the user says 'list'.  They see
'bar()'.  They decide to ask 'print bar()'.  In this case we want to
find X::bar().  We currently do even (much) worse if X::foo is a
static member function.  On the other hand, this->bar() should work
and this->baz() should not.
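
The desired behavior can be sketched with an enclosing-scope pointer
plus a per-language hook that decides the search order.  This is only
an illustration under assumed names (Scope, SearchOrder, cplus_order,
c_order, lookup, lookup_member are all hypothetical); it models names
as plain strings and ignores base classes.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct Scope {
  std::string name;          // "" for the global scope
  const Scope *enclosing;    // lexically enclosing scope, or nullptr
  const Scope *class_scope;  // for a member function body: its class, else nullptr
  std::set<std::string> names;
};

// Language hook: which scopes to try, in order, for an unqualified name.
using SearchOrder = std::vector<const Scope *> (*)(const Scope *start);

// C++-style order: each block, then the class it is a member of, then outward.
std::vector<const Scope *> cplus_order(const Scope *s) {
  std::vector<const Scope *> order;
  for (; s; s = s->enclosing) {
    order.push_back(s);
    if (s->class_scope) order.push_back(s->class_scope);
  }
  return order;
}

// C-style order: lexically enclosing scopes only.
std::vector<const Scope *> c_order(const Scope *s) {
  std::vector<const Scope *> order;
  for (; s; s = s->enclosing) order.push_back(s);
  return order;
}

// Unqualified lookup: returns the qualified name found, or "" if none.
std::string lookup(const std::string &id, const Scope *start, SearchOrder hook) {
  for (const Scope *s : hook(start))
    if (s->names.count(id))
      return s->name.empty() ? id : s->name + "::" + id;
  return "";
}

// Member lookup (this->id): only the class scope, never the enclosing ones.
std::string lookup_member(const std::string &id, const Scope *cls) {
  return cls->names.count(id) ? cls->name + "::" + id : "";
}
```

With a global scope holding baz, a class scope X holding foo and bar,
and a body scope for X::foo, lookup("bar", ...) under cplus_order
yields "X::bar", while lookup_member("baz", ...) on X yields nothing.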

My benchmark for whether a solution is even adequate enough to consider
is whether it obsoletes DW_AT_MIPS_linkage_name.  This should.  I'd
really like to make us independent of that, since it will make
constructor handling much less of a gross special case.  Daniel pointed
out at one point how much space this could save in binaries.

There's probably more I meant to say about this, but it's 1:30AM
here...

-- 
Daniel Jacobowitz                           Carnegie Mellon University
MontaVista Software                         Debian GNU/Linux Developer

