[PATCH] Add support for tracking/evaluating dwarf2 location expressions

Daniel Berlin dberlin@redhat.com
Fri Mar 30 14:42:00 GMT 2001

(If you aren't familiar, the address_class enum lists the ways you can
describe the location of a symbol. That's what I'm talking about when
I say address classes.)
David Taylor <taylor@cygnus.com> writes:

>     Date: Fri, 30 Mar 2001 14:10:46 -0500 (EST)
>     From: Daniel Berlin <dberlin@redhat.com>
>     This patch adds a place in the symbol structure to store the dwarf2
>     location (an upcoming dwarf2read huge rewrite patch uses it), an
>     evaluation machine to findvar.c, and has read_var_value use the dwarf2
>     location if it's available.
>     Note that in struct d2_location, in the symtab.h patch, framlocdesc
>     cannot be unioned with locdesc, because the symbol providing the frame
>     location description usually has its own real location.
>     Currently, because GDB makes no use of the frame unwind info, etc, I
>     evaluate the frame pointer location expression for whatever is providing
>     it for the thing we are trying to evaluate, rather than rely on what GDB
>     tells us the frame pointer is.
> I have read that "sentence" multiple times and I still  don't know what
> you are trying to say.  Sorry.

I'm saying that currently, the frame base in dwarf2 is more descriptive
than the frame pointer in the frame info we have.

>     Yes, it's currently pointless to do this,  it's just future proofing, and
>     what we are supposed to do, anyway.
>     This patch will have no effect whatsoever without the dwarf2read.c
>     rewrite, but it won't break anything either, so I separated it out and
>     am submitting it now.
>     --Dan
>     2001-03-30  Daniel Berlin  <dberlin@redhat.com>
> 	    * symtab.h (address_class): Add LOC_DWARF2_EXPRESSION.
> 	    (struct symbol): Add struct d2_location to store dwarf2
> 	    location expressions.
> 	    * findvar.c (read_leb128): New function, read leb128 data from a
> 	    buffer.
> I don't know what leb128 data is (and I doubt that I'm alone).  I
> suspect that it is something dwarf or dwarf2 specific and that this
> function really belongs in another file (dwarf2read.c maybe?).

LEB128 = Little Endian Base 128.
There is one in dwarf2read.c, but we can't use it. It expects to be
passed a bfd.
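
For anyone who hasn't met the encoding, here is a minimal sketch of
decoding an unsigned LEB128 value straight out of a byte buffer, with
no bfd involved. The name read_uleb128_buf and its signature are made
up for illustration; this is not the read_leb128 from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the function from the patch.
   Decode one unsigned LEB128 value starting at BUF.  Each byte carries
   7 payload bits, least significant group first; the high bit is set on
   every byte except the last.  Stores the number of bytes consumed in
   *BYTES_READ.  Returns 0 and sets *BYTES_READ to 0 if the value runs
   past END or is longer than this sketch accepts.  */
uint64_t
read_uleb128_buf (const unsigned char *buf, const unsigned char *end,
                  size_t *bytes_read)
{
  uint64_t result = 0;
  unsigned int shift = 0;
  const unsigned char *p = buf;

  while (p < end && shift < 64)
    {
      unsigned char byte = *p++;

      result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0)
        {
          *bytes_read = (size_t) (p - buf);
          return result;
        }
    }
  *bytes_read = 0;              /* truncated or over-long encoding */
  return 0;
}
```

The signed variant differs only in sign-extending from bit 6 of the
last byte.
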
> At a
> minimum there should probably be some sort of explanatory comment.
> 	    (evaluate_dwarf2_locdesc): New function, evaluate a given
> 	    dwarf2 location description.
> This is a very dwarf2 specific change to findvar.c.  What happens if
> something other than dwarf2 is in use?

Um, if it doesn't have a dwarf2 location for the symbol, it just does
what it used to.
> Also, should these functions be in here?  In dwarf2read.c?  Or in a
> new file (dwarf2eval.c?)?
Maybe dwarf2eval, but all the code it's related to is in findvar.c. 

I think you might not quite be understanding what dwarf2 location
expressions/lists are.

They are a way of describing where something is over its lifetime.
(Strictly, lists do that; an expression gives the location over a
specific range. So a location list is simply a list of location
expressions, each with the associated address range where that
expression is valid.) However, they are much more descriptive than
GDB's current support for address classes (i.e. LOC_BASEPARM, etc.),
and thus occasionally we hit ones we can't transform into GDB's other
simple address classes.
It would also be flat out impossible to support location lists with
address classes, since they expect a symbol to be at one location over
its lifetime.  So in order to be able to support optimized code
debugging, etc, you need to be able to evaluate the dwarf2 location
list at runtime, to get the address of a particular symbol, at a
particular time.

In fact, dwarf2 location expressions are actually a Turing-complete
language.  They are evaluated by a stack machine.
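
To make "evaluated by a stack machine" concrete, here is a toy
evaluator for a handful of operations. The opcode values are the ones
the DWARF 2 specification assigns; the function name eval_toy and the
fixed-size stack are assumptions for illustration, and this is not the
patch's evaluate_dwarf2_locdesc.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as assigned by the DWARF 2 specification.  */
enum
{
  DW_OP_const1u = 0x08,  /* push the following 1-byte unsigned constant */
  DW_OP_dup = 0x12,      /* duplicate the top of stack */
  DW_OP_plus = 0x22,     /* pop two entries, push their sum */
  DW_OP_lit0 = 0x30      /* DW_OP_lit0..DW_OP_lit31 push 0..31 */
};

#define TOY_STACK_DEPTH 64

/* Hypothetical toy evaluator.  Runs the LEN bytes at EXPR and stores
   the final top of stack in *RESULT.  Returns 0 on success, -1 on an
   unknown opcode, a truncated operand, or stack underflow/overflow.  */
int
eval_toy (const unsigned char *expr, size_t len, uint64_t *result)
{
  uint64_t stack[TOY_STACK_DEPTH];
  size_t sp = 0;                /* number of live entries */
  size_t i = 0;

  while (i < len)
    {
      unsigned char op = expr[i++];

      if (op >= DW_OP_lit0 && op <= DW_OP_lit0 + 31)
        {
          if (sp == TOY_STACK_DEPTH)
            return -1;
          stack[sp++] = op - DW_OP_lit0;
        }
      else if (op == DW_OP_const1u)
        {
          if (i == len || sp == TOY_STACK_DEPTH)
            return -1;
          stack[sp++] = expr[i++];
        }
      else if (op == DW_OP_dup)
        {
          if (sp == 0 || sp == TOY_STACK_DEPTH)
            return -1;
          stack[sp] = stack[sp - 1];
          sp++;
        }
      else if (op == DW_OP_plus)
        {
          if (sp < 2)
            return -1;
          stack[sp - 2] += stack[sp - 1];
          sp--;
        }
      else
        return -1;
    }
  if (sp == 0)
    return -1;
  *result = stack[sp - 1];
  return 0;
}
```

Feeding it the bytes 0x35 0x12 0x22 (DW_OP_lit5, DW_OP_dup, DW_OP_plus)
pushes 5, duplicates it, and adds the two copies.
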

If you still aren't grasping it, let me give you an example:

Let's say we have a very simple C file:

int main(int argc, char *argv[])
{
        int a;
}

If you compile this file with dwarf2 info, the location of a will be
described (most likely) as the location expression "DW_OP_fbreg 8".

When we read the dwarf2 info, we currently try to convert "DW_OP_fbreg
8" into a gdb address class/location, which in this case is [...] (I'm
doing this from memory, it's probably completely wrong, but I'm just
trying to show something, not be absolutely correct).
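
Evaluated at runtime instead of being converted at read time,
"DW_OP_fbreg 8" just means "frame base plus 8". A minimal sketch of
that one operation follows. eval_fbreg is a hypothetical name, and the
frame base is passed in by the caller because how it gets computed is
outside this example.

```c
#include <stddef.h>
#include <stdint.h>

#define DW_OP_fbreg 0x91   /* value from the DWARF 2 specification */

/* Hypothetical helper.  EXPR must hold exactly one DW_OP_fbreg
   operation: the opcode byte followed by a signed LEB128 offset.
   On success stores FRAME_BASE + offset in *ADDR and returns 0;
   returns -1 if the bytes are not a well-formed DW_OP_fbreg.  */
int
eval_fbreg (const unsigned char *expr, size_t len, uint64_t frame_base,
            uint64_t *addr)
{
  uint64_t value = 0;
  unsigned int shift = 0;
  unsigned char byte;
  size_t i = 1;

  if (len < 2 || expr[0] != DW_OP_fbreg)
    return -1;

  /* Signed LEB128: 7 bits per byte, low group first.  */
  do
    {
      if (i == len || shift >= 64)
        return -1;
      byte = expr[i++];
      value |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);

  /* Bit 6 of the last byte is the sign; extend it if it is set.  */
  if (shift < 64 && (byte & 0x40))
    value |= ~(uint64_t) 0 << shift;

  if (i != len)
    return -1;                  /* trailing bytes after the operand */

  *addr = frame_base + value;   /* wraps modulo 2^64, so negative
                                   offsets work */
  return 0;
}
```
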

If all of these location expression opcodes were simple, we wouldn't
have a problem (we also wouldn't be able to describe things anywhere
near as well as we could).  However, you also have opcodes like
DW_OP_deref, which is supposed to take the top of the current
dwarf2 location machine stack, dereference it as if it were a pointer,
and push the resulting value back onto the top of the stack.  Obviously,
since we don't have access to target memory at symbol reading time, we
can't do this (there is currently a hack that lets us do it if
DW_OP_deref is the last operation in the expression).  We are expected
to be able to evaluate these expressions at runtime, but because gdb
had no way of evaluating them until now, we had to settle for trying
to convert them into something gdb could handle, and as they get more
complex, or when you try to do something that needs the expressive
power, that fails.
We can't even come close to the expressive power with the other
address classes.  There are things like DW_OP_piece that let you say
"the first byte of this variable is in this place, the next 2 bytes
are here, the last byte is here", etc.
As I said, it's a Turing-complete language.
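
The DW_OP_deref case shows why evaluation has to wait until runtime:
the operation needs some way to read target memory, and a symbol
reader doesn't have one. A sketch of just that step, with the memory
reader supplied as a callback. The names read_mem_fn, op_deref and
fake_read_mem, and the 8-byte pointer size, are all assumptions for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: read one target pointer at ADDR into
   *VALUE.  Returns 0 on success, -1 if the memory can't be read.  */
typedef int (*read_mem_fn) (void *ctx, uint64_t addr, uint64_t *value);

/* Sketch of the DW_OP_deref step: replace the top of STACK (which
   holds SP live entries) with the pointer-sized value stored at that
   address.  With READ_MEM == NULL, which is the situation at symbol
   reading time, it simply can't be done.  */
int
op_deref (uint64_t *stack, size_t sp, read_mem_fn read_mem, void *ctx)
{
  uint64_t value;

  if (sp == 0 || read_mem == NULL)
    return -1;
  if (read_mem (ctx, stack[sp - 1], &value) != 0)
    return -1;
  stack[sp - 1] = value;
  return 0;
}

/* Stand-in for target memory, for demonstration only: CTX points to
   an array of 4 words that pretends to live at addresses 0, 8, 16
   and 24.  */
int
fake_read_mem (void *ctx, uint64_t addr, uint64_t *value)
{
  const uint64_t *words = ctx;

  if (addr % 8 != 0 || addr / 8 >= 4)
    return -1;
  *value = words[addr / 8];
  return 0;
}
```

In a debugger the callback would be whatever reads memory from the
running inferior; the array-backed version only exists so the step can
be tried out without one.
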

Keep in mind, the above is just a single location expression.
Location lists allow you to fully describe the lifetime of even a
variable in optimized code, by associating expressions with ranges of
addresses.
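
Picking the right expression out of a location list is a range lookup
on the pc. A minimal sketch: struct loclist_entry and find_locexpr are
hypothetical names, the addresses are assumed to be absolute already,
and each range is half-open (the end address is the first one past the
range), which is how DWARF defines location list entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one location list entry.  */
struct loclist_entry
{
  uint64_t low_pc;              /* first address where EXPR is valid */
  uint64_t high_pc;             /* first address past the valid range */
  const unsigned char *expr;    /* location expression bytes */
  size_t expr_len;
};

/* Return the entry whose range contains PC, or NULL if the list gives
   the variable no location at PC.  */
const struct loclist_entry *
find_locexpr (const struct loclist_entry *list, size_t n, uint64_t pc)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (pc >= list[i].low_pc && pc < list[i].high_pc)
      return &list[i];
  return NULL;
}
```

The expression found this way is then evaluated like any single
location expression.
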

This code evaluates dwarf2 location expressions to get the location of
a variable, if a symbol has one.  This is used in preference to using the
less expressive way gdb currently supports.

All I've done is add a new way to describe the location of a variable,
and the associated way to evaluate that description.

So the code doesn't belong in dwarf2read.c.  dwarf2read.c is a symbol
reader. dwarf2 location expressions are a way of describing variable
locations.  The other code to process the current gdb address
classes/locations is in findvar.c, so that's where I put it.

> 	    (read_var_value): Use the dwarf2 location expression if it's available.
> How do you know that the location expression is NULL when you haven't
> set it -- in particular, what's to prevent it from having garbage in
> it?
Because the symbol gets memset to 0.
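
So the check is just a NULL test on a zero-initialized field. A sketch
of the idea: the struct layout and the names toy_symbol, locdesc,
toy_symbol_init and pick_location_method are invented for illustration
and are not the ones in symtab.h. It relies on all-zero bytes reading
back as a NULL pointer, which is an assumption about the host, though
one that holds on common platforms.

```c
#include <stddef.h>
#include <string.h>

enum location_method { USE_ADDRESS_CLASS, USE_DWARF2_EXPRESSION };

/* Invented stand-in for the relevant part of a symbol.  */
struct toy_symbol
{
  int aclass;                       /* the old-style address class */
  const unsigned char *locdesc;     /* location expression, or NULL */
};

/* Zero a freshly allocated symbol, mirroring the memset mentioned
   above.  */
void
toy_symbol_init (struct toy_symbol *sym)
{
  memset (sym, 0, sizeof *sym);
}

/* A reader that never sets LOCDESC leaves it NULL, so its symbols keep
   getting the old behaviour.  */
enum location_method
pick_location_method (const struct toy_symbol *sym)
{
  return sym->locdesc != NULL ? USE_DWARF2_EXPRESSION : USE_ADDRESS_CLASS;
}
```
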

> And is it likely that some other symbol reader would someday want a
> similar hook?
It's not a hook really.
The dwarf2 reader never calls it; it has no relation to symbol reading.

It is related to describing the location of a variable at runtime.

>  For example, what about dwarf1?
>   Or is dwarf2 likely to be the only one?

No. DWARF2 is the only symbol format I know of which has support for
something as expressive as location expressions/lists.

GCC uses DWARF2 call frame annotation (which is location lists paired
with registers to tell you where a given register is at a given pc) to
do its exceptions, for instance.

If we ever want to be able to use the unwind info or call frame info
to be able to do optimized code debugging without much trouble, we
need to be able to evaluate the expressions.

If it helps, pretend the word DWARF2 appears nowhere in the
patch. There is no relation to the symbol reader.

I could convert the stabs reader to use this way of describing
locations, there's just no point (since stabs's locations aren't more
expressive than what gdb supports).

I play the harmonica.  The only way I can play is if I get my
car going really fast, and stick it out the window.  I've been
arrested three times for practicing.
