This is the mail archive of the frysk@sourceware.org mailing list for the frysk project.



Re: Dwarf/libdw question


Hi Sami.  Please use more specific Subject lines in your postings.
Reading the list archives' index will not be very informative to
someone looking years from now for discussion on this particular topic.

> I am working on implementing c++ scoping rules in frysk. Is there 
> elfutils API that I can use to figure out what class/struct a function 
> belongs to, so that references to member variables  can be resolved.

The key is DW_AT_specification.  Let's take an example:

	class c
	{
	  int m1() { return 17; }
	  int m2();
	public:
	  int m() { return m1() + m2(); }
	};

	int c::m2() { return 23; }

	int main()
	{
	  c x;
	  return x.m();
	}

The DIE tree for this is (explanations below):

	 [     b]  compile_unit
		   macro_info           0
		   stmt_list            0
		   producer             "GNU C++ 4.1.2 20070502 (Red Hat 4.1.2-12)"
		   language             C++ (4)
		   name                 "s.cxx"
		   comp_dir             "/home/roland/build/stock-elfutils"
	 [    67]    structure_type
		     sibling              [    d4]
		     name                 "c"
		     byte_size            1
		     decl_file            1
		     decl_line            2
	 [    71]      subprogram
		       sibling              [    94]
		       external             
		       name                 "m1"
		       decl_file            1
		       decl_line            3
		       MIPS_linkage_name    "_ZN1c2m1Ev"
		       type                 [    d4]
		       accessibility        private (3)
		       declaration          
	 [    8d]        formal_parameter
			 type                 [    db]
			 artificial           
	 [    94]      subprogram
		       sibling              [    b7]
		       external             
		       name                 "m2"
		       decl_file            1
		       decl_line            4
		       MIPS_linkage_name    "_ZN1c2m2Ev"
		       type                 [    d4]
		       accessibility        private (3)
		       declaration          
	 [    b0]        formal_parameter
			 type                 [    db]
			 artificial           
	 [    b7]      subprogram
		       external             
		       name                 "m"
		       decl_file            1
		       decl_line            6
		       MIPS_linkage_name    "_ZN1c1mEv"
		       type                 [    d4]
		       declaration          
	 [    cc]        formal_parameter
			 type                 [    db]
			 artificial           
	 [    d4]    base_type
		     name                 "int"
		     byte_size            4
		     encoding             signed (5)
	 [    db]    pointer_type
		     byte_size            8
		     type                 [    67]
	 [    e1]    subprogram
		     sibling              [   10d]
		     specification        [    71]
		     low_pc               0x000000000040054c
		     high_pc              0x000000000040055b
		     frame_base           location list [     0]
	 [    fe]      formal_parameter
		       name                 "this"
		       type                 [   10d]
		       artificial           
		       location             2 byte block
			[   0] fbreg -24
	 [   10d]    const_type
		     type                 [    db]
	 [   112]    subprogram
		     sibling              [   13f]
		     specification        [    94]
		     decl_line            9
		     low_pc               0x0000000000400528
		     high_pc              0x0000000000400537
		     frame_base           location list [    4c]
	 [   130]      formal_parameter
		       name                 "this"
		       type                 [   10d]
		       artificial           
		       location             2 byte block
			[   0] fbreg -24
	 [   13f]    subprogram
		     sibling              [   16b]
		     specification        [    b7]
		     low_pc               0x000000000040055c
		     high_pc              0x0000000000400587
		     frame_base           location list [    98]
	 [   15c]      formal_parameter
		       name                 "this"
		       type                 [   10d]
		       artificial           
		       location             2 byte block
			[   0] fbreg -32
	 [   16b]    subprogram
		     external             
		     name                 "main"
		     decl_file            1
		     decl_line            11
		     type                 [    d4]
		     low_pc               0x0000000000400538
		     high_pc              0x000000000040054b
		     frame_base           location list [    e4]
	 [   18c]      variable
		       name                 "x"
		       decl_file            1
		       decl_line            13
		       type                 [    67]
		       location             2 byte block
			[   0] fbreg -17

Note that the subprogram DIEs describing actual machine code are
top-level children of the CU.  Here these are [e1], [112], [13f].  They
are not children of [67], the structure_type DIE describing the class.
This is sensible enough because these are global function definitions,
even if they have names and types with scope limited to the class.

Consider [112].  This has the attributes and children that refer to its
machine code (low_pc, high_pc, frame_base, formal_parameter).  Note it
does not have the attributes like name and type.  Instead, it has a
specification attribute that points to [94].  specification is
analogous to abstract_origin, but rather than linking a concrete code
element to an abstract inline definition, it links a concrete code
element to an abstract declaration.  So, [112] is the code for "m2",
and [94] is the specification for "m2".

dwarf_attr_integrate checks for specification as well as abstract_origin.
So, in the common cases of fetching attributes, you just don't have to
think about it.  dwarf_diename uses dwarf_attr_integrate, so you will
see a name without extra effort even if it's indirect.
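
To make the indirection concrete, here is a minimal sketch in plain C.
It is a toy model and not libdw itself: struct toy_die and
toy_name_integrate are hypothetical names invented for illustration, and
the order in which the two links are tried is just a choice made here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a DIE (hypothetical; real code would use Dwarf_Die). */
struct toy_die {
    unsigned offset;                        /* DIE offset, e.g. 0x112 */
    const char *name;                       /* DW_AT_name, NULL if absent */
    const struct toy_die *specification;    /* DW_AT_specification target */
    const struct toy_die *abstract_origin;  /* DW_AT_abstract_origin target */
};

/* Return the name of DIE, following abstract_origin/specification links
   when the DIE itself carries no name attribute. */
const char *toy_name_integrate(const struct toy_die *die)
{
    assert(die != NULL);
    /* The hop limit guards against a malformed, cyclic reference chain. */
    for (int hops = 0; die != NULL && hops < 16; hops++) {
        if (die->name != NULL)
            return die->name;
        die = die->abstract_origin != NULL ? die->abstract_origin
                                           : die->specification;
    }
    return NULL;
}
```

With [112] modelled as a nameless DIE whose specification is [94], the
lookup yields the "m2" stored on [94], which is the behaviour described
above for dwarf_diename.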

I used [112] as the example because m2 is defined outside the class
definition.  As you can see, GCC does the same thing for m1 [e1] and m
[13f], though those definitions actually appear lexically inside the
class.  Reading the DWARF spec one would expect these cases to use a
single DIE inside the class and not use DW_AT_specification at all.  I
don't know if there is a particular reason GCC doesn't do that, and I
see no big benefit in changing what it does.  But I think that DWARF
consumers should expect that either style might be used and work the
same with either.

Note how [112] has a decl_line attribute but no decl_file, while [e1]
and [13f] have neither.  This is an example of the general rule with
specification (and abstract_origin): an attribute is elided from the
referring DIE when its value is no different from the one in the DIE
it refers to.  Since m2's body was defined outside the class, [112]
refers to line 9.
If the class declaration were in a header file and the method definition
in another file, there would also be a decl_file attribute.  (If
everything were all on one line and the compiler emitted column
information, there would be a decl_column but no decl_line.  The
compiler does not yet emit decl_column attributes, but we should write
consumers as if it did.)  Since [e1] and [13f] describe bodies defined
in their selfsame specification declarations, they would never have a
decl_{file,line,column} of their own.
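
A consumer that wants the source coordinates of a definition can resolve
each of the three attributes independently.  The following is only a
sketch under assumptions: struct toy_decl, struct toy_die and
toy_decl_coords are hypothetical, 0 is used to mean "attribute absent",
and only one level of specification is followed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: 0 means the attribute is absent (an assumption of this sketch). */
struct toy_decl {
    int file;    /* DW_AT_decl_file */
    int line;    /* DW_AT_decl_line */
    int column;  /* DW_AT_decl_column */
};

struct toy_die {
    struct toy_decl decl;
    const struct toy_die *specification;  /* NULL if none */
};

/* Each coordinate is taken from the definition when present there,
   otherwise inherited from the declaration it specifies. */
struct toy_decl toy_decl_coords(const struct toy_die *die)
{
    assert(die != NULL);
    struct toy_decl out = die->decl;
    const struct toy_die *spec = die->specification;
    if (spec != NULL) {
        if (out.file == 0)
            out.file = spec->decl.file;
        if (out.line == 0)
            out.line = spec->decl.line;
        if (out.column == 0)
            out.column = spec->decl.column;
    }
    return out;
}
```

For [112] this gives file 1 from [94] and line 9 from [112] itself; for
[e1] both file and line come from [71].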

So now I've told you the basics to work with, but not actually answered
your question.  There are two parts to resolving class members.

First, the name resolution per se.  To start with, there are the scopes
inside a subprogram DIE, same as in C.  When you are dealing with a class method,
the subprogram's specification attribute gives you the declaration
inside the class scope (use dwarf_formref_die (dwarf_attr (...))).  Then
use dwarf_getscopes_die on that to see the class, namespace, etc. scopes
containing it.  For each of those, see if they have DW_TAG_inheritance,
DW_TAG_imported_declaration, etc. children that contribute more scopes
to the name resolution logic for the language.  Among those you find a
member, variable, subprogram, etc. DIE by the name you are looking for.
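
The lookup order just described can be sketched over a toy DIE tree.
Everything here is hypothetical (struct toy_die, the TOY_* tags,
toy_resolve); parent pointers stand in for what dwarf_getscopes_die
would report, nested lexical blocks and imported declarations are
ignored, and real language rules (hiding, ambiguity) are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_tag { TOY_COMPILE_UNIT, TOY_STRUCTURE_TYPE, TOY_SUBPROGRAM,
               TOY_MEMBER, TOY_VARIABLE, TOY_INHERITANCE };

struct toy_die {
    enum toy_tag tag;
    const char *name;                       /* NULL if absent */
    const struct toy_die *parent;           /* enclosing scope */
    const struct toy_die *const *children;  /* NULL-terminated, or NULL */
    const struct toy_die *specification;    /* declaration inside the class */
    const struct toy_die *type;             /* TOY_INHERITANCE: the base class */
};

/* Search a scope's direct children by name, then its base classes. */
static const struct toy_die *lookup_in_scope(const struct toy_die *scope,
                                             const char *name, int depth)
{
    if (scope->children == NULL || depth > 16)
        return NULL;
    for (size_t i = 0; scope->children[i] != NULL; i++) {
        const struct toy_die *c = scope->children[i];
        if (c->tag != TOY_INHERITANCE && c->name != NULL
            && strcmp(c->name, name) == 0)
            return c;
    }
    for (size_t i = 0; scope->children[i] != NULL; i++) {
        const struct toy_die *c = scope->children[i];
        if (c->tag == TOY_INHERITANCE && c->type != NULL) {
            const struct toy_die *found =
                lookup_in_scope(c->type, name, depth + 1);
            if (found != NULL)
                return found;
        }
    }
    return NULL;
}

/* Resolve NAME as seen from inside SUBPROGRAM (the concrete code DIE). */
const struct toy_die *toy_resolve(const struct toy_die *subprogram,
                                  const char *name)
{
    assert(subprogram != NULL && name != NULL);
    /* 1. Scopes inside the subprogram itself, same as in C. */
    const struct toy_die *found = lookup_in_scope(subprogram, name, 0);
    if (found != NULL)
        return found;
    /* 2. For a method, continue from the declaration's class scope. */
    const struct toy_die *scope = subprogram->specification != NULL
        ? subprogram->specification->parent : subprogram->parent;
    /* 3. Walk outward through class, namespace, CU scopes. */
    for (; scope != NULL; scope = scope->parent) {
        found = lookup_in_scope(scope, name, 0);
        if (found != NULL)
            return found;
    }
    return NULL;
}
```

The design point is step 2: the concrete DIE's own parent is the CU, so
without hopping through specification the class scope would never be
searched.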

If you found a static member (aka class variable), i.e. DW_TAG_variable,
you are done.  It gets treated just like other variable DIEs.

If you found a class member (aka instance variable), i.e. DW_TAG_member,
then it depends on how you plan to use it.  In the context of a pointer
to member (as "mem" in "type cl::*p = &cl::mem;"), you are done.
The DW_AT_data_member_location tells you what value to use.

In a static method (aka class method), referring to a regular class
member (instance variable) is invalid.

In an instance method, "mem" is resolved the same as "this->mem".  The
subprogram DIE for the method definition contains an automatically-inserted
first formal_parameter DIE, with the artificial attribute and named "this".
AFAICT, the only way to distinguish a static method from an instance method
in the DWARF tree is the presence of this first artificial formal_parameter.
(Though in practice it always has the name attribute of "this", I would
write it to detect a first formal_parameter with artificial rather than
looking at the name.)  This formal_parameter is like any other aside from
being artificial, so you combine its location attribute with the PC context
you're looking from, and data_member_location attribute of the member DIE
to find the member in the object from that PC context.
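
As a rough sketch of that combination: assume the frame base has already
been evaluated for the current PC, that the location of "this" is a
single fbreg operation (as in the dump above), that pointers are 8 bytes
in host byte order, and that data_member_location is a plain constant
offset (it can also be a location expression).  struct toy_mem,
read_u64 and toy_member_address are hypothetical names; the buffer
stands in for reading the inferior's memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy "inferior memory": a flat buffer whose first byte has address base. */
struct toy_mem {
    uint64_t base;
    const unsigned char *bytes;
    size_t size;
};

/* Read an 8-byte value at ADDR, checking that it lies inside the buffer. */
static uint64_t read_u64(const struct toy_mem *m, uint64_t addr)
{
    assert(addr >= m->base && addr - m->base + sizeof(uint64_t) <= m->size);
    uint64_t value;
    memcpy(&value, m->bytes + (addr - m->base), sizeof value);
    return value;
}

/* Address of an instance member as seen from inside a method.
   frame_base:    value of the subprogram's frame_base at the current PC
   this_fbreg:    offset from the "this" parameter's fbreg location
   member_offset: constant data_member_location of the member DIE */
uint64_t toy_member_address(const struct toy_mem *m, uint64_t frame_base,
                            int64_t this_fbreg, uint64_t member_offset)
{
    assert(m != NULL);
    uint64_t this_slot = frame_base + (uint64_t)this_fbreg;  /* where "this" is stored */
    uint64_t this_value = read_u64(m, this_slot);            /* the object's address */
    return this_value + member_offset;
}
```

For [112], this_fbreg would be -24; a general consumer would evaluate
the full location expression rather than special-casing fbreg.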

When the name resolved to a subprogram DIE, you have to do two things to
see how to treat it.  First, if the DIE has DW_AT_declaration, then you
have to find the concrete code DIE whose DW_AT_specification points to it.
Then, you have to check (as above) whether it's a static method or an
instance method, so you know what "name(foo)" is supposed to mean if a user
gave that as a call.
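
Both steps can be sketched over a toy tree.  The names below
(struct toy_die, toy_find_definition, toy_is_instance_method) are
hypothetical, and the search only scans the direct children of the CU,
which matches the dump above but is an assumption in general.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_tag { TOY_SUBPROGRAM, TOY_FORMAL_PARAMETER, TOY_OTHER };

struct toy_die {
    enum toy_tag tag;
    bool declaration;                       /* DW_AT_declaration present */
    bool artificial;                        /* DW_AT_artificial present */
    const struct toy_die *specification;    /* DW_AT_specification target */
    const struct toy_die *const *children;  /* NULL-terminated, or NULL */
};

/* Find the concrete code DIE for DECL: either DECL itself, or the CU-level
   subprogram whose specification points back at it. */
const struct toy_die *toy_find_definition(const struct toy_die *cu,
                                          const struct toy_die *decl)
{
    assert(cu != NULL && decl != NULL);
    if (!decl->declaration)
        return decl;                        /* already the concrete DIE */
    if (cu->children == NULL)
        return NULL;
    for (size_t i = 0; cu->children[i] != NULL; i++) {
        const struct toy_die *c = cu->children[i];
        if (c->tag == TOY_SUBPROGRAM && c->specification == decl)
            return c;
    }
    return NULL;
}

/* Instance method: the first formal_parameter child is artificial.
   The parameter's name is deliberately not consulted. */
bool toy_is_instance_method(const struct toy_die *subprogram)
{
    assert(subprogram != NULL);
    if (subprogram->children == NULL)
        return false;
    for (size_t i = 0; subprogram->children[i] != NULL; i++) {
        const struct toy_die *c = subprogram->children[i];
        if (c->tag == TOY_FORMAL_PARAMETER)
            return c->artificial;
    }
    return false;
}
```

A linear scan per lookup is fine for a sketch; a real consumer would
more likely build a declaration-to-definition map once per CU.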


Thanks,
Roland

