This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.



relocate branch


Over the last several days I've implemented most of the libdw lazy
relocation handling.  This is the rest of the plan for which Petr
implemented all the low-level reader hooks a while ago.

This work is on the roland/relocate branch.  I don't expect to merge this
branch in very quickly, probably not until we are ready to start using it
in the dwarf branch C++ stuff.

The core of it is that libdw itself groks the relocation sections in ET_REL
files.  It only handles the basic 4 and 8 byte absolute relocation types,
the same ones that libdwfl understands today.  These are all that ever
ought to appear in DWARF sections (or any other unallocated section).
This probably covers all the relocs in .eh_frame for non-PIC code too
(i.e. kernel modules, should those one day get .eh_frame again).

The internal workings are pretty lazy on multiple levels.  (We hope this
means that things like just loading up 2000 kernel modules into a Dwfl
before you have looked at them will get much faster.  We'll see.)  At
dwarf_begin_elf time, all we do is notice that there are relocation
sections applying to our DWARF sections, and cache those Elf_Scn pointers.
The first time we need to look for a relocation in a particular DWARF
section, we digest the corresponding relocation section into internal form;
this just reads all the relocation records and looks at their types, so it
should be pretty quick.  Thereafter, the digested relocations exist as two
parallel sorted arrays of pointers into the DWARF section data, one for
4-byte relocs and one for 8-byte relocs.  Whenever we need to look up a
value that can be relocatable (__libdw_read_{address,offset}*), we do a
binary search (with a hint that should short-cut for sequential access
patterns) on that array.
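
To make the lookup concrete, here is a minimal sketch of a binary search
with a sequential-access hint.  It is not the code on the branch: the
struct and function names are hypothetical, and it stores byte offsets
into the section where the description above says pointers into the
section data, purely to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical digested form for one reloc width (4 or 8 bytes):
   the sorted positions in the DWARF section that have a relocation.  */
struct reloc_index
{
  const size_t *offsets;   /* Sorted ascending.  */
  size_t count;
  size_t hint;             /* Index to try first on the next lookup.  */
};

/* Return true and set *IDX if a relocation applies at OFFSET.  */
static bool
reloc_lookup (struct reloc_index *ri, size_t offset, size_t *idx)
{
  assert (ri != NULL && idx != NULL);

  /* Short-cut: a sequential reader usually wants the hinted entry.  */
  if (ri->hint < ri->count && ri->offsets[ri->hint] == offset)
    {
      *idx = ri->hint++;
      return true;
    }

  /* Otherwise find the first entry that is not below OFFSET.  */
  size_t lo = 0, hi = ri->count;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (ri->offsets[mid] < offset)
        lo = mid + 1;
      else
        hi = mid;
    }

  if (lo < ri->count && ri->offsets[lo] == offset)
    {
      *idx = lo;
      ri->hint = lo + 1;
      return true;
    }

  /* No reloc here; the next one past OFFSET is the likely next hit.  */
  ri->hint = lo;
  return false;
}
```

A reader walking the section front to back hits the hint on every
relocated value and never enters the loop; only random access pays for
the full search.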

For the internal uses (__libdw_formptr), relocations are always to defined
symbols (usually just section relative) in a .debug_* section, so we can
always resolve those.  For files using REL instead of RELA (default on
i386), section-relative relocs (the normal case) usually don't actually
require any relocation work, because the symbol adds nothing (implicit
sh_addr is 0, st_value is 0 in a section symbol) and the reloc addend is
already in place in the data--so it's as if there were no reloc there at
all, and we don't even record it when we digest those relocs.
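
As a rough illustration of that filtering rule, the digest step could
decide per reloc whether it is worth remembering.  The sketch below is
only my reading of the paragraph above, with hypothetical names; the
real digesting code works on ELF reloc and symbol records rather than
on a pre-cooked struct like this one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the symbol a reloc refers to.  */
struct sk_target
{
  bool section_symbol;   /* Symbol type is STT_SECTION.  */
  bool allocated;        /* The symbol's section has SHF_ALLOC set.  */
  uint64_t st_value;     /* 0 for a section symbol.  */
};

/* Return true if the digest step must remember this reloc.  */
static bool
sk_must_record (bool is_rela, const struct sk_target *t)
{
  assert (t != NULL);

  if (is_rela)
    return true;   /* The addend lives in the reloc, not in the data.  */
  if (t->allocated)
    return true;   /* A real address, which needs resolving later.  */

  /* REL against a nonallocated section symbol: sh_addr is implicitly 0
     and st_value is 0, so the in-place data is already the answer.  */
  return !(t->section_symbol && t->st_value == 0);
}
```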

The interesting relocs are the ones for actual addresses rather than just
DWARF offsets, i.e. whose target symbols are in SHF_ALLOC sections.  The
ways these get used are twofold.  

Everything done via the existing interfaces has to resolve to a final
address value while reading.  Those all go through __libdw_read_address.
If those have relocs, they might be to invariant symbols (e.g. SHN_ABS
ones), whose values are already final.  If not, they need to be
resolved.  In the default case using plain libdw,
this just fails with a new error "value requires relocation".  When using
libdwfl, this will (eventually) use the libdwfl symbol resolution code,
which looks up in module symbol tables and uses the Dwfl section_address
callback.  So to the existing libdwfl user, it will look the same, except
that section_address callbacks will be made on demand during dwarf_* calls
rather than all at once at dwfl_module_getdwarf time.  I have not started
on updating libdwfl to tie into this yet.
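
The split between plain libdw and libdwfl can be pictured as an optional
resolver hook.  This is a sketch under my own assumptions, not the
branch's internals: every name here (sk_reader, sk_pending,
sk_read_address, sk_table_resolver) is made up for illustration, and
the only thing taken from the description above is the behavior, i.e.
fail with "value requires relocation" unless something can resolve the
symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum sk_error { SK_OK, SK_E_RELOC /* "value requires relocation" */ };

/* Hypothetical hook: store the final address of symbol SYMNDX in
   *VALUE, or return false if it cannot be resolved.  */
typedef bool sk_resolver (void *arg, uint32_t symndx, uint64_t *value);

struct sk_reader
{
  sk_resolver *resolve;   /* NULL for plain libdw-style use.  */
  void *arg;
};

struct sk_pending
{
  bool has_reloc;       /* A relocation applies at this location.  */
  bool absolute;        /* Symbol is SHN_ABS: its value is final.  */
  uint64_t abs_value;   /* st_value, used when ABSOLUTE is set.  */
  uint32_t symndx;
  int64_t addend;       /* r_addend, or the in-place data.  */
};

static enum sk_error
sk_read_address (const struct sk_reader *rd, const struct sk_pending *p,
                 uint64_t *result)
{
  assert (rd != NULL && p != NULL && result != NULL);

  uint64_t base = 0;
  if (p->has_reloc)
    {
      if (p->absolute)
        base = p->abs_value;
      else if (rd->resolve == NULL
               || !rd->resolve (rd->arg, p->symndx, &base))
        return SK_E_RELOC;
    }

  *result = base + (uint64_t) p->addend;
  return SK_OK;
}

/* Toy resolver for illustration: ARG is a table of symbol addresses.  */
static bool
sk_table_resolver (void *arg, uint32_t symndx, uint64_t *value)
{
  *value = ((const uint64_t *) arg)[symndx];
  return true;
}
```

With resolve left NULL, only unrelocated and absolute values can be
read; installing a resolver makes the same read succeed on demand,
which is the on-demand behavior an existing libdwfl user would see.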

The more interesting case is reloc-aware libdw users.  This is what really
motivates this work now.  The DWARF reader for ET_REL files has to be
reloc-aware and feed that knowledge into the DWARF writer to produce its
own relocs in the DWARF output.  We'll need that for DWARF compression to
apply to Linux kernel modules.

The C interface for reloc-aware libdw revolves around Dwarf_Relocatable.
This is a new "opaque" but user-allocated struct.  It is much like
Dwarf_Attribute: it just points to the data that needs to be decoded (and
possibly relocated).  Various new calls return these (i.e. fill in a
user-supplied Dwarf_Relocatable) to stand in for Dwarf_Addr.  

dwarf_form_relocatable corresponds to dwarf_formaddr.  It just converts a
Dwarf_Attribute into a Dwarf_Relocatable that you can use later.

dwarf_line_relocatable corresponds to dwarf_lineaddr.  It fetches the PC
from a Dwarf_Line as a Dwarf_Relocatable.

dwarf_ranges_relocatable corresponds to dwarf_ranges.  It iterates over PC
ranges where each bound is a Dwarf_Relocatable.

dwarf_haspc_relocatable corresponds to dwarf_haspc, and is a convenience
function built on dwarf_ranges_relocatable as dwarf_haspc is built on
dwarf_ranges.  It takes the PC to match as an opaque Dwarf_Relocatable, and
there is not yet any other way to directly compare those to each other.  It
matches a relocatable address in a relocatable range if they are relocated
relative to the same section, and their offsets overlap within the section.
e.g., you might pass a Dwarf_Relocatable from dwarf_line_relocatable to
dwarf_haspc_relocatable if doing a "which scope is this source location
in?" check.
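
The matching rule can be sketched as a comparison of "section plus
offset" pairs.  The names below are hypothetical, and the half-open
[low, high) range is my assumption, following the usual DWARF
convention for PC ranges; the actual dwarf_haspc_relocatable works on
opaque Dwarf_Relocatable values and does the reloc lookup itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical resolved form of a relocatable address: the section
   its symbol lives in, and st_value + addend within that section.  */
struct sk_secaddr
{
  uint16_t shndx;
  uint64_t offset;
};

/* Does PC fall in [LOW, HIGH)?  Addresses relative to different
   sections are not comparable, so they never match.  */
static bool
sk_range_has_pc (const struct sk_secaddr *low,
                 const struct sk_secaddr *high,
                 const struct sk_secaddr *pc)
{
  assert (low != NULL && high != NULL && pc != NULL);

  if (low->shndx != high->shndx || low->shndx != pc->shndx)
    return false;
  return low->offset <= pc->offset && pc->offset < high->offset;
}
```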

dwarf_getlocation_relocatable corresponds roughly to dwarf_getlocation or
dwarf_getlocation_addr.  But since a "lookup" doesn't necessarily make
sense for unrelocated addresses, the interface is more like dwarf_ranges.
It is an iterator over the location list, yielding the range (as a pair of
Dwarf_Relocatable) and the location expression for that range.

We could have a lookup version that works akin to dwarf_haspc_relocatable.
But we'll see what comes up as needed for that.  The iterators are what we
need to feed the writer side for compression.  (dwarf_haspc_relocatable is
not needed for that, but was easy and obvious.)

So what is a Dwarf_Relocatable?  It's very similar to a Dwarf_Attribute:
it's just a pointer to undecoded DWARF data.  All these calls yielding a
Dwarf_Relocatable do not actually consider relocation at all.  They even
work just the same on non-ET_REL objects that will never require relocation
handling.  The actual search for a relocation record is not done until you
call dwarf_relocatable_info.  (dwarf_haspc_relocatable does this inside.)

What the Dwarf_Relocatable represents is the information that a relocation
record has: a symbol table entry and an "addend" (i.e. offset from the
address indicated by the symbol).  In RELA relocs, this is the r_addend
field in the reloc.  In REL relocs, the addend is implicitly stored in the
section data itself.  When there is no relocation at all, it has the same
effect as a REL reloc with no symbol: i.e., the whole address is just the
"addend" stored in the data.  

An actual reloc can in some cases really have "no symbol"--the
GELF_R_SYM(r_info) is zero (aka STN_UNDEF) rather than a proper symtab
index.  eu-strip produces relocs like this from useless section-relative
relocs for nonallocated sections.  How we treat these is what I really
meant by "no symbol": as if it were a symtab entry with st_shndx of SHN_ABS
and st_value of 0.  (You might also have a real symtab entry like that,
though they are useless.)
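
Put together, the three cases (RELA, REL, and no reloc or no symbol)
normalize to the same "symbol plus addend" pair.  Here is a small sketch
of that normalization.  The sk_ types are stand-ins I made up so the
example compiles on its own; they are not GElf_Sym or the real
Dwarf_Relocatable, and only the two constants carry the actual ELF
values of STN_UNDEF and SHN_ABS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SK_STN_UNDEF 0u        /* Same value as ELF STN_UNDEF.  */
#define SK_SHN_ABS   0xfff1u   /* Same value as ELF SHN_ABS.  */

struct sk_sym { uint64_t st_value; uint16_t st_shndx; };

struct sk_reloc
{
  uint32_t symndx;     /* Symbol table index from r_info.  */
  bool has_addend;     /* True for RELA, false for REL.  */
  int64_t r_addend;    /* Only meaningful for RELA.  */
};

struct sk_info { struct sk_sym sym; int64_t addend; };

/* RELOC is NULL when no relocation applies at this location.
   IN_PLACE is the value decoded from the section data itself.  */
static struct sk_info
sk_relocatable_info (const struct sk_reloc *reloc,
                     const struct sk_sym *symtab, int64_t in_place)
{
  assert (reloc == NULL || symtab != NULL);

  struct sk_info info;
  if (reloc == NULL || reloc->symndx == SK_STN_UNDEF)
    {
      /* "No symbol": act as if it were an SHN_ABS symbol of value 0.  */
      info.sym.st_value = 0;
      info.sym.st_shndx = SK_SHN_ABS;
    }
  else
    info.sym = symtab[reloc->symndx];

  info.addend = (reloc != NULL && reloc->has_addend
                 ? reloc->r_addend : in_place);
  return info;
}
```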

The "addend" inside a Dwarf_Relocatable is not always just the r_addend
from an actual ELF reloc record (or equivalent in-place value for REL).
In some of the DWARF formats addresses are encoded as relative to other
addresses in the DWARF.  In .debug_line data, only the
DW_LNE_set_address operations set a relocatable address, and other
operations adjust the PC relative to that.  So, a Dwarf_Relocatable from
dwarf_line_relocatable will indicate the symbol from an ELF reloc, with
the addend from the reloc plus the additional addend from DW_LN*
adjustments.  Similarly, in .debug_ranges and .debug_loc, addresses are
added to a base address coming from an earlier record or from a CU's
DW_AT_low_pc attribute.  In those cases, we only permit one of the two
addresses being added together to be relocatable.  (The only reason for
these relative-address format details is so that many DWARF items can be
encoded as relative to one DWARF datum encoded with an ELF reloc.)  So
in the Dwarf_Relocatable for these, the addend is (e.g.) that of the
reloc for the relocatable base address, plus the value of the individual
range list entry.
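
The "only one of the two may be relocatable" rule is easy to state in
code.  This is a sketch with hypothetical names, not what the branch
does internally: it combines a base address with a range list entry (or
equally a DW_LNE_set_address value with a later PC advance) into one
symbol-plus-addend result, and refuses the case where both sides carry
a reloc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded address: an addend, optionally relative to
   the symbol with index SYMNDX.  */
struct sk_addr
{
  bool relocatable;
  uint32_t symndx;    /* Ignored unless RELOCATABLE is set.  */
  int64_t addend;
};

/* Add ENTRY to BASE.  Returns false if both are relocatable, since
   the sum could not be expressed as one symbol plus one addend.  */
static bool
sk_addr_add (const struct sk_addr *base, const struct sk_addr *entry,
             struct sk_addr *out)
{
  assert (base != NULL && entry != NULL && out != NULL);

  if (base->relocatable && entry->relocatable)
    return false;

  const struct sk_addr *r = base->relocatable ? base : entry;
  out->relocatable = r->relocatable;
  out->symndx = r->relocatable ? r->symndx : 0;
  out->addend = base->addend + entry->addend;
  return true;
}
```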

The dwarf_relocatable_info call extracts the symbol table entry in handy
form, and tells you the addend.  It gives you the GElf_Sym, but it
doesn't actually tell you the symtab index if you wanted to construct an
ELF reloc record exactly matching or something like that.  I imagine
that in the C++ interface this will be exposed as a pair of addend and
iterator into an STL-style container for the ELF symbol table
(std::vector work-alike).

Note that for the libdwfl world, you need to know which file that symbol
table resides in to make sense of its section indices.  I haven't worked
out how best to gloss that yet.

So, what's not yet done?

Firstly, the libdwfl integration is not done at all.  This is the
presumed reason for the one 'make check' regression the branch has now.
I haven't investigated that yet.  The first task will be just to
rejigger the libdwfl innards so that they don't do the existing
__libdwfl_relocate work beforehand.  Instead, libdwfl will poke pointers
into the libdw internals so that __libdw_read_address hooks call the
libdwfl symbol resolver.

In libdw proper, there are still some __libdw_read_address cases that
don't have relocation-aware parallel interfaces.

The .debug_aranges support (dwarf_getaranges et al) has no reloc-aware
variant.  I'm not sure if we should bother with one.  I'll have to
figure out if libdwfl can make dwarf_addrdie work without one.  The
libdwfl uses are the only ones where .debug_aranges in an ET_REL really
means anything.  The writer will always produce .debug_aranges from
whole cloth and not even want to read the incoming .debug_aranges.

CFI support needs to be reloc-aware at some point.
I'll talk more about CFI-related work separately.
I've been doing some hacking that's related.

The other place still missing support is DWARF expressions.  We have
dwarf_getlocation_relocatable for handling relocatability in the
location lists, but that just means the list address ranges.  There is
also DW_OP_addr inside an expression block, which can use an address
with a relocation record.  (Possibly we should let DW_OP_const[48][su]
have a reloc too.  We don't use __libdw_read_address for those as yet.)

For that, we probably need a look-aside interface along the lines of
dwarf_getlocation_implicit_value.  I may leave that until the C++ side
grows the code that would look inside expressions on the reader, which
will be needed for compression not to break relocatable DW_OP_addr's.
(It's also needed for DW_OP_call* that refer to the DIE tree, but those
don't seem to appear in nature.)

And, of course, it's all only barely tested at best.
The new interfaces are not tested at all.

I'll probably work next a bit more on CFI since I've started hacking on
it.  (I'll post separately about that.  But CFI handling is not a target
for compression in the near to medium term.)  Then I'll try to tackle
libdwfl so that the branch doesn't regress for existing libdwfl users.
That will let me test the branch with stap, and maybe even find out if
it actually helped with speed/VM-load.


Thanks,
Roland
