This is the mail archive of the mailing list for the Archer project.

Improved linker-debugger interface

Hi all,

These past few weeks I've been working on the following bug:
  "_dl_debug_state() RT_CONSISTENT called too early"

and trying to figure out a fix that also addresses this:
  "improve GDB performance on an application performing
  a lot of object loading."

and this:
  "gdb does not detect calls to dlmopen"

It's taken me a few tries, but I think I finally have an interface
that will work.  This mail is so I can run it past a wider audience
before I start implementing the gdb side in earnest.

The current linker-debugger interface has a structure (r_debug)
containing a list of loaded libraries, and an empty function
(_dl_debug_state) which the linker calls both before and after
modifying this list.  It exists solely for the debugger to set
a breakpoint on.
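
For anyone not familiar with the mechanism, here is a toy model of it in
C.  It is only a sketch: every name prefixed model_ is hypothetical, the
structures are cut-down stand-ins for the real ones in <link.h>, and the
counter stands in for a debugger breakpoint on _dl_debug_state.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the types in <link.h>.  The field names follow
   glibc, but this is a model, not the real definitions.  */
enum model_r_state { MODEL_RT_CONSISTENT, MODEL_RT_ADD, MODEL_RT_DELETE };

struct model_link_map {
    const char *l_name;
    struct model_link_map *l_next;
};

struct model_r_debug {
    struct model_link_map *r_map;   /* head of the loaded-library list */
    enum model_r_state r_state;     /* what the linker is doing right now */
};

static struct model_r_debug model_debug = { NULL, MODEL_RT_CONSISTENT };
static int model_stops;             /* times the "debugger" was stopped */

/* Stand-in for _dl_debug_state.  The real function is empty; here it
   counts hits as if a breakpoint had been placed on it.  */
static void model_dl_debug_state(void)
{
    model_stops++;
}

/* Linker side: announce the change, modify the list, announce again.  */
static void model_load(struct model_link_map *lib)
{
    struct model_link_map **tail = &model_debug.r_map;

    model_debug.r_state = MODEL_RT_ADD;
    model_dl_debug_state();         /* stop 1: list is about to change */

    while (*tail != NULL)
        tail = &(*tail)->l_next;
    lib->l_next = NULL;
    *tail = lib;

    model_debug.r_state = MODEL_RT_CONSISTENT;
    model_dl_debug_state();         /* stop 2: list is consistent again */
}

/* Debugger side: walk the list to see what is loaded.  */
static int model_count_libs(void)
{
    int n = 0;

    for (const struct model_link_map *m = model_debug.r_map; m != NULL;
         m = m->l_next)
        n++;
    return n;
}
```

In this model, loading two libraries through model_load leaves
model_stops at 4, which is two stops per load.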

  - There is one place where glibc calls _dl_debug_state earlier than
    Solaris libc.  This is #658851.  It is unlikely that glibc will
    ever be changed to make it compatible with Solaris libc as this
    could break compatibility with previous glibcs.

  - This interface was presumably invented before dlmopen() was, so
    there's only provision in it for one namespace.  In glibc each
    namespace has its own r_debug structure, but there is no way for
    the linker to communicate the addresses of the others to the
    debugger.  This is PR 11839.

  - In normal use gdb only needs to stop _after_ the list is modified.
    Because _dl_debug_state is called both before and after, gdb stops
    twice as often as it needs to.  This is the gist of #698001.

  - When stop-on-solib-events is set, however, it is either necessary
    or at least useful to stop both before and after library loads.
    I'm not 100% sure on this point, but if nothing else it preserves
    the existing behaviour, and without those stops it would probably
    be a lot harder to debug linker issues and possibly other things
    too.

The solution I'm proposing is this:

  - Add a new function for gdb to break on, _dl_debug_state_extended,
    to supplement _dl_debug_state.  Everywhere that _dl_debug_state is
    called, _dl_debug_state_extended is also called.  In the case
    where glibc calls _dl_debug_state earlier than Solaris libc we
    move the call to _dl_debug_state_extended to the Solaris location.
    This fixes #658851, with the caveat that some provision must be
    made for halting before STT_GNU_IFUNC relocations.

  - _dl_debug_state_extended contains SystemTap probes, currently two,
    but extendable to as many as we want.  gdb will look for these
    probes, and if it finds them it will set breakpoints on the ones
    it is interested in, falling back to breaking on _dl_debug_state
    if the required probes are not found.  The two current probes are
    r_debug_mod_starting (called before the list is modified) and
    r_debug_mod_complete (called after).  gdb will always stop on
    r_debug_mod_complete, to update its internal lists, but it will
    only stop on r_debug_mod_starting when stop-on-solib-events is
    set.  This halves the number of stops, addressing #698001.

  - All probes take the namespace id and the r_debug address as arguments,
    allowing gdb to discover namespaces other than the default.  This
    opens the door to fixing PR 11839.
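
The gdb-side choice of breakpoints described above could be sketched
roughly as follows.  This is an illustration under assumptions, not gdb
code: the function names and the BP_ flags are hypothetical, and I have
read "required probes" as meaning r_debug_mod_complete always, plus
r_debug_mod_starting when stop-on-solib-events is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags naming the places a breakpoint could be set.  */
enum {
    BP_DL_DEBUG_STATE = 1,  /* legacy _dl_debug_state function */
    BP_MOD_STARTING   = 2,  /* r_debug_mod_starting probe */
    BP_MOD_COMPLETE   = 4   /* r_debug_mod_complete probe */
};

/* Decide where to set breakpoints, given which probes were found in the
   linker and whether stop-on-solib-events is set.  */
static unsigned choose_breakpoints(bool have_starting, bool have_complete,
                                   bool stop_on_solib_events)
{
    unsigned bps;

    /* Assumption: if any probe we need is missing, fall back to the
       old interface entirely.  */
    if (!have_complete || (stop_on_solib_events && !have_starting))
        return BP_DL_DEBUG_STATE;

    bps = BP_MOD_COMPLETE;          /* always needed to update lists */
    if (stop_on_solib_events)
        bps |= BP_MOD_STARTING;     /* only when the user asked for it */
    return bps;
}

/* Number of times the inferior stops for one library load.  */
static unsigned stops_per_load(unsigned bps)
{
    unsigned n = 0;

    if (bps & BP_DL_DEBUG_STATE)
        return 2;                   /* called before and after */
    if (bps & BP_MOD_STARTING)
        n++;
    if (bps & BP_MOD_COMPLETE)
        n++;
    return n;
}
```

With both probes present and stop-on-solib-events unset, this picks only
r_debug_mod_complete, so each load costs one stop instead of two.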

I've attached a patch for glibc that implements its side of the
interface, designed to apply after the SystemTap support that was
added for Fedora 15, so if what I've written doesn't make sense then
maybe at least the code will!

Does this all seem ok?



Attachment: glibc.patch
Description: Text document
