This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



Re: Non-stop multi-threaded debugging


Wow -- too long to read in one sitting, but sounds like 
this should be very interesting and challenging work!

I'm sure we'll all look forward to reviewing it, and if
there are issues to be discussed, I for one will look
forward to the discussions ;-)


On Tue, 2007-11-20 at 17:21 +0000, Nathan Sidwell wrote:
> Hi all,
> 
> Jim Blandy prepared this, but is on vacation this week.  So, I'm announcing it 
> in his absence.  Pretend I wrote 'sudo jimb ...'
> 
> A client of CodeSourcery's has contracted with us to implement a
> number of new features in GDB, some of which have been on the
> frequently requested list for quite some time:
> 
> - We're to implement non-stop multi-threaded debugging in GDB.
> 
>    At present, if you are debugging a multi-threaded program, when one
>    thread stops (for a breakpoint, watchpoint, exception, or the like),
>    GDB stops all other threads in the program while you interact with
>    the thread of interest.  When you continue or step a thread, you can
>    allow the other threads to run, or have them remain stopped, but
>    while you inspect any thread's state, all threads stop.
> 
>    In non-stop mode, when one thread stops, other threads can continue
>    to run freely.  You'll be able to treat each thread independently,
>    leaving it stopped or free to run as needed.
> 
>    Non-stop mode will be selectable; the old all-stop behavior will
>    still be available.
> 
> - We're to implement asynchronous interaction with GDB.
> 
>    GDB will be responsive to commands while the program is running.
>    This is mostly a consequence of supporting non-stop multi-threaded
>    debugging: it's the degenerate case where no threads happen to be
>    stopped.
> 
> - We're to implement a limited form of multi-process debugging.
> 
>    Full multi-process debugging would entail changes to
>    1) process management code,
>    2) target interfaces, and
>    3) symbol tables.
> 
>    For our client, however, the case where processes have different
>    memory maps is not (yet) of interest, so they have sponsored us to
>    do 1) and 2), but not 3).  This will yield a GDB that can (for
>    example) follow both parent and child after a fork, but not follow
>    processes across exec or dlopen/dlclose operations.  If a process
>    carries out one of these operations, GDB will ask the user whether
>    to follow that process only, or detach from it and stick with the
>    others.
> 
>    So our goal here is to carry out steps 1) and 2) in such a way that
>    anyone can easily pick up 3) and complete the feature.  In other
>    words, we want the restrictions simply to be a matter of leaving
>    work undone, and not of embedding simplifying assumptions into the
>    code that would make full support difficult.
> 
> Our client would very much like for this work to be incorporated into
> the public GDB sources (although they understand that the decision is
> in the public project's hands), so we'll be posting our design
> thoughts for general discussion.  In particular, I believe the
> multi-process work may overlap with some of the work IBM has done to
> support the Cell processor; we'd very much like to work with IBM to
> ensure that the final model is appropriate for both our client and for
> Cell developers.
> 
> Our client is only interested in the MI interface; they intend to use
> all these facilities via Eclipse.  So we will not be implementing
> command-line support any more than is helpful to us in development.
> But again, we want to do this work in a way that leaves CLI support
> for these features a simple matter of coding, so that our work is
> still forward progress, which anyone can complete.
> 
> Our client is interested in non-stop, multi-process debugging via the
> remote protocol.  However, we will be implementing these for native
> debugging first, in order to break the work into manageable steps.
> 
> The text below is taken from a more detailed document we put together
> proposing the work.  It is in two sections:
> 
> - The "Architectural Challenges" section explains limitations of GDB's
>    current architecture that make it difficult to implement non-stop
>    and multi-process debugging at present.
> 
> - The "Projects" section presents a series of well-defined engineering
>    projects which remove limitations or add features to meet one or
>    more of our client's requirements.
> 
> Our intention is to help the list understand why each piece of work is
> needed and what it would accomplish.
> 
> 
> Architectural Challenges
> 
> GDB's present architecture imposes a number of barriers to
> implementing non-stop and multi-process debugging:
> 
> C1) While the user inspects the state of a stopped thread, GDB stops
>      all other threads.  This approach simplifies GDB's user interface,
>      as there is no need to report events taking place in other threads
>      while the user inspects one thread.  However, these
>      simplifications are no longer valid in non-stop debugging
> 
> C2) Stopping all threads also simplifies GDB's execution management
>      code, as GDB can pause all threads, manage interesting events, and
>      then assume the system is quiet.  As above, these simplifications
>      are no longer valid in non-stop debugging.
> 
> C3) Stopping all threads further allows GDB to remove all breakpoints
>      from the program's memory while the program is stopped, and
>      re-insert them only when resuming one or more threads, making it
>      less likely that an abrupt disconnection will abandon a debuggee
>      with breakpoint instructions patched into its code.  However, this
>      behavior is clearly unsuitable if the user wants other threads to
>      continue to execute while she stops one for inspection.
> 
> C4) Finally, stopping all threads simplifies GDB's remote protocol.
>      At present, GDB's remote protocol notifies GDB of exactly one
>      thread's state in response to each 'continue' or 'step' operation,
>      permitting no further packets from the stub until GDB resumes
>      some thread.
> 
> C5) GDB breakpoints are currently per-thread or global.  To satisfy
>      our client's requirements, we must adapt GDB's breakpoint structures to
>      distinguish per-process and global breakpoints, where 'global'
>      breakpoints are set in all attached processes.
> 
> C6) [Our client elected not to address this issue yet.]
> 
> C7) GDB currently operates on a single process at a time: the list of
>      known threads is global, and the ID of the process being debugged
>      is global.  This conflicts with the needs of multi-process
>      debugging.
> 
> C8) GDB currently maintains a single global map of the address space.
>      It cannot represent multiple processes with code and data
>      appearing at different addresses in different processes.  This is
>      not a problem for our client, because code and variables appear at
>      the same addresses in all processes on their system.  However, it
>      is a requirement for multi-process debugging on Linux.
> 
> C9) GDB will not currently relocate different segments of an
>      executable or shared library by different offsets from the
>      addresses they are assigned in the ELF file.  The client's
>      operating system may relocate each section of a load module by a
>      different amount.
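
To make the breakpoint scoping in C5 concrete, here is a minimal sketch of "per-process or global" breakpoints. The names `ToyBreakpoint` and `breakpoints_for` are hypothetical and chosen only for illustration; nothing here reflects GDB's actual data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyBreakpoint:
    """Hypothetical breakpoint record; not GDB's real structure."""
    address: int
    pid: Optional[int] = None   # None is taken to mean "global" (assumption)

    def applies_to(self, pid: int) -> bool:
        # A global breakpoint applies to every attached process;
        # a per-process breakpoint applies only to its own process.
        return self.pid is None or self.pid == pid


def breakpoints_for(pid, breakpoints):
    """Return the breakpoints that should be inserted in process `pid`."""
    return [bp for bp in breakpoints if bp.applies_to(pid)]
```

Under this model a global breakpoint is simply one with no owning process, so attaching to a new process needs no changes to the existing breakpoint records.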
> 
> 
> Projects
> 
> This section breaks down the work necessary into well-defined
> engineering tasks.  For each proposed project, we explain the work
> entailed, the benefits provided, and how it depends on other projects,
> if at all.
> 
> 
> P1) Non-stop multi-threaded native debugging
> 
>      This project allows GDB to stop one thread for inspection on a
>      native system while allowing others to run.
> 
>      To prepare GDB to debug one process while other processes continue
>      to run freely (the feature our client is interested in), we will
>      first implement the ability to debug one thread while other
>      threads in that process continue to run freely.
> 
>      As described in C1, C2, and C3, GDB assumes in its user interface
>      and code that no execution occurs while the user is inspecting a
>      thread's state.  This project removes that simplifying assumption.
> 
>      At the user interface level, GDB's Machine Interface ('MI', the
>      command set used by Eclipse) shall behave as follows:
> 
>      - MI shall provide a command to allow the user to choose between
>        the older 'all-stop' and the new 'non-stop' multi-threaded
>        debugging behaviors.  In all-stop mode, GDB shall behave as it
>        does now.  The following points describe non-stop debugging
>        mode.
> 
>      - GDB shall always prompt for and respond to MI commands,
>        regardless of whether any threads are running or not.
> 
>      - When a thread finishes a command like '-exec-next' or
>        '-exec-finish', hits a breakpoint, or encounters a fault, GDB
>        shall stop that thread, without affecting the other threads in
>        the process.
> 
>      - Execution commands like '-exec-continue' and '-exec-step' shall
>        resume only the selected thread, without affecting the other
>        threads in the process.
> 
>      - The MI '-exec-interrupt' command shall stop all threads.  This
>        will always generate an 'EXEC-ASYNC-OUTPUT' record, even if all
>        threads were already stopped.  (This helps users handle the case
>        where the thread stops of its own accord just as the user sends
>        it an '-exec-interrupt' command.)
> 
>      - The MI '-thread-select' command shall stop the thread selected,
>        if it is running.  The previously selected thread is left in its
>        former state, either stopped or running.  A '-thread-select'
>        command shall always generate an 'EXEC-ASYNC-OUTPUT' record,
>        even if the thread was already stopped.
> 
>      - MI shall provide a command to continue all stopped threads.
> 
>      - GDB shall send 'EXEC-ASYNC-OUTPUT' MI records to notify the user
>        of events that have occurred in threads, even while GDB is
>        waiting for an MI command.  Every thread GDB stops shall be
>        mentioned in some 'EXEC-ASYNC-OUTPUT' record; when GDB stops all
>        threads, the EXEC-ASYNC-OUTPUT record shall include a
>        'thread-id="all"' result.
> 
>      - The MI '-thread-info' and '-thread-list-all-threads' commands
>        shall be implemented.  Their output shall indicate whether each
>        thread listed is currently stopped by GDB, or whether it is
>        allowed to run.
> 
>      - GDB shall use 'EXEC-ASYNC-OUTPUT' MI records to report thread
>        creation and termination.  These records shall include the GDB
>        thread number as a result.  After sending a thread termination
>        record, GDB shall not include the thread in the output of
>        '-thread-list-ids' or '-thread-list-all-threads'.
> 
>      (Adapting GDB's command-line interface to non-stop debugging is
>      more involved; whereas MI need only be accurate and sufficient,
>      the command-line interface must also respect human interface
>      issues.  Since GDB's command-line interface is of limited interest
>      to our client, we have not included it here.)
> 
>      To implement the behavior described above, a number of areas
>      within GDB will need modification:
> 
>      - GDB's event loop must be responsive to user input and thread
>        events from the debuggee simultaneously.
> 
>      - GDB's execution control code must avoid stopping all threads
>        when one reports an event, and must make the processing of
>        thread stops independent of resumption: it must no longer assume
>        that events only arrive after resumptions, and resumptions only
>        happen after events.
> 
>      - GDB must insert breakpoints into code being executed by live
>        threads in a manner supported by the target architecture.
> 
>      - GDB's breakpoint support code must leave breakpoints inserted at
>        all times.  Even while GDB steps a thread past a breakpoint,
>        the breakpoint must remain in effect for all other threads.
> 
>      These are each reasonably substantial pieces of work, the design
>      of which should be discussed on the public GDB list to ensure that
>      the work will be acceptable for inclusion in the public sources
>      when it is complete.
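
The difference between the all-stop and non-stop behaviors described for P1 can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyProcess` and its methods are hypothetical names, and the real work in GDB's event loop and execution control is far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToyProcess:
    """Toy model of one debuggee's threads (hypothetical, not GDB code)."""
    non_stop: bool
    running: dict = field(default_factory=dict)  # thread id -> is it running?

    def add_thread(self, tid):
        self.running[tid] = True

    def report_event(self, tid):
        """Thread `tid` hit a breakpoint, finished a step, or faulted.

        Returns the sorted list of threads that were stopped as a result.
        """
        if self.non_stop:
            targets = [tid]                # only the reporting thread stops
        else:
            targets = list(self.running)   # all-stop: every thread stops
        stopped = sorted(t for t in targets if self.running[t])
        for t in stopped:
            self.running[t] = False
        return stopped

    def resume(self, tid):
        """Non-stop '-exec-continue': resume only the selected thread."""
        self.running[tid] = True

    def interrupt(self):
        """'-exec-interrupt': stop every thread, whatever its state."""
        for t in self.running:
            self.running[t] = False
        return "all"                       # stands in for thread-id="all"
```

In non-stop mode `report_event(2)` leaves threads 1 and 3 running, while in all-stop mode the same event stops all three; `interrupt()` always reports, even if nothing was running.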
> 
> 
> P4) Stub for client's OS
> 
>      This project will mostly be non-GDB work.  However, there are some
>      changes to the remote protocol we would like to introduce at this
>      point:
> 
>      The remote protocol presently leaves the process to be debugged
>      implicit; users generally specify it when they start the stub.
>      However, to satisfy our client's requirements, we must be able to
>      connect to a system, list the processes present, and attach to one
>      of them.  This entails making some straightforward extensions to
>      the GDB remote protocol, and thus to GDB as well.
> 
>      The stub for our client should use the 'library' stop reply
>      packets and the 'qXfer:libraries:read' packet to report load
>      module events.  However, because the client's OS may bring each
>      section of a load module into memory at a different offset from
>      the VMA given in the ELF file, we will need to extend the format
>      of the library list the latter packet returns, as it currently
>      assumes that each library needs only one offset, and extend GDB to
>      allow each segment to appear at a different offset (C9).
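
The per-segment relocation mentioned for C9 amounts to looking up which segment an ELF-file address falls in and applying that segment's own offset. A minimal sketch follows; the `Segment` tuple and `runtime_address` function are hypothetical, and this is not the format of the library list packet.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    """Hypothetical description of one loaded segment."""
    vma: int      # start address assigned in the ELF file
    size: int     # length of the segment in bytes
    offset: int   # runtime load address minus vma, for this segment only


def runtime_address(segments, file_addr) -> Optional[int]:
    """Translate an ELF-file address to where it actually lives in memory.

    Each segment carries its own offset, unlike a scheme that assumes
    a single offset for the whole library.
    """
    for seg in segments:
        if seg.vma <= file_addr < seg.vma + seg.size:
            return file_addr + seg.offset
    return None   # address is not covered by any known segment
```

For example, with segments `Segment(0x1000, 0x500, 0x40000)` and `Segment(0x2000, 0x100, 0x90000)`, the file address `0x1010` maps to `0x41010` and `0x2004` maps to `0x92004`.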
> 
> 
> P6) Multi-threaded limited-multi-process native debugging
> 
>      This project provides multi-threaded debugging of multiple
>      processes simultaneously.  The debugger stops all threads in all
>      attached processes while the user inspects the state of any
>      thread.  This work is independent of P1; we combine P1 and P6 in
>      the next project, P7.
> 
>      At the user interface level:
> 
>      - MI shall provide new commands to attach and detach a process;
>        unlike GDB's existing 'attach' and 'detach' commands, the new
>        'attach' command will not require GDB to detach from any
>        currently attached processes.
> 
>      - MI shall provide a command to list all currently attached
>        processes.
> 
>      - MI shall provide a command to list all the threads in a given
>        attached process.
> 
>      - The output of the MI '-thread-info' and
>        '-thread-list-all-threads' commands shall include the process ID
>        of each thread listed.  The process ID shall be a separate MI
>        'result' from the string provided by the
>        'target_extra_thread_info' function, so that Eclipse can access
>        it reliably.
> 
>      - GDB shall stop all threads in all attached processes while
>        interacting with the user.  Attaching to a process shall stop
>        all threads in that process.  Detaching from a process shall
>        allow its threads to run again.
> 
>      - MI shall report faults encountered by threads in any attached
>        process.
> 
>      - MI shall report the termination of any attached process.  After
>        such a report, GDB will no longer be attached to the process.
> 
>      - MI's '-thread-select' command shall be able to select any thread
>        in any attached process.
> 
>      - MI's existing breakpoint commands shall set breakpoints global
>        to all attached processes.
> 
>      To support those facilities, we need the following changes:
> 
>      - GDB shall maintain a table of attached processes.  The remote
>        protocol shall provide packets directing the stub to attach to a
>        new process, and to detach from a currently attached process.
> 
>      - The remote protocol shall carry process IDs as well as thread
>        IDs in stop reply packets, thread selection packets, thread
>        enumeration packets, and wherever else is appropriate.
> 
>      - The stub shall use the current general thread (as given by the
>        'Hg' packet) to determine which process's memory to access, as
>        it does now to determine which thread's registers to access.
>        GDB shall send 'Hg' packets as necessary before memory accesses,
>        as it does now for register accesses.
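
The idea of using the current general thread to pick which process's memory to access can be sketched as follows. `ToyStub` is a hypothetical stand-in, not the real stub or packet format; `set_general_thread` merely plays the role of the 'Hg' packet.

```python
class ToyStub:
    """Toy stand-in for a remote stub (hypothetical; not the real protocol)."""

    def __init__(self):
        self.memory = {}            # pid -> {address: byte value}
        self.thread_owner = {}      # thread id -> owning pid
        self.general_thread = None  # set by the 'Hg'-like call below

    def attach(self, pid, tids, memory):
        """Attach to a process without detaching from any other."""
        self.memory[pid] = dict(memory)
        for tid in tids:
            self.thread_owner[tid] = pid

    def detach(self, pid):
        """Forget a process and all of its threads."""
        del self.memory[pid]
        self.thread_owner = {
            t: p for t, p in self.thread_owner.items() if p != pid
        }
        if self.general_thread not in self.thread_owner:
            self.general_thread = None

    def set_general_thread(self, tid):
        """Plays the role of 'Hg': select the thread for later accesses."""
        if tid not in self.thread_owner:
            raise KeyError(tid)
        self.general_thread = tid

    def read_byte(self, addr):
        """Read from the process that owns the current general thread.

        Raises KeyError if no general thread has been selected.
        """
        pid = self.thread_owner[self.general_thread]
        return self.memory[pid][addr]
```

The same address can then hold different contents depending on which thread is selected, which is why the thread selection has to precede each memory access that might cross a process boundary.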
> 
> 
> P7) Non-stop multi-threaded multi-process native debugging
> 
>      This project allows GDB to attach to multiple processes
>      simultaneously.  This builds on P1 and P6, and addresses C4 and
>      C7.
> 
>      At the user interface level:
> 
>      - MI shall provide a command to stop all the threads in a given
>        process, and a command to resume all stopped threads in a given
>        process.
> 
>      Internally:
> 
>      - The remote protocol shall provide a way to tell the stub to
>        leave other threads running after reporting an event in one
>        thread (non-stop behavior), and a way to tell the stub to stop
>        all threads when reporting an event in one thread (all-stop
>        behavior).
> 
>      - The remote protocol shall allow the stub to respond to
>        commands while threads are running, and to report further thread
>        events after a thread has stopped.  (This addresses C4.)
> 
>      - The remote protocol shall provide ways to stop and start a
>        particular thread, and ways to start and stop all the threads in
>        a given process.  The mechanisms for stopping threads and
>        processes shall allow GDB to behave correctly when a thread
>        stops or a process exits simultaneously with GDB sending the
>        command.
> 
> nathan

