Towards multiprocess GDB

Fri Jul 18 20:50:00 GMT 2008

CodeSourcery has a project to add "multiprocess" capability to GDB,
and with this message I'd like to kick off some discussion of what
that means and how to make it happen.

To put it simply, the goal of the project is to make this command work
in some useful way:

  gdb prog1 prog2 pid2 prog3 prog4

As the command suggests, we're talking about multiple programs or
executables being controlled by a single GDB, in contrast to a single
program with multiple processes or forks, a la Michael's machinery for
Linux forks. So although we often use the term "multiprocess", it's
perhaps more precise to call it "multiprogram" or "multiexec" GDB.

The first thing to figure out is how this should all work for the GDB
user. The command above seems like a pretty obvious extension;
programmers debugging client/server pairs have long wanted to be able
to do just that, and we've always had to tell them they have to start
up two GDBs and juggle. Since both core files and process ids are
distinguishable from executable names, we can allow intermixing of all
these on the command line, so that each core file or pid argument
applies to the executable preceding it.
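
To make the grouping rule concrete, here is a rough sketch in Python
(purely illustrative; Program, group_args, and the is_core hook are
names I made up, not anything in GDB). It assumes an all-digit argument
is a pid and leaves the core file test to a caller-supplied check,
since a real implementation would inspect the file itself.

```python
from dataclasses import dataclass, field


@dataclass
class Program:
    """One executable plus the pids and core files attached to it."""
    exe: str
    pids: list = field(default_factory=list)
    cores: list = field(default_factory=list)


def group_args(args, is_core=lambda arg: arg.startswith("core")):
    """Attach each pid or core file argument to the preceding executable.

    Assumptions: an all-digit argument is a pid, and `is_core` is a
    placeholder for a real check of the file's format.
    """
    programs = []
    for arg in args:
        if arg.isdigit() or is_core(arg):
            if not programs:
                raise ValueError(f"{arg!r} has no preceding executable")
            if arg.isdigit():
                programs[-1].pids.append(int(arg))
            else:
                programs[-1].cores.append(arg)
        else:
            # Anything else starts a new program entry.
            programs.append(Program(arg))
    return programs
```

With this rule, "client 1234 server core.77 5678" yields two programs:
client with pid 1234, and server with one core file and pid 5678. An
executable whose name is all digits would be misread as a pid, so the
real thing would need a better test.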

Once the debugger is started, but before any of the programs have run,
it seems obvious to have a notion of "current program", so that commands
like "list main" and "break main" will work as usual. The user should
have a way to list the programs ("info programs") and a way to set the
current one. It might also make sense to have a menu option, a la the
one GDB offers for overloaded C++ functions, so that
"list client_only_fn" works irrespective of the current program,
while "list main" might ask the user which main() is wanted. Another
possibility is to introduce a "program apply <names>|all <cmd>" on the
analogy of the existing thread apply command.
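
A minimal sketch of the menu idea and of "program apply", again in
Python and only to pin down the behavior (choose_program,
program_apply, and the ask callback are hypothetical names). Programs
are modeled as a mapping from program name to the set of symbols each
one defines.

```python
def owners_of(symbol, programs):
    """Return the sorted names of all programs that define `symbol`."""
    return sorted(name for name, syms in programs.items() if symbol in syms)


def choose_program(symbol, programs, ask):
    """Pick the program a symbol refers to, asking only when ambiguous.

    `ask(symbol, candidates)` stands in for an interactive menu and is
    called only if more than one program defines the symbol.
    """
    owners = owners_of(symbol, programs)
    if not owners:
        raise LookupError(f"no program defines {symbol!r}")
    if len(owners) == 1:
        return owners[0]
    return ask(symbol, owners)


def program_apply(names, command, programs):
    """Run `command(name)` for the named programs, or for all of them."""
    selected = sorted(programs) if names == "all" else list(names)
    return {name: command(name) for name in selected}
```

Here a symbol defined in only one program resolves silently, and only
a symbol such as main, present in both client and server, triggers the
menu.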

Notice that we don't really want to use the term "process" for any of
this so far, because nothing is running yet and there are no
processes/inferiors; this part is all about the symbol side.

Commands like "file" should get an alternate form or behavior that
adds rather than replaces. Conversely, the user will also want a way
to take programs out of the debugging session. This is not quite the
same as detaching, because the user may want, say, the server to
continue doing its serving thing, while commands such as "list" work
only on the client code and are no longer ambiguous.

When it's time to run, the user will want the ability to run anywhere
from one to all of the programs, each with its own argument list. It
should be possible to do this with a single command, so that the user
isn't scrambling to put in all the run commands quickly enough.

Once programs are running, execution control should work in a fashion
generally analogous to what we have for threads now. When something
stops, GDB needs to report program, process, and thread; step and
continue will need a way to specify the program or process being
stepped or continued. User-friendliness suggests that program name
should be accepted as a synonym for process id if there is only one
process for the executable.
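
The synonym rule could look something like the following Python sketch
(resolve_process is a made-up name, and modeling the session as a
pid-to-program-name mapping is my assumption). A program name is
accepted only when exactly one process is running that executable.

```python
def resolve_process(spec, processes):
    """Map a user-supplied pid or program name to a single pid.

    `processes` maps pid -> program name. A program name is accepted
    only if exactly one process is running that program.
    """
    if spec.isdigit():
        pid = int(spec)
        if pid not in processes:
            raise LookupError(f"no process with id {pid}")
        return pid
    pids = sorted(pid for pid, prog in processes.items() if prog == spec)
    if not pids:
        raise LookupError(f"no process is running {spec!r}")
    if len(pids) > 1:
        raise LookupError(f"{spec!r} is ambiguous: processes {pids}")
    return pids[0]
```

So "continue server" would work while there is a single server
process, and would require an explicit process id once there are two.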

Data display will need to have some way to identify the process from
which the data is being taken. It may be useful to have a "process
apply" for print commands, so that the output includes the value of
the expression in each process, especially useful for values that are
expected to be the same in each.
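
As an illustration of what "process apply" output for print might
look like, here is a small Python sketch (process_apply_print and the
evaluate hook are hypothetical; evaluate stands in for GDB's
expression evaluator). It labels each value with its program and
process, and flags the case where values expected to match do not.

```python
def process_apply_print(expression, processes, evaluate):
    """Evaluate `expression` in every process and label each result.

    `processes` maps pid -> program name. `evaluate(pid, expression)`
    is a placeholder hook; its results must be hashable for the
    comparison below.
    """
    values = {pid: evaluate(pid, expression) for pid in sorted(processes)}
    lines = [
        f"[{processes[pid]} {pid}] {expression} = {value}"
        for pid, value in values.items()
    ]
    # Point out disagreement, since matching values are the common case.
    if len(set(values.values())) > 1:
        lines.append(f"warning: {expression} differs between processes")
    return lines
```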

Implementation-wise, we will need to replace the single exec target
with a list of execs, and modify symbol machinery to support a
many-to-many relationship between programs and symbol tables. Although
my inclination is to create a new symbol table for each process's
image of each shared library, that may be excessively expensive.
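
For comparison, the cheaper alternative is to share one symbol table
per file among all the programs that use it. A rough Python sketch of
that bookkeeping follows (SymtabRegistry and its methods are invented
for illustration, and the loader callback stands in for actual symbol
reading). It ignores the fact that each process may map a shared
library at a different address, which is the main argument for
per-process tables.

```python
class SymtabRegistry:
    """Share one symbol table per file among the programs that use it."""

    def __init__(self):
        self._tables = {}  # file path -> symbol table
        self._users = {}   # file path -> set of program names

    def attach(self, program, path, loader):
        """Record that `program` uses `path`, reading symbols only once."""
        if path not in self._tables:
            self._tables[path] = loader(path)
        self._users.setdefault(path, set()).add(program)
        return self._tables[path]

    def detach(self, program):
        """Drop `program`, freeing tables that no other program uses."""
        for path in list(self._users):
            self._users[path].discard(program)
            if not self._users[path]:
                del self._users[path]
                del self._tables[path]

    def paths_for(self, program):
        """Return the sorted file paths whose tables `program` uses."""
        return sorted(p for p, users in self._users.items() if program in users)
```

With client and server both attached to the same libc, the loader
runs once for it, and the table is discarded only when the last
program using it is removed from the session.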

In addition to thoughts on desired user interface, I would welcome
suggestions on how to add this feature incrementally; the
above-mentioned bits are a lot to add all at once!

Stan