[PATCH v3 1/7] gdbserver: hide fork child threads from GDB

Pedro Alves pedro@palves.net
Fri Dec 3 23:30:44 GMT 2021


On 2021-12-01 14:44, Simon Marchi via Gdb-patches wrote:
> From: Simon Marchi <simon.marchi@efficios.com>
> 
> This patch aims at fixing a bug where an inferior is unexpectedly
> created when a fork happens at the same time as another event, and that
> other event is reported to GDB first (and the fork event stays pending
> in GDBserver).  This happens for example when we step a thread and
> another thread forks at the same time.  The bug looks like (if I
> reproduce the included test by hand):
> 
>     (gdb) show detach-on-fork
>     Whether gdb will detach the child of a fork is on.
>     (gdb) show follow-fork-mode
>     Debugger response to a program call of fork or vfork is "parent".
>     (gdb) si
>     [New inferior 2]
>     Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
>     Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
>     Reading symbols from target:/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread...
>     [New Thread 965190.965190]
>     [Switching to Thread 965190.965190]
>     Remote 'g' packet reply is too long (expected 560 bytes, got 816 bytes): ... <long series of bytes>
> 
> The sequence of events leading to the problem is:
> 
>  - We are using the all-stop user-visible mode as well as the
>    synchronous / all-stop variant of the remote protocol
>  - We have two threads, thread A that we single-step and thread B that
>    calls fork at the same time
>  - GDBserver's linux_process_target::wait pulls the "single step
>    complete SIGTRAP" and the "fork" events from the kernel.  It
>    arbitrarily choses one event to report, it happens to be the
>    single-step SIGTRAP.  The fork stays pending in the thread_info.
>  - GDBserver send that SIGTRAP as a stop reply to GDB
>  - While in stop_all_threads, GDB calls update_thread_list, which ends
>    up querying the remote thread list using qXfer:threads:read.
>  - In the reply, GDBserver includes the fork child created as a result
>    of thread B's fork.
>  - GDB-side, the remote target sees the new PID, calls
>    remote_notice_new_inferior, which ends up unexpectedly creating a new
>    inferior, and things go downhill from there.
> 
> The problem here is that as long as GDB did not process the fork event,
> it should pretend the fork child does not exist.  Ultimately, this event
> will be reported, we'll go through follow_fork, and that process will be
> detached.
> 
> The remote target (GDB-side), has some code to remove from the reported
> thread list the threads that are the result of forks not processed by
> GDB yet.  But that only works for fork events that have made their way
> to the remote target (GDB-side), but haven't been consumed by the core
> yet, so are still lingering as pending stop replies in the remote target
> (see remove_new_fork_children in remote.c).  But in our case, the fork
> event hasn't made its way to the GDB-side remote target.  We need to
> implement the same kind of logic GDBserver-side: if there exists a
> thread / inferior that is the result of a fork event GDBserver hasn't
> reported yet, it should exclude that thread / inferior from the reported
> thread list.
> 
> This was actually discussed a while ago, but not implemented AFAIK:
> 
>     https://pi.simark.ca/gdb-patches/1ad9f5a8-d00e-9a26-b0c9-3f4066af5142@redhat.com/#t
>     https://sourceware.org/pipermail/gdb-patches/2016-June/133906.html
> 
> Implementation details-wise, the fix for this is all in GDBserver.  The
> Linux layer of GDBserver already tracks unreported fork parent / child
> relationships using the lwp_info::fork_relative, in order to avoid
> wildcard actions resuming fork childs unknown to GDB.  This information
> needs to be made available to the handle_qxfer_threads_worker function,
> so it can filter the reported threads.  Add a new thread_pending_parent
> target function that allows the Linux target to return the parent of an
> eventual fork child.
> 
> Testing-wise, the test replicates pretty-much the sequence of events
> shown above.  The setup of the test makes it such that the main thread
> is about to fork.  We stepi the other thread, so that the step completes
> very quickly, in a single event.  Meanwhile, the main thread is resumed,
> so very likely has time to call fork.  This means that the bug may not
> reproduce every time (if the main thread does not have time to call
> fork), but it will reproduce more often than not.  The test fails
> without the fix applied on the native-gdbserver and
> native-extended-gdbserver boards.
> 
> At some point I suspected that which thread called fork and which thread
> did the step influenced the order in which the events were reported, and
> therefore the reproducibility of the bug.  So I made the test try  both
> combinations: main thread forks while other thread steps, and vice
> versa.  I'm not sure this is still necessary, but I left it there
> anyway.  It doesn't hurt to test a few more combinations.
> 
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28288
> Change-Id: I2158d5732fc7d7ca06b0eb01f88cf27bf527b990

OK.


More information about the Gdb-patches mailing list