This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFA] Factorize killing the children in linux-ptrace.c, and fix a 'process leak'.


On 2018-11-03 4:19 p.m., Philippe Waroquiers wrote:
> Running the gdb testsuite under Valgrind started to fail after 100+ tests,
> due to out of memory caused by lingering processes.
> 
> The lingering processes are caused by the combination
> of a limitation in Valgrind signal handling when using PTRACE_TRACEME
> and a (minor) bug in GDB.
> 
> The Valgrind limitation is : when a process is ptraced and raises
> a signal, Valgrind will replace the raised signal by SIGSTOP as other
> signals are masked by Valgrind when executing a system call.
> Removing this limitation seems far to be trivial, valgrind signal
> handling is very complex.
> 
> Due to this valgrind limitation, GDB linux_ptrace_test_ret_to_nx gets
> a SIGSTOP signal instead of the expected SIGTRAP or SIGSEGV.
> In such a case, linux_ptrace_test_ret_to_nx does an early return, but
> does not kill the child (running under valgrind), child stays in a STOP-ped
> state.
> These lingering processes then eat the available system memory,
> till launching a new process starts to fail.
> 
> This patch fixes the GDB minor bug by killing the child in case
> linux_ptrace_test_ret_to_nx does an early return.
> 
> nat/linux-ptrace.c has 3 different logics to kill a child process.
> So, this patch factorizes killing a child in the function kill_child.
> 
> The 3 different logics are:
> * linux_ptrace_test_ret_to_nx is calling both kill (child, SIGKILL)
>   and ptrace (PTRACE_KILL, child, ...), and then is calling once
>   waitpid.
> * linux_check_ptrace_features is calling ptrace (PTRACE_KILL, child, ...)
>   + my_waitpid in a loop, as long as the waitpid status was WIFSTOPPED.
> * linux_test_for_tracefork is calling once ptrace (PTRACE_KILL, child, ...)
>   + my_waitpid.
> 
> The linux ptrace documentation indicates that PTRACE_KILL is deprecated,
> and tells to not use it, as it might return success but not kill the tracee.
> The documentation indicates to send SIGKILL directly.
> 
> I suspect that linux_ptrace_test_ret_to_nx calls both kill and ptrace just
> to be sure ...
> I suspect that linux_check_ptrace_features calls ptrace in a loop
> to bypass the PTRACE_KILL limitation.
> And it looks like linux_test_for_tracefork does not handle the PTRACE_KILL
> limitation.
> Also, 2 of the 3 logics are calling my_waitpid, which seems better,
> as this is protecting the waitpid syscall against EINTR.
> 
> So, the logic in kill_child is just using kill (child, SIGKILL)
> + my_waitpid, and then does a few verifications to see everything worked
> accordingly to the plan.
> 
> Tested on Debian/x86_64.
> 
> 2018-11-03  Philippe Waroquiers  <philippe.waroquiers@skynet.be>
> 
> 	* nat/linux-ptrace.c (kill_child): New function.
> 	(linux_ptrace_test_ret_to_nx): Use kill_child instead of local code.
> 	Add a call to kill_child in case of early return after fork.
> 	(linux_check_ptrace_features): Use kill_child instead of local code.
> 	(linux_test_for_tracefork): Likewise.

This makes sense to me, but I'd like to invoke pedro(), see if he sees
anything wrong with it.

Simon

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]