Bug 3231

Summary: PTRACE_DETACH doesn't deliver signals under utrace.
Product: frysk
Reporter: Chris Moller <cmoller>
Component: general
Assignee: Andrew Cagney <cagney>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host:
Target:
Build:
Last reconfirmed:
Bug Depends on:    
Bug Blocks: 1496    
Attachments: C testcase that demonstrates the bug.
Simplified C testcase that demonstrates the bug.

Description Chris Moller 2006-09-20 13:22:58 UTC
ptrace (PTRACE_DETACH, pid, NULL, sig) should deliver signal "sig" to process
"pid". In kernels with the ptrace compatibility layer over utrace, this doesn't
happen.

The attached C testcase demonstrates this.  Under RHEL4 with standard ptrace, a
PTRACE_DETACH emitting a SIGUSR1 results in:

PTRACE_DETACH with SIGUSR1:     OK
handler
clone caught
clone waiting
PTRACE_DETACH with zero arg -- should fail:     errno = 3: No such process
[etc.]

where the "handler", "clone caught", and "clone waiting" lines result from the
receipt of the SIGUSR1.  (The second PTRACE_DETACH is the next step in the test,
showing that the detach actually occurred.)

Under FC6 with utrace, the sequence is:

PTRACE_DETACH with SIGUSR1:     OK
PTRACE_DETACH with zero arg -- should fail:     errno = 3: No such process

I.e., the detach occurs, but the SIGUSR1 is not delivered.

This behaviour is probably the cause of the frysk-core "make check" failures on
FC6 in tests that rely on a SIGKILL being delivered with PTRACE_DETACH.
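
For readers without the attachment, here is a minimal sketch of the check
described above.  It is not the attached testcase: the function name
detach_delivers_signal(), the handler on_signal(), and the exit status 42 are
made up for illustration, and the child uses PTRACE_TRACEME plus a
self-inflicted SIGSTOP rather than whatever attach method the real test uses.

```c
#include <signal.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical marker: exit status the child uses to say "handler ran". */
#define HANDLED_STATUS 42

/* Child-side handler; _exit() is async-signal-safe. */
static void on_signal(int sig)
{
    (void)sig;
    _exit(HANDLED_STATUS);
}

/*
 * Returns 1 if the signal passed to PTRACE_DETACH reached the child,
 * 0 if the detach succeeded but the signal was not delivered,
 * and -1 if the check could not be set up (e.g. ptrace not permitted).
 */
int detach_delivers_signal(int sig)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: install the handler, ask to be traced, then stop. */
        signal(sig, on_signal);
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(2);
        raise(SIGSTOP);
        /* Only reached if the detach did not deliver the signal at once. */
        sleep(2);
        _exit(1);
    }

    /* Parent: wait for the child to report its stop. */
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return -1;

    /* Detach, asking the kernel to deliver `sig` as the child resumes. */
    if (ptrace(PTRACE_DETACH, pid, NULL, (void *)(long)sig) == -1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    if (waitpid(pid, &status, WUNTRACED) != pid)
        return -1;

    if (WIFSTOPPED(status)) {
        /* The child stopped instead of handling `sig`; clean up. */
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return 0;
    }

    return WIFEXITED(status) && WEXITSTATUS(status) == HANDLED_STATUS;
}
```

A caller could invoke detach_delivers_signal(SIGUSR1) from main() and exit 0
when it returns 1.  On a kernel with the behaviour reported here, the sketch
would be expected to return 0, though I have not run it on such a kernel.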
Comment 1 Chris Moller 2006-09-20 13:27:46 UTC
Created attachment 1308
C testcase that demonstrates the bug.

Under utrace (FC6), run with ./ptrace-test -f to show the failing ptrace
behaviour.  (Use -p to show a workaround method.)  Under RHEL4 without utrace,
both -f and -p will result in passes using different methods.
Comment 2 Chris Moller 2006-09-21 19:14:52 UTC
Created attachment 1312
Simplified C testcase that demonstrates the bug.

_exit(0) on pass, _exit(1) on fail.
Comment 3 Chris Moller 2006-09-22 14:29:28 UTC
This is the same as RHEL5 bug 207674
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=207674).
Comment 4 Chris Moller 2006-09-29 19:46:30 UTC
A few additional comments/observations:

Based on a couple of days of research, what I think I'm seeing for PTRACE_DETACH
is that ptrace_induce_signal() calls utrace_inject_signal(), which sets the
signo in the target->utrace struct.  Later, however, ptrace.c:ptrace_detach()
calls utrace_detach(), which in turn calls remove_engine() and
wake_quiescent(), and I don't see how that signo gets communicated to the task.
But this may just be my unfamiliarity with the code.

Noting that vanilla/kernel/ptrace.c: __ptrace_detach() simply sets
child->exit_code to the intended signr, I tried (along with a few other hacks)
the equivalent thing in utrace.c, but it didn't work (thereby proving I don't
understand the code well enough yet).

I think other ptrace requests that involve signal delivery are failing the same
way, but I haven't verified that yet.
Comment 5 Chris Moller 2006-10-09 19:18:24 UTC
The utrace patch Roland and Aris came up with fixes this.
Comment 6 Andrew Cagney 2006-10-10 14:18:13 UTC
Which exact kernel?
Comment 7 Chris Moller 2006-10-10 15:17:02 UTC
2.6.17-1.2678, the latest FC6 kernel.

Aris and Roland built the patch against a 2.6.18 kernel, which I also tested.