RFC: nptl threading patch for linux
J. Johnston
jjohnstn@redhat.com
Fri May 9 23:38:00 GMT 2003
Daniel Jacobowitz wrote:
> On Thu, Apr 24, 2003 at 04:52:04PM -0400, J. Johnston wrote:
>
>>The following is the last part of my revised nptl patch that has
>>been broken up per Daniel J.'s suggestion. There are no generated
>>files included in the patch.
>
>
> Well, this patch doesn't work for me :( Using 2.5.69, since I don't
> have any of the Red Hat kernels available here at the moment. It looks
> like GDB bellies up around the second thread creation.
>
Is this one of the gdb.threads testcases? If not, do any of those run
for you, and can you send me a testcase for the problem below so we at
least have something in common to compare?
-- Jeff J.
> A backtrace looks like:
> #0 0xffffe402 in ?? ()
> #1 0x080e1332 in stop_wait_callback (lp=0x0, data=0xbffff450)
> at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:708
> #2 0x080e159a in stop_wait_callback (lp=0x0, data=0xbffff450)
> at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:870
> #3 0x080e159a in stop_wait_callback (lp=0x0, data=0xbffff450)
> at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:870
> #4 0x080e159a in stop_wait_callback (lp=0x0, data=0xbffff450)
> at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:870
>
> And that's not just the stack unwinder getting confused. We really did
> recurse until we ran out of stack.
>
> The superficial reason is this:
> SWC: Pending event Segmentation Fault (stopped) in LWP 4490
>
> i.e. every time we resume it with no signal it SIGSEGV's again, and we
> never get the SIGSTOP.
>
> Here's some more of the log:
> (gdb) c
> Continuing.
> LLR: PTRACE_SINGLESTEP process 4498, 0 (resume event thread)
> LLW: waitpid 4498 received Trace/breakpoint trap (stopped)
> LLTA: PTRACE_PEEKUSER LWP 4498, 0, 0 (OK)
> LLW: Candidate event Trace/breakpoint trap (stopped) in LWP 4498.
> SEL: Select single-step LWP 4498
> LLW: trap_ptid is LWP 4498.
> RC: PTRACE_CONT LWP 4497, 0, 0 (resume sibling)
> LLR: PTRACE_CONT process 4498, 0 (resume event thread)
> LLW: waitpid 4497 received Trace/breakpoint trap (stopped)
> LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
> LLW: Candidate event Trace/breakpoint trap (stopped) in LWP 4497.
> SC: kill LWP 4498 **<SIGSTOP>**
> SC: lwp kill 0 ERRNO-OK
> SWC: waitpid LWP 4498 received Stopped (signal) (stopped)
> LLTA: PTRACE_PEEKUSER LWP 4498, 0, 0 (OK)
> LLW: trap_ptid is LWP 4497.
> [New Thread 1077276112 (LWP 4499)]
> LLAL: PTRACE_ATTACH LWP 4499, 0, 0 (OK)
> LLAL: waitpid LWP 4499 received Stopped (signal) (stopped)
> LLR: PTRACE_SINGLESTEP process 4497, 0 (resume event thread)
> LLW: waitpid 4497 received Trace/breakpoint trap (stopped)
> LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
> LLW: Candidate event Trace/breakpoint trap (stopped) in LWP 4497.
> SEL: Select single-step LWP 4497
> LLW: trap_ptid is LWP 4497.
> RC: PTRACE_CONT LWP 4499, 0, 0 (resume sibling)
> RC: PTRACE_CONT LWP 4498, 0, 0 (resume sibling)
> LLR: PTRACE_CONT process 4497, 0 (resume event thread)
> LLW: waitpid 4499 received Trace/breakpoint trap (stopped)
> LLTA: PTRACE_PEEKUSER LWP 4499, 0, 0 (OK)
> LLW: Candidate event Trace/breakpoint trap (stopped) in LWP 4499.
> SC: kill LWP 4498 **<SIGSTOP>**
> SC: lwp kill 0 ERRNO-OK
> SC: kill LWP 4497 **<SIGSTOP>**
> SC: lwp kill 0 ERRNO-OK
> SWC: waitpid LWP 4498 received Stopped (signal) (stopped)
> LLTA: PTRACE_PEEKUSER LWP 4498, 0, 0 (OK)
> SWC: waitpid LWP 4497 received Trace/breakpoint trap (stopped)
> LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
> PTRACE_CONT LWP 4497, 0, 0 (OK)
> SWC: Candidate SIGTRAP event in LWP 4497
> SWC: waitpid LWP 4497 received Trace/breakpoint trap (stopped)
> LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
> PTRACE_CONT LWP 4497, 0, 0 (OK)
> SWC: Candidate SIGTRAP event in LWP 4497
> SWC: waitpid LWP 4497 received Segmentation fault (stopped)
> LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
> SWC: Pending event Segmentation fault (stopped) in LWP 4497
> SWC: PTRACE_CONT LWP 4497, 0, 0 (OK)
> SWC: waitpid LWP 4497 received Segmentation fault (stopped)
> LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
>
>
> A little interpretation: 4497 hits the creation breakpoint. We attach
> to 4499. 4499 hits the common_routine breakpoint. We stop 4497. It
> hits the breakpoint at thread creation again for the next thread. We
> PTRACE_CONT 4497 again trying to get the SIGSTOP, and get another
> SIGTRAP - probably we were backed up from the breakpoint last time so
> we hit it again. We try _again_, and SIGSEGV because we're on the
> second byte of a multi-byte instruction, the first byte having been
> replaced by a breakpoint.
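
To make the failure mode above concrete, here is a toy model of the PC arithmetic, assuming an x86 one-byte int3 breakpoint. It is only a sketch: the code buffer, struct breakpoint, and all function names are hypothetical and are not taken from lin-lwp.c. It shows that the PC reported after the trap sits one byte past the breakpoint, so resuming without backing up lands inside a multi-byte instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BREAKPOINT_OPCODE 0xCC  /* x86 int3, one byte long */

/* Hypothetical record of one inserted breakpoint. */
struct breakpoint {
    size_t addr;        /* offset of the patched byte in the code buffer */
    uint8_t saved_byte; /* original first byte of the instruction */
};

/* Overwrite the first byte of the instruction at addr with int3. */
static void insert_breakpoint(uint8_t *code, size_t addr, struct breakpoint *bp)
{
    bp->addr = addr;
    bp->saved_byte = code[addr];
    code[addr] = BREAKPOINT_OPCODE;
}

/* PC reported after the trap: one byte past the int3. */
static size_t pc_after_trap(const struct breakpoint *bp)
{
    return bp->addr + 1;
}

/* PC to resume from: back up onto the start of the instruction. */
static size_t pc_to_resume(size_t reported_pc)
{
    assert(reported_pc > 0);
    return reported_pc - 1;
}

/* Nonzero if pc lies strictly inside the instruction, i.e. past its
   first byte but before its end. */
static int pc_is_mid_instruction(size_t pc, size_t insn_start, size_t insn_len)
{
    return pc > insn_start && pc < insn_start + insn_len;
}
```

With a five-byte instruction at offset 0, the reported PC of 1 is mid-instruction; only the backed-up PC of 0 is a valid place to resume, and only after the saved byte has been put back or stepped over.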
>
> Life explodes.
>
>
> So:
> - stop_wait_callback should be fixed to not be so dumb when this
> happens.
> - we need to figure out how we got into this mess.
> - and why the SIGSTOP never showed up.
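
For the first item in the list above, one possible shape is a bounded loop in place of the recursion. This is a minimal sketch and not a patch against lin-lwp.c: wait_for_sigstop, next_stop_fn, the scripted event source, and the retry bound of 16 are all made up for illustration, and the real callback would have to save each pending status rather than just count it.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

enum { MAX_STOP_RETRIES = 16 };  /* arbitrary illustrative bound */

/* Hypothetical stand-in for "resume the LWP and waitpid for it":
   returns the next stop signal the LWP reports. */
typedef int (*next_stop_fn)(void *ctx);

/* Wait for the SIGSTOP we sent. Any other signal is counted as pending
   and we retry in a loop instead of recursing. Returns 0 once SIGSTOP
   arrives, -1 if the retry budget runs out. */
static int wait_for_sigstop(next_stop_fn next_stop, void *ctx, int *pending)
{
    for (int attempt = 0; attempt < MAX_STOP_RETRIES; attempt++) {
        int sig = next_stop(ctx);
        if (sig == SIGSTOP)
            return 0;
        (*pending)++;
    }
    return -1;
}

/* Scripted event source for experiments: replays sigs[] in order and
   then repeats the last entry forever. */
struct script {
    const int *sigs;
    size_t len;
    size_t pos;
};

static int scripted_next_stop(void *ctx)
{
    struct script *s = ctx;
    assert(s->len > 0);
    int sig = s->sigs[s->pos];
    if (s->pos + 1 < s->len)
        s->pos++;
    return sig;
}
```

An LWP that faults with SIGSEGV on every resume, as in the log, then exhausts the retry budget and returns an error instead of consuming the stack.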
>
> I avoid this entire foul issue in gdbserver by not backtracking and
> resuming the application; instead I just set a flag marking the next
> SIGSTOP as "expected". It's still not perfect but it's a great deal
> better. I can do even better when I have some time to play with
> PTRACE_GETSIGINFO.
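
The "expected SIGSTOP" flag described above might look roughly like the following. This is a guess at the idea only, not gdbserver's actual code: struct lwp_state, request_stop, and should_report are hypothetical names, and the real thing would also send the signal and track the LWP's status.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical per-LWP bookkeeping. */
struct lwp_state {
    bool stop_expected;  /* we sent a SIGSTOP that has not been seen yet */
};

/* Record that a SIGSTOP is on its way to this LWP. Real code would also
   deliver the signal; this sketch only tracks the flag. */
static void request_stop(struct lwp_state *lwp)
{
    lwp->stop_expected = true;
}

/* Decide what to do with a signal the LWP just reported. Returns true
   if it is a real event to hand back to the debugger, false if it is
   the SIGSTOP we asked for and can be swallowed. The LWP is never
   resumed just to go looking for the SIGSTOP. */
static bool should_report(struct lwp_state *lwp, int sig)
{
    if (sig == SIGSTOP && lwp->stop_expected) {
        lwp->stop_expected = false;
        return false;
    }
    return true;
}
```

If the LWP reports a SIGTRAP first, that event is kept and the flag stays set, so the SIGSTOP that arrives later is quietly dropped whenever it shows up.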
>
> I'm waiting for GDB to tell me how we got here. The backtrace is more
> than 40K frames, since I forgot to shrink the stack limit. 50K...
> 170K... ooh!
>
> #174697 0x080e1724 in stop_wait_callback (lp=0x0, data=0xbffff450)
> at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:830
> #174698 0x080e033d in iterate_over_lwps (callback=0x80e12d0 <stop_wait_callback>, data=0x1181)
> at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:293
> #174699 0x080e251e in lin_lwp_wait (ptid={pid = -1, lwp = 0, tid = 0}, ourstatus=0x72)
> at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:1499
> #174700 0x08128ca3 in thread_db_wait (ptid={pid = -1, lwp = 0, tid = 0}, ourstatus=0xffffffff)
> at /opt/src/gdb/src-gdblinks/gdb/thread-db.c:846
> #174701 0x080bc19e in wait_for_inferior () at /opt/src/gdb/src-gdblinks/gdb/infrun.c:1003
> #174702 0x080bbf13 in proceed (addr=3221222720, siggnal=144, step=0)
> at /opt/src/gdb/src-gdblinks/gdb/infrun.c:814
> #174703 0x080b8fb0 in continue_command (proc_count_exp=0x0, from_tty=1)
> at /opt/src/gdb/src-gdblinks/gdb/infcmd.c:539
>
> It wasn't worth the wait. That didn't help much.
>
>