This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.
Re: [PATCH 00/16 v3] Linux extended-remote fork and exec events
- From: Pedro Alves <palves at redhat dot com>
- To: Don Breazeal <donb at codesourcery dot com>, gdb-patches at sourceware dot org
- Date: Thu, 13 Nov 2014 14:58:48 +0000
- Subject: Re: [PATCH 00/16 v3] Linux extended-remote fork and exec events
- Authentication-results: sourceware.org; auth=none
- References: <1408580964-27916-1-git-send-email-donb at codesourcery dot com> <1414798134-11536-1-git-send-email-donb at codesourcery dot com> <5464B44B dot 3090406 at redhat dot com> <5464B6A0 dot 9020602 at redhat dot com>
On 11/13/2014 01:48 PM, Pedro Alves wrote:
> On 11/13/2014 01:38 PM, Pedro Alves wrote:
>> On 10/31/2014 11:28 PM, Don Breazeal wrote:
>>>
>>> - gdb.threads/thread-execl.exp gives a couple of failures related to
>>> scheduler locking. As with the previous item, after spending some
>>> time on this I concluded that pursuing it further now would be
>>> feature-creep, and that this should be tracked with a bug report.
>>
>> Do you have more details on this?
>>
>> Looking at the exec race you mentioned, I thought that thread-execl.exp should
>> expose it, given that the point of the test is exactly a thread other than
>> the main thread execing. But then I stumbled on the fact that running it with
>> your series on top of current mainline often crashes gdb:
>>
>> $ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver thread-execl.exp"
>> ...
>> Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.threads/thread-execl.exp ...
>> ERROR: Process no longer exists
>>
>> === gdb Summary ===
>>
>> # of expected passes 9
>> # of unresolved testcases 1
>>
>> Odd that this doesn't trigger with native testing.
>
> Hmm, here's what valgrind shows (against gdbserver):
> $ valgrind ./gdb -data-directory=data-directory ./testsuite/gdb.threads/thread-execl -ex "tar extended-remote :9999" -ex "b thread_execler" -ex "c" -ex "set scheduler-locking on"
> ...
> Breakpoint 1, thread_execler (arg=0x0) at /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.threads/thread-execl.c:29
> 29 if (execl (image, image, NULL) == -1)
> (gdb) n
> Thread 32509.32509 is executing new program: /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/thread-execl
> [New Thread 32509.32532]
> ==32510== Invalid read of size 4
> ==32510== at 0x5AA7D8: delete_breakpoint (breakpoint.c:13989)
> ==32510== by 0x6285D3: delete_thread_breakpoint (thread.c:100)
> ==32510== by 0x628603: delete_step_resume_breakpoint (thread.c:109)
> ==32510== by 0x61622B: delete_thread_infrun_breakpoints (infrun.c:2928)
> ==32510== by 0x6162EF: for_each_just_stopped_thread (infrun.c:2958)
> ==32510== by 0x616311: delete_just_stopped_threads_infrun_breakpoints (infrun.c:2969)
> ==32510== by 0x616C96: fetch_inferior_event (infrun.c:3267)
> ==32510== by 0x63A2DE: inferior_event_handler (inf-loop.c:57)
> ==32510== by 0x4E0E56: remote_async_serial_handler (remote.c:11877)
> ==32510== by 0x4AF620: run_async_handler_and_reschedule (ser-base.c:137)
> ==32510== by 0x4AF6F0: fd_event (ser-base.c:182)
> ==32510== by 0x63806D: handle_file_event (event-loop.c:762)
> ==32510== Address 0xcf333e0 is 16 bytes inside a block of size 200 free'd
> ==32510== at 0x4A07577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==32510== by 0x77CB74: xfree (common-utils.c:98)
> ==32510== by 0x5AA954: delete_breakpoint (breakpoint.c:14056)
> ==32510== by 0x5988BD: update_breakpoints_after_exec (breakpoint.c:3765)
> ==32510== by 0x61360F: follow_exec (infrun.c:1091)
> ==32510== by 0x6186FA: handle_inferior_event (infrun.c:4061)
> ==32510== by 0x616C55: fetch_inferior_event (infrun.c:3261)
> ==32510== by 0x63A2DE: inferior_event_handler (inf-loop.c:57)
> ==32510== by 0x4E0E56: remote_async_serial_handler (remote.c:11877)
> ==32510== by 0x4AF620: run_async_handler_and_reschedule (ser-base.c:137)
> ==32510== by 0x4AF6F0: fd_event (ser-base.c:182)
> ==32510== by 0x63806D: handle_file_event (event-loop.c:762)
> ==32510==
> [Switching to Thread 32509.32532]
>
> Breakpoint 1, thread_execler (arg=0x0) at /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.threads/thread-execl.c:29
> 29 if (execl (image, image, NULL) == -1)
> (gdb)
Ah. The breakpoint in question is the step-resume breakpoint
of the non-main thread, the one we "nexted".
And the reason this doesn't trigger with native debugging is that
there the target deletes all threads from GDB's list _before_ the
exec event is reported:
...
infrun: stop_pc = 0x400640
infrun: stepped into subroutine
infrun: inserting step-resume breakpoint at 0x40076f <<<<<<
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0x7ffff7fc4700 (LWP 555)] at 0x400640
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun: -1 [process -1],
infrun: status->kind = ignore
infrun: TARGET_WAITKIND_IGNORE
infrun: prepare_to_wait
infrun: target_wait (-1, status) =
infrun: -1 [process -1],
infrun: status->kind = ignore
infrun: TARGET_WAITKIND_IGNORE
infrun: prepare_to_wait
[Thread 0x7ffff7fc4700 (LWP 555) exited]
Breakpoint 3, delete_thread (ptid=...) at /home/pedro/gdb/mygit/src/gdb/thread.c:371
371 delete_thread_1 (ptid, 0 /* not silent */);
(top-gdb)
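But when remote debugging, there are no thread exit events,
so GDB never deletes the thread that was "nexted". And that
thread still has a dangling reference to the step-resume
breakpoint, which update_breakpoints_after_exec (called from
follow_exec) has already freed, as the valgrind trace above shows.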
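With this hack, which stops the native target from deleting exited
threads and from enabling thread event reporting: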
---
gdb/linux-nat.c | 2 ++
gdb/linux-thread-db.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index e81a560..df1d6e7 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -892,7 +892,9 @@ exit_lwp (struct lwp_info *lp)
       if (print_thread_events)
         printf_unfiltered (_("[%s exited]\n"), target_pid_to_str (lp->ptid));

+#if 0
       delete_thread (lp->ptid);
+#endif
     }

   delete_lwp (lp->ptid);
diff --git a/gdb/linux-thread-db.c b/gdb/linux-thread-db.c
index c49b567..59f8ec1 100644
--- a/gdb/linux-thread-db.c
+++ b/gdb/linux-thread-db.c
@@ -597,6 +597,8 @@ enable_thread_event_reporting (void)
   td_err_e err;
   struct thread_db_info *info;

+  return;
+
   info = get_thread_db_info (ptid_get_pid (inferior_ptid));

   /* We cannot use the thread event reporting facility if these
--
1.9.3
we see the same bad references with native debugging too:
Breakpoint 1, thread_execler (arg=0x0) at /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.threads/thread-execl.c:29
29 if (execl (image, image, NULL) == -1)
(gdb) n
[Thread 0x7ffff7fc4700 (LWP 28506) exited]
process 24152 is executing new program: /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/thread-execl
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff7fc4700 (LWP 4647)]
==12378== Invalid read of size 4
==12378== at 0x5AA5E8: delete_breakpoint (breakpoint.c:13989)
==12378== by 0x6283E3: delete_thread_breakpoint (thread.c:100)
==12378== by 0x628413: delete_step_resume_breakpoint (thread.c:109)
==12378== by 0x61603B: delete_thread_infrun_breakpoints (infrun.c:2928)
==12378== by 0x6160FF: for_each_just_stopped_thread (infrun.c:2958)
==12378== by 0x616121: delete_just_stopped_threads_infrun_breakpoints (infrun.c:2969)
==12378== by 0x616AA6: fetch_inferior_event (infrun.c:3267)
==12378== by 0x63A0EE: inferior_event_handler (inf-loop.c:57)
==12378== by 0x4BF44A: handle_target_event (linux-nat.c:4439)
==12378== by 0x637E7D: handle_file_event (event-loop.c:762)
==12378== by 0x637364: process_event (event-loop.c:339)
==12378== by 0x637406: gdb_do_one_event (event-loop.c:391)
==12378== Address 0xcebf910 is 16 bytes inside a block of size 200 free'd
==12378== at 0x4A07577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==12378== by 0x77C984: xfree (common-utils.c:98)
==12378== by 0x5AA764: delete_breakpoint (breakpoint.c:14056)
==12378== by 0x5986CD: update_breakpoints_after_exec (breakpoint.c:3765)
==12378== by 0x61341F: follow_exec (infrun.c:1091)
==12378== by 0x61850A: handle_inferior_event (infrun.c:4061)
==12378== by 0x616A65: fetch_inferior_event (infrun.c:3261)
==12378== by 0x63A0EE: inferior_event_handler (inf-loop.c:57)
==12378== by 0x4BF44A: handle_target_event (linux-nat.c:4439)
==12378== by 0x637E7D: handle_file_event (event-loop.c:762)
==12378== by 0x637364: process_event (event-loop.c:339)
==12378== by 0x637406: gdb_do_one_event (event-loop.c:391)
==12378==
[Switching to Thread 0x7ffff7fc4700 (LWP 4647)]
Breakpoint 1, thread_execler (arg=0x0) at /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.threads/thread-execl.c:29
29 if (execl (image, image, NULL) == -1)
(gdb)