This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [RFA] Fix internal error with 'set debug infrun 1' under high load
On Sun, 24 Mar 2019 22:09:43 +0100
Philippe Waroquiers <philippe.waroquiers@skynet.be> wrote:
> On Sun, 2019-03-24 at 13:50 -0700, Kevin Buettner wrote:
> > Hi Philippe,
> >
> > There is definitely a bug in this section of code from infrun.c:
> >
> > else if (ws.kind == TARGET_WAITKIND_THREAD_EXITED
> >          || ws.kind == TARGET_WAITKIND_EXITED
> >          || ws.kind == TARGET_WAITKIND_SIGNALLED)
> >   {
> >     if (debug_infrun)
> >       {
> >         ptid_t ptid = ptid_t (ws.value.integer);
> >
> >         fprintf_unfiltered (gdb_stdlog,
> >                             "infrun: %s exited while "
> >                             "stopping threads\n",
> >                             target_pid_to_str (ptid).c_str ());
> >       }
> >   }
> >
> > This line...
> >
> > ptid_t ptid = ptid_t (ws.value.integer);
> >
> > ...doesn't make sense to me since ws.value.integer is supposed to
> > be the exit status for TARGET_WAITKIND_THREAD_EXITED and
> > TARGET_WAITKIND_EXITED.
> >
> > However, for TARGET_WAITKIND_SIGNALLED, the signal number is in
> > ws.value.sig (which, due to being part of a union, occupies some
> > of the same bytes as ws.value.integer).
> >
> > So trying to find the ptid in that manner makes no sense at all.
> >
> > I'm guessing that the ptid values are bogus when it does work.
> >
> > Does it work when you use
> >
> > ptid_t ptid = ptid_t (event_pid);
> >
> > instead?
> I guess you mean to only print event_ptid.
>
> Yes, that is working (the proposed patch was printing both
> event_ptid and the ptid derived from ws.value.integer, assuming
> that sometimes ws.value.integer was something relevant).
>
> Here is the trace I obtain after a few trials under high load:
> infrun: stop_all_threads, pass=0, iterations=0
> infrun: Thread 0x7ffff7fcfb40 (LWP 3587) not executing
> infrun: Thread 0x7ffff7310700 (LWP 3632) executing, need stop
> [Thread 0x7ffff7310700 (LWP 3632) exited]
> infrun: target_wait (-1.0.0, status) =
> infrun: 3587.3632.0 [LWP 3632],
> infrun: status->kind = thread exited, status = 0
> infrun: LWP 3632 exited while stopping threads
> infrun: Thread 0x7ffff7fcfb40 (LWP 3587) not executing
> infrun: stop_all_threads, pass=1, iterations=1
> infrun: Thread 0x7ffff7fcfb40 (LWP 3587) not executing
> infrun: stop_all_threads done
>
> The above is obtained with the patch:
>
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index ad7892105a..7f1339a917 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -4365,12 +4365,10 @@ stop_all_threads (void)
> {
> if (debug_infrun)
> {
> - ptid_t ptid = ptid_t (ws.value.integer);
> -
> fprintf_unfiltered (gdb_stdlog,
> "infrun: %s exited while "
> "stopping threads\n",
> - target_pid_to_str (ptid).c_str ());
> + target_pid_to_str (event_ptid).c_str ());
> }
> }
> else
>
You make a good point about trying to make use of ws.value.integer.
So, here are my suggestions:
1) Move TARGET_WAITKIND_SIGNALLED into another "else if" clause. It
doesn't make sense for the debug message to indicate that the process
has exited when it's actually been signalled.
2) Make the TARGET_WAITKIND_THREAD_EXITED / TARGET_WAITKIND_EXITED
case print the exit status and make the TARGET_WAITKIND_SIGNALLED case
print the signal. These are available (respectively) in ws.value.integer and
ws.value.sig.
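
To make the two suggestions concrete, here is a rough, standalone sketch of
the split. It is not a patch against infrun.c: waitkind_model,
waitstatus_model and describe_event_model are hypothetical stand-ins, the
signal is modelled as a plain int rather than GDB's signal type, and the
message wording is only an assumption about what the debug output could
look like.

```cpp
#include <string>

// Hypothetical stand-ins for GDB's wait kind and wait status types.
// The real definitions live in the GDB sources and have more members.
enum waitkind_model
{
  MODEL_WAITKIND_EXITED,
  MODEL_WAITKIND_THREAD_EXITED,
  MODEL_WAITKIND_SIGNALLED,
  MODEL_WAITKIND_STOPPED
};

struct waitstatus_model
{
  waitkind_model kind;
  union
  {
    int integer;  // exit status for the two "exited" kinds
    int sig;      // signal number for the "signalled" kind (assumed int here)
  } value;
};

// Build the debug line for an event seen while stopping threads.
// WHO is the already-formatted event ptid, i.e. what
// target_pid_to_str (event_ptid) would produce in the real code.
// Returns an empty string for kinds this sketch does not handle.
static std::string
describe_event_model (const std::string &who, const waitstatus_model &ws)
{
  if (ws.kind == MODEL_WAITKIND_THREAD_EXITED
      || ws.kind == MODEL_WAITKIND_EXITED)
    {
      // Suggestion 2, first half: report the exit status.
      return ("infrun: " + who + " exited with status "
              + std::to_string (ws.value.integer)
              + " while stopping threads\n");
    }
  else if (ws.kind == MODEL_WAITKIND_SIGNALLED)
    {
      // Suggestions 1 and 2: separate clause, report the signal.
      return ("infrun: " + who + " terminated by signal "
              + std::to_string (ws.value.sig)
              + " while stopping threads\n");
    }
  return "";
}
```

The point of the sketch is only that each union member is read under the
wait kind for which it is meaningful, and that the thread or process is
identified by the event ptid rather than by anything stored in ws.value.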
Kevin