This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [RFA] Fix internal error with 'set debug infrun 1' under high load


On Sun, 24 Mar 2019 22:09:43 +0100
Philippe Waroquiers <philippe.waroquiers@skynet.be> wrote:

> On Sun, 2019-03-24 at 13:50 -0700, Kevin Buettner wrote:
> > Hi Philippe,
> > 
> > There is definitely a bug in this section of code from infrun.c:
> > 
> > 	  else if (ws.kind == TARGET_WAITKIND_THREAD_EXITED
> > 		   || ws.kind == TARGET_WAITKIND_EXITED
> > 		   || ws.kind == TARGET_WAITKIND_SIGNALLED)
> > 	    {
> > 	      if (debug_infrun)
> > 		{
> > 		  ptid_t ptid = ptid_t (ws.value.integer);
> > 
> > 		  fprintf_unfiltered (gdb_stdlog,
> > 				      "infrun: %s exited while "
> > 				      "stopping threads\n",
> > 				      target_pid_to_str (ptid).c_str ());
> > 		}
> > 	    }
> > 
> > This line...
> > 
> > 		  ptid_t ptid = ptid_t (ws.value.integer);
> > 
> > ...doesn't make sense to me since ws.value.integer is supposed to
> > be the exit status for TARGET_WAITKIND_THREAD_EXITED and
> > TARGET_WAITKIND_EXITED.
> > 
> > However, for TARGET_WAITKIND_SIGNALLED, the signal number is in
> > ws.value.sig (which, due to being part of a union occupies some
> > of the same bytes as ws.value.integer).
> > 
> > So trying to find the ptid in that manner makes no sense at all.
> > 
> > I'm guessing that the ptid values are bogus when it does work.
> > 
> > Does it work when you use 
> > 
> > 		  ptid_t ptid = ptid_t (event_pid);
> > 
> > instead?  
> I guess you mean to only print event_ptid.
> 
> Yes, that is working (the proposed patch was printing both
> event_ptid and the ptid derived from ws.value.integer, assuming
> that sometimes ws.value.integer was something relevant).
> 
> Here is the trace I obtain after a few trials under high load:
> infrun: stop_all_threads, pass=0, iterations=0
> infrun:   Thread 0x7ffff7fcfb40 (LWP 3587) not executing
> infrun:   Thread 0x7ffff7310700 (LWP 3632) executing, need stop
> [Thread 0x7ffff7310700 (LWP 3632) exited]
> infrun: target_wait (-1.0.0, status) =
> infrun:   3587.3632.0 [LWP 3632],
> infrun:   status->kind = thread exited, status = 0
> infrun: LWP 3632 exited while stopping threads
> infrun:   Thread 0x7ffff7fcfb40 (LWP 3587) not executing
> infrun: stop_all_threads, pass=1, iterations=1
> infrun:   Thread 0x7ffff7fcfb40 (LWP 3587) not executing
> infrun: stop_all_threads done
> 
> The above is obtained with the patch:
> 
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index ad7892105a..7f1339a917 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -4365,12 +4365,10 @@ stop_all_threads (void)
>             {
>               if (debug_infrun)
>                 {
> -                 ptid_t ptid = ptid_t (ws.value.integer);
> -
>                   fprintf_unfiltered (gdb_stdlog,
>                                       "infrun: %s exited while "
>                                       "stopping threads\n",
> -                                     target_pid_to_str (ptid).c_str ());
> +                                     target_pid_to_str (event_ptid).c_str ());
>                 }
>             }
>           else
> 

You make a good point about trying to make use of ws.value.integer.

So, here are my suggestions:

1) Move TARGET_WAITKIND_SIGNALLED into its own "else if" clause.  It
doesn't make sense for the debug message to say that the process
has exited when it was actually terminated by a signal.

2) Make the TARGET_WAITKIND_THREAD_EXITED / TARGET_WAITKIND_EXITED
case print the exit status and make the TARGET_WAITKIND_SIGNALLED case
print the signal.  These are available (respectively) in ws.value.integer and
ws.value.sig.
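
To make the two suggestions concrete, here is a rough standalone sketch
of the shape I have in mind.  It is not a patch against infrun.c: the
enum, the struct, and describe_event are hypothetical stand-ins for
target_waitkind, target_waitstatus, and the debug fprintf_unfiltered
call, the signal is shown as a plain int rather than enum gdb_signal,
and the thread_name argument stands in for
target_pid_to_str (event_ptid).c_str ().  The exact message wording is
also only an assumption.

```cpp
#include <string>

// Hypothetical stand-ins for GDB's wait kinds; not the real definitions.
enum waitkind
{
  WAITKIND_THREAD_EXITED,
  WAITKIND_EXITED,
  WAITKIND_SIGNALLED,
  WAITKIND_STOPPED
};

// Simplified stand-in for target_waitstatus.  Both members share
// storage, so only the one matching KIND is meaningful.
struct waitstatus
{
  waitkind kind;
  union
  {
    int integer;	// exit status for the two "exited" kinds
    int sig;		// signal number for the "signalled" kind
  } value;
};

// Build the debug line for an event seen while stopping threads.
// Returns an empty string for kinds this sketch does not report.
std::string
describe_event (const waitstatus &ws, const std::string &thread_name)
{
  if (ws.kind == WAITKIND_THREAD_EXITED || ws.kind == WAITKIND_EXITED)
    {
      // Exit cases: report the exit status, never reinterpret it as a ptid.
      return ("infrun: " + thread_name
	      + " exited while stopping threads, status = "
	      + std::to_string (ws.value.integer) + "\n");
    }
  else if (ws.kind == WAITKIND_SIGNALLED)
    {
      // Signalled case gets its own clause and its own wording.
      return ("infrun: " + thread_name
	      + " signalled while stopping threads, signal = "
	      + std::to_string (ws.value.sig) + "\n");
    }

  return "";
}
```

The point of splitting the clauses is that each one reads only the
union member that is valid for its kind, and the thread is always
identified from the event's ptid rather than from ws.value.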

Kevin

