Solaris - procfs: couldn't find pid 32748 (kernel thread 21) in procinfo list

Pedro Alves palves@redhat.com
Tue Jun 2 14:53:38 GMT 2020


On 6/2/20 8:32 AM, Petr Sumbera via Gdb wrote:
> On 01.06.2020 21:12, Pedro Alves wrote:
>> On 6/1/20 12:39 PM, Petr Sumbera via Gdb wrote:
>>> The issue seems to be that the LWP exits and the status->kind is set to TARGET_WAITKIND_SPURIOUS:
>>>
>>> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/procfs.c;h=f6c6b0e71c16224d3e7345ca09e011cdcf06349a;hb=HEAD#l2214
>>>
>>> But instantly it's added into the list again here:
>>>
>>> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/infrun.c;h=95fc3bfe45930b53c33cb4de165db9c070449ad8;hb=HEAD#l5200
>>>
>>> But there is no longer such LWP in /proc.
>>>
>>> Any suggestion?
> 
> Thanks for looking at it!
> 
>> Either:
>>
>> - replace TARGET_WAITKIND_SPURIOUS with TARGET_WAITKIND_THREAD_EXITED, or,
> 
> With this I'm getting:
> 
> [LWP    21         exited]
> [LWP    21         exited]
> /builds/psumbera/userland-gdb-procinfo/components/gdb/gdb-9.2/gdb/thread.c:459: internal-error: void delete_thread_1(thread_info*, bool): Assertion `thr != nullptr' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> 
>> - replace
>>      status->kind = TARGET_WAITKIND_SPURIOUS;
>>      return retval;
>>    with
>>      goto wait_again;
>>    instead.
> 
> and with this:
> 
> [LWP    20         exited]
> [LWP    20         exited]
> /builds/psumbera/userland-gdb-procinfo/components/gdb/gdb-9.2/gdb/thread.c:459: internal-error: void delete_thread_1(thread_info*, bool): Assertion `thr != nullptr' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> 
> -- 
> 
> Note that in both cases there are TWO exits for one LWP. But LWP numbers differ.

You mean it was LWP 21 in one run and LWP 20 in the other?
Those were two separate runs, so a timing difference probably
changed which thread exited first.  That doesn't seem unusual.

Sounds like the patch below would fix it.  

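But why do we get two exits in a row for each LWP?  Oh, I guess
we get one for PR_SYSENTRY of the exit syscall, and another for
PR_SYSEXIT.  The second time around the thread is already gone from
the thread list, so find_thread_ptid returns null and the assertion
in delete_thread_1 fails.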
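
To make the double-exit case concrete, here is a minimal standalone
sketch of the guard the patch adds.  It is not GDB code: toy_thread,
toy_thread_list, find, remove and handle_exit_event are hypothetical
stand-ins for the thread list, find_thread_ptid and delete_thread, and
it assumes the same LWP is reported as exited twice.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for GDB's thread list; not the real GDB API.
struct toy_thread
{
  int lwp;
};

struct toy_thread_list
{
  std::vector<toy_thread> threads;

  // Plays the role of find_thread_ptid: nullptr when the LWP is unknown.
  toy_thread *find (int lwp)
  {
    for (toy_thread &t : threads)
      if (t.lwp == lwp)
        return &t;
    return nullptr;
  }

  // Plays the role of delete_thread: the caller must pass a valid thread.
  void remove (toy_thread *thr)
  {
    assert (thr != nullptr);
    int lwp = thr->lwp;
    threads.erase (std::remove_if (threads.begin (), threads.end (),
                                   [lwp] (const toy_thread &t)
                                   { return t.lwp == lwp; }),
                   threads.end ());
  }
};

// Handle one exit event.  Returns true if a thread was actually removed,
// false if the LWP was already gone (e.g. a second event for the same exit).
bool handle_exit_event (toy_thread_list &list, int lwp)
{
  toy_thread *thr = list.find (lwp);
  if (thr == nullptr)
    return false;
  list.remove (thr);
  return true;
}
```

Calling handle_exit_event twice for the same LWP removes the thread on
the first call and does nothing on the second, instead of tripping the
assertion in remove.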

From 0be6c82e754dd676e9f1259ab0f9a7849d985ffd Mon Sep 17 00:00:00 2001
From: Pedro Alves <pedro@palves.net>
Date: Tue, 2 Jun 2020 15:44:54 +0100
Subject: [PATCH] fix-solaris

---
 gdb/procfs.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gdb/procfs.c b/gdb/procfs.c
index f6c6b0e71c1..e2042f3edc4 100644
--- a/gdb/procfs.c
+++ b/gdb/procfs.c
@@ -2331,9 +2331,10 @@ procfs_target::wait (ptid_t ptid, struct target_waitstatus *status,
 		    if (print_thread_events)
 		      printf_unfiltered (_("[%s exited]\n"),
 					 target_pid_to_str (retval).c_str ());
-		    delete_thread (find_thread_ptid (this, retval));
-		    status->kind = TARGET_WAITKIND_SPURIOUS;
-		    return retval;
+		    thread_info *thr = find_thread_ptid (this, retval);
+		    if (thr != nullptr)
+		      delete_thread (thr);
+		    goto wait_again;
 		  }
 		else if (0)
 		  {

base-commit: f6eee2d098049afd18f90b8f4bb6a5d1a49d900c
-- 
2.14.5


