This is the mail archive of the mailing list for the GDB project.


Re: [PATCH] PR threads/20743: Don't attempt to suspend or resume exited threads.

On 12/28/2016 11:37 AM, John Baldwin wrote:
On Wednesday, December 28, 2016 09:07:07 AM Vasil Dimov wrote:
On Tue, Dec 27, 2016 at 13:03:27 -0800, John Baldwin wrote:
I have tried changing fbsd_wait() to return a TARGET_WAITKIND_SPURIOUS
instead of explicitly continuing the process, but that doesn't help, and it
means that the ptid being returned is still T1 in that case.

I'm not sure if I should explicitly be calling delete_exited_threads() in
fbsd_resume() before calling iterate_threads()?  Alternatively, fbsd_resume()
could use ALL_NONEXITED_THREADS() instead of iterate_threads() (it isn't
clear to me which of these is preferred since both are in use).
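As a rough illustration of the second option (skipping exited threads during the walk instead of deleting them first), here is a minimal standalone sketch. Everything in it is a hypothetical stand-in: demo_thread, demo_state and demo_suspend_others() are not GDB's thread_info, its thread states, or the ALL_NONEXITED_THREADS() macro, and no ptrace request is issued.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are not GDB's
   real thread_info or thread-state definitions.  */
enum demo_state { DEMO_STOPPED, DEMO_EXITED };

struct demo_thread
{
  int lwp;                /* kernel thread id */
  enum demo_state state;  /* DEMO_EXITED once the thread is gone */
  int suspended;          /* set to 1 when we "suspend" the thread */
};

/* Suspend every live thread except RESUME_LWP.  Exited threads are
   skipped, so nothing is ever requested for a thread the kernel no
   longer knows about.  Returns the number of threads suspended.  */
size_t
demo_suspend_others (struct demo_thread *threads, size_t count,
                     int resume_lwp)
{
  size_t suspended = 0;

  for (size_t i = 0; i < count; i++)
    {
      struct demo_thread *tp = &threads[i];

      if (tp->state == DEMO_EXITED)
        continue;               /* the filtering step */
      if (tp->lwp == resume_lwp)
        continue;               /* this thread is meant to run */

      tp->suspended = 1;
      suspended++;
    }

  return suspended;
}
```

The filter lives inside the walk, so the caller does not depend on the exited entries having been pruned beforehand.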

I added the assertion for my own sanity.  I suspect gdb should never try to
invoke target_resume() with a ptid of an exited thread, but if for some
reason it did the effect on FreeBSD would be a hang since we would suspend
all the other threads and when the process was continued via PT_CONTINUE it
would have nothing to do and would never return from wait().  I'd rather have
gdb fail an assertion in that case than hang.
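The fail-fast idea can be sketched in isolation as below. This is only a sketch under assumptions: resume_thread, resume_target_is_live() and checked_resume() are hypothetical names, plain assert() stands in for whatever assertion macro the real code uses, and the actual suspend and continue steps are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified thread record; not a GDB type.  */
struct resume_thread
{
  int lwp;     /* kernel thread id */
  int exited;  /* nonzero once the thread has exited */
};

/* Return 1 if LWP names a thread that is in the list and has not
   exited, 0 otherwise (including when LWP is not in the list).  */
int
resume_target_is_live (const struct resume_thread *threads,
                       size_t count, int lwp)
{
  for (size_t i = 0; i < count; i++)
    if (threads[i].lwp == lwp)
      return !threads[i].exited;

  return 0;
}

/* Refuse to resume a thread that has already exited.  Without the
   check, every other thread would be suspended and the process
   continued with nothing left to run, so the following wait would
   never return.  Returns the lwp that would be resumed.  */
int
checked_resume (const struct resume_thread *threads, size_t count,
                int lwp)
{
  assert (resume_target_is_live (threads, count, lwp));

  /* A real backend would suspend the other threads and continue the
     process here; that part is omitted from this sketch.  */
  return lwp;
}
```

An assertion failure here points directly at the caller that passed the stale ptid, which is easier to diagnose than a debugger stuck in wait().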


I am not sure if this is related, but since I get a hang I would rather
mention it: with John's patch (including the assert) gdb does not
emit the "ptrace: No such process" error, but when I attempt to quit,
it hangs:

No, this is a separate bug in the kernel whereby a process doesn't
treat PT_KILL as a detach-like event but incorrectly expects to keep
getting PT_CONTINUE events for a while until it finally exits.  I'm
working on writing up regression/unit tests for PT_KILL and then
fixing the bug.

I think the patch is mainly papering over a bigger problem. My guess is that the native fbsd backend is not doing something it should.

I'd check how linux-nat.c is doing things and then try to confirm the fbsd behavior is sane.

For example, I noticed linux-nat.c has exit_lwp (...), which handles deletion of both the thread information and the thread itself (the lwp). Even if it is the currently selected thread, the lwp *will* be removed from the list of existing lwps.

It doesn't make sense to keep a thread that has already exited in the list of threads we are manipulating.
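To make the idea concrete, here is a minimal sketch of dropping an entry from an lwp list as soon as its exit is seen. It is not the code from linux-nat.c: demo_lwp, demo_add_lwp(), demo_exit_lwp() and demo_count_lwps() are hypothetical names, and the sketch models only the list removal, not the associated thread information.

```c
#include <stdlib.h>

/* Hypothetical singly linked list of lwps; not GDB's lwp_info.  */
struct demo_lwp
{
  int lwp;
  struct demo_lwp *next;
};

/* Add LWP at the head of the list.  Returns the new node, or NULL
   if allocation fails.  */
struct demo_lwp *
demo_add_lwp (struct demo_lwp **head, int lwp)
{
  struct demo_lwp *node = malloc (sizeof *node);

  if (node == NULL)
    return NULL;
  node->lwp = lwp;
  node->next = *head;
  *head = node;
  return node;
}

/* Unlink and free the entry for LWP, wherever it sits in the list.
   Returns 1 if an entry was removed, 0 if LWP was not found.  */
int
demo_exit_lwp (struct demo_lwp **head, int lwp)
{
  for (struct demo_lwp **link = head; *link != NULL;
       link = &(*link)->next)
    if ((*link)->lwp == lwp)
      {
        struct demo_lwp *dead = *link;

        *link = dead->next;
        free (dead);
        return 1;
      }

  return 0;
}

/* Count the entries still on the list.  */
size_t
demo_count_lwps (const struct demo_lwp *head)
{
  size_t n = 0;

  for (; head != NULL; head = head->next)
    n++;
  return n;
}
```

Once the exited entry is gone, later walks over the list cannot pick it up by accident, so the suspend and resume paths need no special case for it.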
