[PATCH] Fix-for-multiple-thread-detection-in-AIX.patch

Aditya Kamath1 Aditya.Kamath1@ibm.com
Fri Jul 29 09:23:55 GMT 2022


Hi all,

Thank you for the feedback; it was very insightful.

Please find the new patch attached. [See Fix-for-multiple-thread-detection-in-AIX.patch]

However, a few points in the suggestions needed a closer look.

My explanations follow.

> I still think the proposed fix isn't really ideal.  Can you instead
> try to *temporarily* (i.e. using a scoped_restore) set up inferior_ptid
> in pd_activate() before calling pthdb_session_init(), with a comment
> explaining that this is needed for the callbacks?

This is a nice idea, Ulrich and Simon. However, consider a program that creates two threads in addition, of course, to its main thread. When the program creates the first thread, this solution works well.

So what is the problem with it? pd_active is set to 1 in pd_activate(). The next time we enter wait() in aix-thread.c for a new-thread event, the pthread debug library is already initialised, so we do not need to call pd_activate() again. That is why this condition was added:

  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
      && status->sig () == GDB_SIGNAL_TRAP)

We go directly to pd_update() instead. sync_threadlists() also calls pthread debug library functions, and the current thread is still null at that point, so we end up syncing the thread lists with a null thread. Either the debugger does not show the new threads at all, or we hit the assertion "Assertion `ecs->event_thread->control.trap_expected' failed". The assertion fires because only the thread that is allowed to step should raise a trap, yet on creation of a new thread we told the GDB core that the null thread raised the new-thread event and hence the trap.
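
The sequence above can be modelled with a small stand-alone sketch. This is not GDB code: current_pid, session_active, library_callback_ok, update, activate and wait_event are hypothetical stand-ins for inferior_ptid, pd_active, a libpthdebug callback, pd_update(), pd_activate() and wait(), and a pid of 0 models "no current thread". It only assumes that the temporary setting lives inside the activation path.

```cpp
#include <cassert>

// Hypothetical stand-in for inferior_ptid; 0 models "no current thread".
static int current_pid = 0;
// Hypothetical stand-in for the pd_active flag.
static bool session_active = false;

// Stand-in for a debug-library callback that needs a current thread.
static bool library_callback_ok ()
{
  return current_pid != 0;
}

// Stand-in for pd_update(): syncing thread lists goes through callbacks.
static bool update ()
{
  return library_callback_ok ();
}

// Stand-in for pd_activate() with the temporary setting applied only here.
static bool activate (int pid)
{
  assert (pid != 0);
  int saved = current_pid;
  current_pid = pid;          // temporary setting for the callbacks
  session_active = library_callback_ok ();
  bool synced = session_active && update ();
  current_pid = saved;        // restored before returning to the caller
  return synced;
}

// Stand-in for wait(): activation is skipped once the session is active.
static bool wait_event (int pid)
{
  if (!session_active)
    return activate (pid);
  return update ();           // no temporary setting on this path
}
```

In this model the first call to wait_event succeeds, while a second call takes the update path with no current thread and fails, which matches the behaviour described above.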

So what can the solution be? We create a scope and temporarily switch the thread in two places: just before pthdb_session_init() in pd_activate(), which takes care of session initialisation, and just before sync_threadlists() in pd_activate(). After that, sync_threadlists() can report to the GDB core the right thread that caused the event.
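
As a rough illustration of the idea (not the patch itself), the sketch below uses a tiny RAII guard in the spirit of a scoped restore. All names here (current_pid, scoped_current_pid, library_callback_ok, session_init, sync_lists) are hypothetical stand-ins, and a pid of 0 models "no current thread".

```cpp
#include <cassert>

// Hypothetical stand-in for inferior_ptid; 0 models "no current thread".
static int current_pid = 0;

// Minimal RAII guard: sets current_pid on construction and restores the
// previous value when the scope ends, even on an early return.
class scoped_current_pid
{
public:
  explicit scoped_current_pid (int pid)
    : m_saved (current_pid)
  {
    current_pid = pid;
  }

  ~scoped_current_pid ()
  {
    current_pid = m_saved;
  }

  scoped_current_pid (const scoped_current_pid &) = delete;
  scoped_current_pid &operator= (const scoped_current_pid &) = delete;

private:
  int m_saved;
};

// Stand-in for a debug-library callback that needs a current thread.
static bool library_callback_ok ()
{
  return current_pid != 0;
}

// Stand-in for the session initialisation step.
static bool session_init (int pid)
{
  assert (pid != 0);
  scoped_current_pid guard (pid);   // needed for the callbacks
  return library_callback_ok ();
}

// Stand-in for the thread-list sync step, guarded in the same way.
static bool sync_lists (int pid)
{
  assert (pid != 0);
  scoped_current_pid guard (pid);   // needed for the callbacks
  return library_callback_ok ();
}
```

Both steps see a valid current pid while they run, and the caller's view of the global is unchanged afterwards.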

Please see the patch, which implements this with comments. It also makes the inferior_ptid.pid () spacing correction that Simon asked for.

Let me know what you think. If there are no objections, let's push this so that AIX users can debug programs with multiple threads.
----------------------------------------------------------------------------------------------

>>To avoid this kind of problems, you can temporarily
>>switch thread (using scoped_restore_current_thread + switch_to_thread),
>>which will set all the current stuff mentioned above.  But sometimes
>>this isn't possible, especially in the wait method, because there isn't
>>always a thread_info for the ptid you are handling yet, so you can't
>>switch to it.

Since you are all more experienced than I am, I am sure future issues and their solutions are more visible to you than to me, and I would love to learn about them. Having said that, I assume you are thinking of a fork() event. If we returned the child process ID and then called switch_to_thread(current_target, child_ptid), we would get an assertion saying inf->thread does not exist, and rightly so. That is where functions like ourstatus->set_forked(child_ptid) come into the picture: we pass the new process's information through the status and return, from beneath->wait to aix_thread_target::wait(), the ptid of the parent process, which does have a thread. That way we do not hit the problem of an inferior with no thread when we use switch_to_thread(current_target, ptid) on AIX, for the time being at least.
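
To illustrate the fork case, here is a small stand-alone model, not GDB code. The names wait_status, pids_with_thread, can_switch_to and wait_reporting_fork are hypothetical; the model only assumes that the child is reported through the status while the returned pid belongs to the parent, which already has a thread.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for a wait status that can carry a fork report.
struct wait_status
{
  bool forked = false;
  int child_pid = 0;

  void set_forked (int pid)
  {
    forked = true;
    child_pid = pid;
  }
};

// Pids that already have a thread known to the debugger in this model.
static std::set<int> pids_with_thread;

// Stand-in for a thread switch: possible only if a thread exists.
static bool can_switch_to (int pid)
{
  return pids_with_thread.count (pid) != 0;
}

// Stand-in for the lower-level wait: the child is reported in the
// status, and the parent's pid is returned as the event pid.
static int wait_reporting_fork (int parent_pid, int child_pid,
                                wait_status &status)
{
  assert (parent_pid != child_pid);
  status.set_forked (child_pid);
  return parent_pid;
}
```

Because the returned pid is the parent's, a later switch to it finds an existing thread, while the child is only described in the status.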

Hopefully we are thinking along the same lines and the solution for multiple threads is fair.

Have a nice day ahead.

Thank you,

Regards,
Aditya.







________________________________
From: Simon Marchi <simark@simark.ca>
Sent: 25 July 2022 21:00
To: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>; Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; simon.marchi@efficios.com <simon.marchi@efficios.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Subject: [EXTERNAL] Re: [PATCH] Fix-for-multiple-thread-detection-in-AIX.patch



On 2022-07-25 08:21, Ulrich Weigand wrote:
>
> Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>
>> The cause of the bug: the GDB core has called
>> switch_to_no_thread(), i.e. there is no current thread until we
>> return the pid from wait(). So, when pd_activate() is called from
>> wait() in aix-thread.c and calls pthdb_session_init(), we
>> receive PTHDB_NOT_THREADED.
>
> Thanks for the explanation.  I wasn't aware the use of
> inferior_ptid happens in some of the callback routines used
> by the pthdb_session_init() library routine.

Thanks, me neither, but it makes sense.

> I still think the proposed fix isn't really ideal.  Can you instead
> try to *temporarily* (i.e. using a scoped_restore) set up inferior_ptid
> in pd_activate() before calling pthdb_session_init(), with a comment
> explaining that this is needed for the callbacks?

That sounds like a good idea, this way, from the point of view of the
caller of pd_activate, pd_activate does not care about global state.
That's how we can do baby steps towards relying less on global state
implicitly.  The next step could be to try to make each individual
callback switch to the right global context, based on what they need.

You just need to be careful, some parts of GDB expect inferior_ptid, the
current thread, the current inferior and the current program space to be
sync'ed.  If you just set inferior_ptid,  you need to make sure to only
call functions that use inferior_ptid, not the other current stuff.
There is no practical way to know this, you have to carefully inspect
what is called.  To avoid this kind of problems, you can temporarily
switch thread (using scoped_restore_current_thread + switch_to_thread),
which will set all the current stuff mentioned above.  But sometimes
this isn't possible, especially in the wait method, because there isn't
always a thread_info for the ptid you are handling yet, so you can't
switch to it.
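
A small stand-alone model may help illustrate this point. It is not GDB code: thread_rec, current_pid, current_thread, thread_table, globals_in_sync and switch_to_known_thread are hypothetical stand-ins, and only two of the "current" globals are modelled. Setting the pid alone leaves the globals out of sync, while a full switch keeps them consistent but needs an existing thread record.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for a thread_info record.
struct thread_rec
{
  int pid;
};

static int current_pid = 0;                     // stand-in for inferior_ptid
static thread_rec *current_thread = nullptr;    // stand-in for the current thread
static std::map<int, thread_rec> thread_table;  // threads known so far

// True when the two "current" globals describe the same thread.
static bool globals_in_sync ()
{
  if (current_thread == nullptr)
    return current_pid == 0;
  return current_thread->pid == current_pid;
}

// Sets every "current" global together, but only works when a record
// for the pid already exists; otherwise nothing is changed.
static bool switch_to_known_thread (int pid)
{
  auto it = thread_table.find (pid);
  if (it == thread_table.end ())
    return false;
  current_thread = &it->second;
  current_pid = pid;
  assert (current_thread->pid == current_pid);
  return true;
}
```

In the wait path the record for a new pid may not exist yet, so only the first, pid-only approach is available there, and the code called under it must not rely on the other globals.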

Given the AIX target only supports one inferior for now, the current
inferior and program space should be correct.  But to support
multi-inferior, it will be important to keep that in mind.  You might
have to switch to the right inferior in addition to setting
inferior_ptid in pd_activate.

This hunk in the patch:

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index 4c9195a7f12..91466a17647 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -976,7 +976,7 @@ pd_enable (void)
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
      activate thread debugging.  */
-  pd_activate (1);
+  pd_activate (inferior_ptid.pid());
 }

looks right to me (except the missing space before the parenthesis).  It
looks like an oversight in my "gdb: fix
{rs6000_nat_target,aix_thread_target}::wait to not use inferior_ptid"
patch, I forgot to update that call to pd_activate.  Note that the old
parameter to pd_activate was SET_INFPID, and if set, pd_update would
change the current thread to reflect the thread ptid, if thread
debugging was enabled.  The current code no longer does that.  If that
was important, we can re-introduce it here: make pd_enable switch to the
thread with the ptid returned by pd_activate.

Simon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Fix-for-multiple-thread-detection-in-AIX.patch
Type: application/octet-stream
Size: 2076 bytes
Desc: 0001-Fix-for-multiple-thread-detection-in-AIX.patch
URL: <https://sourceware.org/pipermail/gdb-patches/attachments/20220729/8ae3caab/attachment.obj>

