This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.
Hello,

I have noticed the following problem when debugging a program which uses both threads and fork. The program is attached, and it was compiled by simply doing:

    % gnatmake -g a_test

The issue appears only randomly, but it seems to show up fairly reliably when using certain versions of GNU/Linux such as RHES7, or WRSLinux. I also see it on Ubuntu, but less reliably. Here is what I have found, debugging on WRSLinux (we set it up as a cross, but it should be the same with native GNU/Linux distros):

    % gdb a_test
    (gdb) break a_test.adb:30
    (gdb) break a_test.adb:39
    (gdb) target remote my_board:4444
    (gdb) continue
    Continuing.
    [...]
    [New Thread 866.868]
    [New Thread 866.869]
    [New Thread 870.870]
    /[...]/gdb/thread.c:89: internal-error: thread_info* inferior_thread(): Assertion `tp' failed.
    A problem internal to GDB has been detected,
    further debugging may prove unreliable.
    Quit this debugging session? (y or n)

The error happens because GDBserver returns a list of threads to GDB where a new thread has a different PID (870 in the case above, instead of 866). This makes remote_notice_new_inferior think that there is a new inferior (which is actually true, in fact), thus causing it to call remote_add_inferior, which does the following:

    /* In the traditional debugging scenario, there's a 1-1 match
       between program/address spaces.  We simply bind the inferior
       to the program space's address space.  */
    inf = current_inferior ();
    inferior_appeared (inf, pid);

These two lines cause the PID of the current inferior to be changed to the PID of the new process (from the fork). This is where I *think* we're making the mistake; see below...

However, remote_notice_new_inferior also calls notice_new_inferior a few lines later, which starts by setting up a cleanup to restore the current thread:

    if (!ptid_equal (inferior_ptid, null_ptid))
      make_cleanup_restore_current_thread ();

At that point in time, the current thread, which is the thread that received the event, is one of the threads belonging to the original inferior (pid=866). That's what we set up to restore. Unfortunately, the restoration does not go according to plan, because our inferior list still has only one inferior in it, except that its PID is no longer 866, but rather 870. If we look at thread.c::do_restore_current_thread_cleanup, we see:

    tp = find_thread_ptid (old->inferior_ptid);

    /* If the previously selected thread belonged to a process that has
       in the mean time been deleted (due to normal exit, detach, etc.),
       then don't revert back to it, but instead simply drop back to no
       thread selected.  */
    if (tp
        && find_inferior_ptid (tp->ptid) != NULL)
      restore_current_thread (old->inferior_ptid);
    else
      {
        restore_current_thread (null_ptid);
        set_current_inferior (find_inferior_id (old->inf_id));
      }

In our case, find_inferior_ptid no longer finds an inferior with the old PID, and so we go into the else branch, causing us to set inferior_ptid to null_ptid. This causes problems a little later, when doing a normal_stop:

    /* Notify observers about the stop.  This is where the interpreters
       print the stop event.  */
    if (!ptid_equal (inferior_ptid, null_ptid))
      observer_notify_normal_stop (inferior_thread ()->control.stop_bpstat,
                                   stop_print_frame);
    else
      observer_notify_normal_stop (NULL, stop_print_frame);

In our case, we're in the "else" branch.

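To make the sequence above easier to follow, here is a minimal, self-contained model of it. This is only a sketch: every name in it (toy_inferior, toy_thread, toy_rebind_current_inferior, toy_can_restore) is hypothetical and none of it is GDB code. It assumes one inferior with PID 866 and two threads, rewrites the inferior's PID the way the inferior_appeared call is described as doing, and then evaluates the same kind of condition the restore cleanup tests.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GDB's inferior and thread lists.  Every name in
   this block is hypothetical; none of it is GDB code.  */
struct toy_inferior { int num; int pid; };
struct toy_thread { int pid; int lwp; };

/* One inferior (pid 866) with two threads, as in the session above.  */
static struct toy_inferior toy_inferiors[] = { { 1, 866 } };
static struct toy_thread toy_threads[] = { { 866, 868 }, { 866, 869 } };

#define TOY_COUNT(a) (sizeof (a) / sizeof ((a)[0]))

/* Return the inferior whose PID is PID, or NULL if there is none.  */
static struct toy_inferior *
toy_find_inferior_pid (int pid)
{
  size_t i;

  for (i = 0; i < TOY_COUNT (toy_inferiors); i++)
    if (toy_inferiors[i].pid == pid)
      return &toy_inferiors[i];
  return NULL;
}

/* Return the thread identified by (PID, LWP), or NULL if unknown.  */
static struct toy_thread *
toy_find_thread (int pid, int lwp)
{
  size_t i;

  for (i = 0; i < TOY_COUNT (toy_threads); i++)
    if (toy_threads[i].pid == pid && toy_threads[i].lwp == lwp)
      return &toy_threads[i];
  return NULL;
}

/* Model of the effect described for inferior_appeared (inf, pid) when
   INF is the current inferior: the existing entry gets the new PID.  */
void
toy_rebind_current_inferior (int new_pid)
{
  assert (new_pid > 0);
  toy_inferiors[0].pid = new_pid;
}

/* Model of the condition tested by the restore cleanup.  Return 1 if
   the saved thread could be reselected, 0 if we would fall back to
   "no thread selected".  */
int
toy_can_restore (int saved_pid, int saved_lwp)
{
  struct toy_thread *tp = toy_find_thread (saved_pid, saved_lwp);

  return tp != NULL && toy_find_inferior_pid (tp->pid) != NULL;
}
```

In this model, toy_can_restore (866, 868) succeeds before the rebind and fails after toy_rebind_current_inferior (870): the thread is still in the thread list, but no inferior carries its PID any more.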
This leads to cli_on_normal_stop, which calls print_stop_event -> print_stop_location, which starts by calling inferior_thread:

    struct thread_info *tp = inferior_thread ();

And looking at inferior_thread, we see:

    struct thread_info *tp = find_thread_ptid (inferior_ptid);
    gdb_assert (tp);

We trip the assertion because inferior_ptid is the null_ptid.

At first sight, I think that the main problem is that we muck with the current_inferior's pid when we really shouldn't. I'm not really sure how the new PID should be handled though, which is why I'm asking for advice here.

I think it also unearthed a secondary issue: it looks like normal_stop really isn't prepared to handle a null inferior_ptid, even though the fact that we notify the observers after having checked that inferior_ptid is null indicates that it should be. But what does it mean to be showing where we stopped, when we don't know which thread caused the stop? I think discussing this separately would be best, but I wanted to mention it here, so it doesn't get overlooked.

Any advice on how I should be fixing the issue?

Thanks!
-- 
Joel
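As an illustration of the secondary issue only, and not as a proposed patch, the sketch below shows one possible shape for a stop-location printer that tolerates "no thread selected" instead of reaching an asserting lookup. All names (sketch_thread, sketch_current_thread, sketch_print_stop_location) are hypothetical, a ptid is reduced to a (pid, lwp) pair, and pid == 0 is assumed to play the role of null_ptid. Whether skipping the location is the right behavior is exactly the open question raised above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not GDB code.  A thread is identified by a
   (pid, lwp) pair, and pid == 0 plays the role of null_ptid.  */
struct sketch_thread { int pid; int lwp; };

static struct sketch_thread sketch_threads[] = { { 866, 868 }, { 866, 869 } };

/* Return the thread identified by (PID, LWP), or NULL if unknown.  */
static struct sketch_thread *
sketch_find_thread (int pid, int lwp)
{
  size_t i;
  size_t n = sizeof (sketch_threads) / sizeof (sketch_threads[0]);

  for (i = 0; i < n; i++)
    if (sketch_threads[i].pid == pid && sketch_threads[i].lwp == lwp)
      return &sketch_threads[i];
  return NULL;
}

/* Asserting lookup, with the same shape as the inferior_thread snippet
   quoted above: it must only be called when a thread is selected.  */
struct sketch_thread *
sketch_current_thread (int pid, int lwp)
{
  struct sketch_thread *tp = sketch_find_thread (pid, lwp);

  assert (tp != NULL);
  return tp;
}

/* Return 1 if a stop location could be reported for the selected
   thread, or 0 if no thread is selected and the location is skipped.
   The early return is the assumption being illustrated here.  */
int
sketch_print_stop_location (int pid, int lwp)
{
  if (pid == 0)
    return 0;

  (void) sketch_current_thread (pid, lwp);
  return 1;
}
```

With no thread selected, sketch_print_stop_location (0, 0) returns 0 without touching the asserting lookup, while a known thread such as (866, 868) still goes through it.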
Attachment:
a_test.adb
Description: Text document