[RFA] Fix for testsuite errors with gdbserver (remote)

Mon Feb 21 11:50:00 GMT 2011

On Friday 18 February 2011 20:30:15, Tom Tromey wrote:
> Tom> All that python_inferior_exit is doing to provoke the crash is calling
> Tom> get_current_arch.  Under what circumstances is this not safe?  I would
> Tom> have thought -- perhaps naively -- that it was always safe.

You should only call get_current_arch when preparing to run a command, as
default context gdbarch for the user commands (so "global" is really the
interpreter's default gdbarch).
That was part of the whole effort to eliminate the current_gdbarch global.
All other places should use the gdbarch associated with whatever
operation/object the code is doing/manipulating.  E.g., do something
in the current context while the inferior is running, use the current
frame's gdbarch.  Do something related to an obfile, use the obfile
gdbarch, etc.

At the top level, preparing to execute a command, the whole global
state (inferior_ptid, current_inferior, etc.) is in a state that
the core wants to see it in.  Several places in GDB need to
switch inferior_ptid to something else temporarily, at which point
you lose the mapping the core side relies on at most places, between
inferior_ptid, and a thread.  The tear down code paths 
do _not_ and cannot assume there's such a connection.
inferior_ptid will point to a whole process, not a thread.  There
are other places, but mostly at boundaries between the core, and
a target_ops.  Python is probably not being involved at such
times.  I hope, at least.

Now, the python hooks added a new subpath in the teardown path
that decided to query global state at a time the state is
transient, expecting to see it like the top level would.

Looking at the code,

 static void
 python_inferior_exit (struct inferior *inf)
 {
   struct cleanup *cleanup;
   LONGEST exit_code = -1;
   ptid_t ptidp;
   struct target_waitstatus status;

   cleanup = ensure_python_env (get_current_arch (), current_language);

   get_last_target_status (&ptidp, &status);

   exit_code = status.value.integer;

   if (exit_code >= 0
       && emit_exited_event (exit_code) < 0)
     gdbpy_print_stack ();

   do_cleanups (cleanup);
 }

I noted in <http://sourceware.org/ml/gdb/2011-02/msg00108.html>
that the get_current_arch() doesn't have any connection with
the INF argument.  It can be seen in the case at hand: we're
iterating over all inferiors and "exiting" them.  The current
inferior stays the same throughout the whole iteration.

I don't really understand the existence of python_gdbarch,
and why do we need to keep ensuring/preserving it throughout,
instead of using the most appropriate gdbarch whenever one
is necessary.

As I asked on the gdb@ list, what is the gdbarch that the
exited_event callback wants to see as current python_gdbarch?
One related to INF?  That's not where the whole current
context is pointing.  Or to whatever current thread or inferior
was selected at the top level before gdb started handling
the inferior's exit (think non-stop)?  Is it
just making sure the python_gdbarch remains unaltered
by the callback?  Then why not python_gdbarch instead of
get_current_arch()?  That is the "something" that I
was thinking this code may not be doing right.

That all said, it looks like the python api doesn't define
exactly what can be done on an ExitedEvent, and even if
we address the gdbarch issue, if it's intended that the
python user code can do whatever, we're likely to see python/gdb
calls there that try to do something that calls has_stack_frames,
and that will internal error too.  On that ground, I do think
we should put in Pierre's patch, as I'm not seeing what
possible harm can it do, and that gets things working again.
But I'll note that if the handler is allowed to assume the
whole api is available at this point, and that the current
global state can be used as if in the top-level, we'll
probably see other similar cases triggering.

-- 
Pedro Alves