This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v3 01/17] Fix and test "checkpoint" in non-stop mode


Pedro Alves writes:
 > Note this is whole checkpoints feature is old code, that predates
 > the multi-inferior support for a long while.  It has many
 > limitations (e.g., relies on fork, so can't use it on multi-threaded
 > programs), but I/we've kept it limping along, at least in
 > the single-inferior case.

Hmmm.  Got it.

 > > Re: checkpoint-ns.exp:
 > > If we're going to have one "make check" involve both all-stop
 > > and non-stop testing (as opposed to doing one "make check" with all-stop
 > > and another one with non-stop), I'd rather see a more table-driven approach
 > > (the details don't matter to me much other than I'd rather have one file
 > > than a proliferation of foo-ns.exp files).
 > > Parallelizing that may involve an extra step, but that's ok.
 > 
 > Yes, it sounds like something we may want to get at at some point,
 > but I can't invest time right now on designing something
 > more elaborate.  I suggest letting this first example go in,
 > and see where it takes us, what patterns start emerging.
 > For example, I realized that we don't really need the extra
 > variable in checkpoint/checkpoint-ns.exp; we can use GDBFLAGS
 > instead, though I'm not sure that will/would be sufficient for
 > other uses.
 > 
 > If I understand your "extra step" correctly, you're suggesting a generated
 > file.  So there's still an extra file anyway?  I just think of the separate
 > file as moving the test's body to a helper "library" instead of a
 > procedure, with the bonus that we can call either variant from the command line,
 > instead of being forced to run both variants in sequence, as in
 > the foreach+procedure way that we've been using in many files.
 > 
 > Whether separate file or foreach, it seems to be that it's the same thing
 > wrt to table-driven vs procedural, so I can't see how an extra file makes
 > things any worse.  I will personally keep tending to pick the "foreach
 > variant" route as first choice, as done in numerous tests now.  It's just
 > that checkpoint.exp is longish in both line number and time it takes to run.

Ah.
The axis on which I'm worried is whether this will turn out to be
1 extra file or 1000. One extra file per test doesn't scale.
And trying to reign in 100 files, or whatever the threshold is
before it's clear that we need to change course, will be more
painful than having a table-driven approach that can be simple
now and still grow as needed. (*1)

Another way to go would be really just have 1 extra file (or 1 extra
file per gdb.foo subdir or some such), and this file
documents all the tests intended to be run in different modes
within one "make check". One could even extend
this to be more than just all-stop/non-stop.

Another way, though, is to have two "make check" runs:
One with all-stop and one with non-stop.
Tests that can't handle one or the other should be marked
as such anyway. How a test chooses to handle the choice
is an internal implementation decision regardless of which
route is chosen. Yeah, running some tests twice this way
won't enhance coverage (e.g., no point in running help.exp in
all-stop and non-stop), but we should be almost within epsilon
of being able to do this today (I know I've done it in the past),
and complete test runs only take a few minutes on any reasonably
beefy system (at least they do on mine).

(*1): A hybrid table driven approach could be to (effectively)
run the test twice using whatever mechanism that running
"make check" twice would use. This way only the specified
set of tests would be run twice (instead of running the whole testsuite
twice), and one can still manually run individual tests in whatever
mode (all-stop/non-stop/whatever) one chooses.

My counter-proposal would be to see the extent to which we can
just run the testsuite in all-stop and non-stop, and do that for now
while we work towards something that scales better.
Someone could get a basic table-driven scheme going in a day.

 > >  > +    {
 > >  > +      ptid_t pid_ptid;
 > >  > +
 > >  > +      /* Some targets still have execution when a process exits.
 > >  > +	 E.g., for "checkpoint", when when a fork exits and is
 > > 
 > > s/when when/when/
 > > 
 > >  > +	 mourned, linux-fork.c switches to another fork.  */
 > >  > +      pid_ptid = pid_to_ptid (ptid_get_pid (ecs->ptid));
 > >  > +      set_executing (pid_ptid, 0);
 > > 
 > > This is a bit confusing. I'm guessing ecs->ptid is for the inferior
 > > that just exited/signalled, but we're not changing pids here.
 > > I'm guessing we'll mourn the inferior later, but IWBN to elaborate
 > > on the comment a little, e.g., to say we always need to call set_executing
 > > here, even if the inferior exited/signalled (which the code didn't
 > > previously do). But it's not clear why "switches to another fork" comes
 > > into play here.
 > 
 > I've changed it this way:
 > 
 > diff --git c/gdb/infrun.c w/gdb/infrun.c
 > index 9c1fdc5..caf1ff5 100644
 > --- c/gdb/infrun.c
 > +++ w/gdb/infrun.c
 > @@ -3798,20 +3798,27 @@ handle_inferior_event (struct execution_control_state *ecs)
 >  
 >    /* Mark the non-executing threads accordingly.  In all-stop, all
 >       threads of all processes are stopped when we get any event
 > -     reported.  In non-stop mode, only the event thread stops.  If
 > -     we're handling a process exit in non-stop mode, there's nothing
 > -     to do, as threads of the dead process are gone, and threads of
 > -     any other process were left running.  */
 > +     reported.  In non-stop mode, only the event thread stops.  */
 >    if (!non_stop)
 >      set_executing (minus_one_ptid, 0);
 >    else if (ecs->ws.kind == TARGET_WAITKIND_SIGNALLED
 > -	   && ecs->ws.kind == TARGET_WAITKIND_EXITED)
 > +	   || ecs->ws.kind == TARGET_WAITKIND_EXITED)
 >      {
 >        ptid_t pid_ptid;
 >  
 > -      /* Some targets still have execution when a process exits.
 > -	 E.g., for "checkpoint", when when a fork exits and is
 > -	 mourned, linux-fork.c switches to another fork.  */
 > +      /* If we're handling a process exit in non-stop mode, even
 > +	 though threads haven't been deleted yet, one would think that
 > +	 there is nothing to do, as threads of the dead process will
 > +	 be soon deleted, and threads of any other process were left
 > +	 running.  However, on some targets, threads survive a process
 > +	 exit event.  E.g., for the "checkpoint" command, when the
 > +	 current checkpoint/fork exits, linux-fork.c automatically
 > +	 switches to another fork from within target_mourn_inferior,
 > +	 by associating the same inferior/thread to another fork.  We
 > +	 haven't mourned yet at this point, but we must mark any
 > +	 threads left in the process as not-executing so that
 > +	 finish_thread_state marks them stopped (in the user's
 > +	 perspective) if/when we present the stop to the user.  */
 >        pid_ptid = pid_to_ptid (ptid_get_pid (ecs->ptid));
 >        set_executing (pid_ptid, 0);
 >      }

Works for me.
Thanks!

 > >From 5932f834e3b4feb2f0771d7bb344ad8f07ed98e1 Mon Sep 17 00:00:00 2001
 > From: Pedro Alves <palves@redhat.com>
 > Date: Wed, 22 Apr 2015 18:25:29 +0100
 > Subject: [PATCH] Fix and test "checkpoint" in non-stop mode
 > 
 > Letting a "checkpoint" run to exit with "set non-stop on" behaves
 > differently compared to the default all-stop mode ("set non-stop
 > off").
 > 
 > Currently, in non-stop mode:
 > 
 >   (gdb) start
 >   Temporary breakpoint 1 at 0x40086b: file src/gdb/testsuite/gdb.base/checkpoint.c, line 28.
 >   Starting program: build/gdb/testsuite/gdb.base/checkpoint
 > 
 >   Temporary breakpoint 1, main () at src/gdb/testsuite/gdb.base/checkpoint.c:28
 >   28        char *tmp = &linebuf[0];
 >   (gdb) checkpoint
 >   checkpoint 1: fork returned pid 24948.
 >   (gdb) c
 >   Continuing.
 >   Copy complete.
 >   Deleting copy.
 >   [Inferior 1 (process 24944) exited normally]
 >   [Switching to process 24948]
 >   (gdb) info threads
 >     Id   Target Id         Frame
 >     1    process 24948 "checkpoint" (running)
 > 
 >   No selected thread.  See `help thread'.
 >   (gdb) c
 >   The program is not being run.
 >   (gdb)
 > 
 > Two issues above:
 > 
 >  1. Thread 1 got stuck in "(running)" state (it isn't really running)
 > 
 >  2. While checkpoints try to preserve the illusion that the thread is
 >     still the same when the process exits, GDB switched to "No thread
 >     selected." instead of staying with thread 1 selected.
 > 
 > Problem #1 is caused by handle_inferior_event and normal_stop not
 > considering that when a
 > TARGET_WAITKIND_SIGNALLED/TARGET_WAITKIND_EXITED event is reported,
 > and the inferior is mourned, the target may still have execution.
 > 
 > Problem #2 is caused by the make_cleanup_restore_current_thread
 > cleanup installed by fetch_inferior_event not being able to find the
 > original thread 1's ptid in the thread list, thus not being able to
 > restore thread 1 as selected thread.  The fix is to make the cleanup
 > installed by make_cleanup_restore_current_thread aware of thread ptid
 > changes, by installing a thread_ptid_changed observer that adjusts the
 > cleanup's data.
 > 
 > After the patch, we get the same in all-stop and non-stop modes:
 > 
 >   (gdb) c
 >   Continuing.
 >   Copy complete.
 >   Deleting copy.
 >   [Inferior 1 (process 25109) exited normally]
 >   [Switching to process 25113]
 >   (gdb) info threads
 >     Id   Target Id         Frame
 >   * 1    process 25113 "checkpoint" main () at src/gdb/testsuite/gdb.base/checkpoint.c:28
 >   (gdb)
 > 
 > Turns out the whole checkpoints.exp file can run in non-stop mode
 > unmodified.  I thought of moving most of the test file's contents to a
 > procedure that can be called twice, once in non-stop mode and another
 > in all-stop mode.  But then, the test already takes over 30 seconds to
 > run on my machine, so I thought it'd be nicer to run all-stop and
 > non-stop mode in parallel.  Thus I added a new checkpoint-ns.exp file
 > that just appends "set non-stop on" to GDBFLAGS and sources
 > checkpoint.exp.
 > 
 > gdb/ChangeLog:
 > 2015-04-22  Pedro Alves  <palves@redhat.com>
 > 
 > 	* infrun.c (handle_inferior_event): If we get
 > 	TARGET_WAITKIND_SIGNALLED or TARGET_WAITKIND_EXITED in non-stop
 > 	mode, mark all threads of the exiting process as not-executing.
 > 	(normal_stop): If we get TARGET_WAITKIND_SIGNALLED or
 > 	TARGET_WAITKIND_EXITED in non-stop mode, finish all threads of the
 > 	exiting process, if inferior_ptid still points at a process.
 > 	* thread.c (struct current_thread_cleanup) <next>: New field.
 > 	(current_thread_cleanup_chain): New global.
 > 	(restore_current_thread_ptid_changed): New function.
 > 	(restore_current_thread_cleanup_dtor): Remove the cleanup from the
 > 	current_thread_cleanup_chain list.
 > 	(make_cleanup_restore_current_thread): Add the cleanup data to the
 > 	current_thread_cleanup_chain list.
 > 	(_initialize_thread): Install restore_current_thread_ptid_changed
 > 	as thread_ptid_changed observer.
 > 
 > gdb/testsuite/ChangeLog:
 > 2015-04-22  Pedro Alves  <palves@redhat.com>
 > 
 > 	* gdb.base/checkpoint-ns.exp: New file.
 > 	* gdb.base/checkpoint.exp: Pass explicit "checkpoint.c" to
 > 	standard_testfile.
 > 
 > v3.1:
 > 
 > 	Extend comments.
 > 	In test, use GDBFLAGS instead of new variable.
 > 
 > v3:
 > 	No changes.
 > ---
 >  gdb/infrun.c                             | 48 ++++++++++++++++++++++++++------
 >  gdb/testsuite/gdb.base/checkpoint-ns.exp | 26 +++++++++++++++++
 >  gdb/testsuite/gdb.base/checkpoint.exp    |  6 ++--
 >  gdb/thread.c                             | 38 +++++++++++++++++++++++++
 >  4 files changed, 107 insertions(+), 11 deletions(-)
 >  create mode 100644 gdb/testsuite/gdb.base/checkpoint-ns.exp
 > 
 > diff --git a/gdb/infrun.c b/gdb/infrun.c
 > index 7870f70..caf1ff5 100644
 > --- a/gdb/infrun.c
 > +++ b/gdb/infrun.c
 > @@ -3798,14 +3798,31 @@ handle_inferior_event (struct execution_control_state *ecs)
 >  
 >    /* Mark the non-executing threads accordingly.  In all-stop, all
 >       threads of all processes are stopped when we get any event
 > -     reported.  In non-stop mode, only the event thread stops.  If
 > -     we're handling a process exit in non-stop mode, there's nothing
 > -     to do, as threads of the dead process are gone, and threads of
 > -     any other process were left running.  */
 > +     reported.  In non-stop mode, only the event thread stops.  */
 >    if (!non_stop)
 >      set_executing (minus_one_ptid, 0);
 > -  else if (ecs->ws.kind != TARGET_WAITKIND_SIGNALLED
 > -	   && ecs->ws.kind != TARGET_WAITKIND_EXITED)
 > +  else if (ecs->ws.kind == TARGET_WAITKIND_SIGNALLED
 > +	   || ecs->ws.kind == TARGET_WAITKIND_EXITED)
 > +    {
 > +      ptid_t pid_ptid;
 > +
 > +      /* If we're handling a process exit in non-stop mode, even
 > +	 though threads haven't been deleted yet, one would think that
 > +	 there is nothing to do, as threads of the dead process will
 > +	 be soon deleted, and threads of any other process were left
 > +	 running.  However, on some targets, threads survive a process
 > +	 exit event.  E.g., for the "checkpoint" command, when the
 > +	 current checkpoint/fork exits, linux-fork.c automatically
 > +	 switches to another fork from within target_mourn_inferior,
 > +	 by associating the same inferior/thread to another fork.  We
 > +	 haven't mourned yet at this point, but we must mark any
 > +	 threads left in the process as not-executing so that
 > +	 finish_thread_state marks them stopped (in the user's
 > +	 perspective) if/when we present the stop to the user.  */
 > +      pid_ptid = pid_to_ptid (ptid_get_pid (ecs->ptid));
 > +      set_executing (pid_ptid, 0);
 > +    }
 > +  else
 >      set_executing (ecs->ptid, 0);
 >  
 >    switch (ecs->ws.kind)
 > @@ -6550,6 +6567,7 @@ normal_stop (void)
 >    struct target_waitstatus last;
 >    ptid_t last_ptid;
 >    struct cleanup *old_chain = make_cleanup (null_cleanup, NULL);
 > +  ptid_t pid_ptid;
 >  
 >    get_last_target_status (&last_ptid, &last);
 >  
 > @@ -6559,9 +6577,21 @@ normal_stop (void)
 >       here, so do this before any filtered output.  */
 >    if (!non_stop)
 >      make_cleanup (finish_thread_state_cleanup, &minus_one_ptid);
 > -  else if (last.kind != TARGET_WAITKIND_SIGNALLED
 > -	   && last.kind != TARGET_WAITKIND_EXITED
 > -	   && last.kind != TARGET_WAITKIND_NO_RESUMED)
 > +  else if (last.kind == TARGET_WAITKIND_SIGNALLED
 > +	   || last.kind == TARGET_WAITKIND_EXITED)
 > +    {
 > +      /* On some targets, we may still have live threads in the
 > +	 inferior when we get a process exit event.  E.g., for
 > +	 "checkpoint", when the current checkpoint/fork exits,
 > +	 linux-fork.c automatically switches to another fork from
 > +	 within target_mourn_inferior.  */
 > +      if (!ptid_equal (inferior_ptid, null_ptid))
 > +	{
 > +	  pid_ptid = pid_to_ptid (ptid_get_pid (inferior_ptid));
 > +	  make_cleanup (finish_thread_state_cleanup, &pid_ptid);
 > +	}
 > +    }
 > +  else if (last.kind != TARGET_WAITKIND_NO_RESUMED)
 >      make_cleanup (finish_thread_state_cleanup, &inferior_ptid);
 >  
 >    /* As we're presenting a stop, and potentially removing breakpoints,
 > diff --git a/gdb/testsuite/gdb.base/checkpoint-ns.exp b/gdb/testsuite/gdb.base/checkpoint-ns.exp
 > new file mode 100644
 > index 0000000..d3698ba
 > --- /dev/null
 > +++ b/gdb/testsuite/gdb.base/checkpoint-ns.exp
 > @@ -0,0 +1,26 @@
 > +# Copyright 2015 Free Software Foundation, Inc.
 > +
 > +# This program is free software; you can redistribute it and/or modify
 > +# it under the terms of the GNU General Public License as published by
 > +# the Free Software Foundation; either version 3 of the License, or
 > +# (at your option) any later version.
 > +#
 > +# This program is distributed in the hope that it will be useful,
 > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 > +# GNU General Public License for more details.
 > +#
 > +# You should have received a copy of the GNU General Public License
 > +# along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
 > +
 > +# Test gdb checkpoint and restart in non-stop mode.
 > +
 > +# We drive non-stop mode from a separate file because the whole test
 > +# takes a while to run.  This way, we can test both modes in parallel.
 > +
 > +set saved_gdbflags $GDBFLAGS
 > +append GDBFLAGS " -ex \"set non-stop on\""
 > +
 > +source $srcdir/$subdir/checkpoint.exp
 > +
 > +set GDBFLAGS $saved_gdbflags
 > diff --git a/gdb/testsuite/gdb.base/checkpoint.exp b/gdb/testsuite/gdb.base/checkpoint.exp
 > index 6d94ab6..4a476be 100644
 > --- a/gdb/testsuite/gdb.base/checkpoint.exp
 > +++ b/gdb/testsuite/gdb.base/checkpoint.exp
 > @@ -24,8 +24,10 @@ if {![istarget "*-*-linux*"]} then {
 >      continue
 >  }
 >  
 > -
 > -standard_testfile .c
 > +# Must name the source file explicitly, otherwise when driven by
 > +# checkpoints-ns.exp, we'd try compiling checkpoints-ns.c, which
 > +# doesn't exist.
 > +standard_testfile checkpoint.c
 >  
 >  set pi_txt [gdb_remote_download host ${srcdir}/${subdir}/pi.txt]
 >  if {[is_remote host]} {
 > diff --git a/gdb/thread.c b/gdb/thread.c
 > index 23dfcc9..46b5947 100644
 > --- a/gdb/thread.c
 > +++ b/gdb/thread.c
 > @@ -1279,8 +1279,16 @@ restore_selected_frame (struct frame_id a_frame_id, int frame_level)
 >      }
 >  }
 >  
 > +/* Data used by the cleanup installed by
 > +   'make_cleanup_restore_current_thread'.  */
 > +
 >  struct current_thread_cleanup
 >  {
 > +  /* Next in list of currently installed 'struct
 > +     current_thread_cleanup' cleanups.  See
 > +     'current_thread_cleanup_chain' below.  */
 > +  struct current_thread_cleanup *next;
 > +
 >    ptid_t inferior_ptid;
 >    struct frame_id selected_frame_id;
 >    int selected_frame_level;
 > @@ -1289,6 +1297,29 @@ struct current_thread_cleanup
 >    int was_removable;
 >  };
 >  
 > +/* A chain of currently installed 'struct current_thread_cleanup'
 > +   cleanups.  Restoring the previously selected thread looks up the
 > +   old thread in the thread list by ptid.  If the thread changes ptid,
 > +   we need to update the cleanup's thread structure so the look up
 > +   succeeds.  */
 > +static struct current_thread_cleanup *current_thread_cleanup_chain;
 > +
 > +/* A thread_ptid_changed observer.  Update all currently installed
 > +   current_thread_cleanup cleanups that want to switch back to
 > +   OLD_PTID to switch back to NEW_PTID instead.  */
 > +
 > +static void
 > +restore_current_thread_ptid_changed (ptid_t old_ptid, ptid_t new_ptid)
 > +{
 > +  struct current_thread_cleanup *it;
 > +
 > +  for (it = current_thread_cleanup_chain; it != NULL; it = it->next)
 > +    {
 > +      if (ptid_equal (it->inferior_ptid, old_ptid))
 > +	it->inferior_ptid = new_ptid;
 > +    }
 > +}
 > +
 >  static void
 >  do_restore_current_thread_cleanup (void *arg)
 >  {
 > @@ -1329,6 +1360,8 @@ restore_current_thread_cleanup_dtor (void *arg)
 >    struct thread_info *tp;
 >    struct inferior *inf;
 >  
 > +  current_thread_cleanup_chain = current_thread_cleanup_chain->next;
 > +
 >    tp = find_thread_ptid (old->inferior_ptid);
 >    if (tp)
 >      tp->refcount--;
 > @@ -1362,6 +1395,9 @@ make_cleanup_restore_current_thread (void)
 >    old->inf_id = current_inferior ()->num;
 >    old->was_removable = current_inferior ()->removable;
 >  
 > +  old->next = current_thread_cleanup_chain;
 > +  current_thread_cleanup_chain = old;
 > +
 >    if (!ptid_equal (inferior_ptid, null_ptid))
 >      {
 >        old->was_stopped = is_stopped (inferior_ptid);
 > @@ -1815,4 +1851,6 @@ Show printing of thread events (such as thread start and exit)."), NULL,
 >           &setprintlist, &showprintlist);
 >  
 >    create_internalvar_type_lazy ("_thread", &thread_funcs, NULL);
 > +
 > +  observer_attach_thread_ptid_changed (restore_current_thread_ptid_changed);
 >  }
 > -- 
 > 1.9.3
 > 
 > 

-- 
/dje


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]