This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [PATCH] [PR gdb/13463] linux-nat: Stop lwp before xfer if running
- From: Doug Evans <dje at google dot com>
- To: Simon Marchi <simon dot marchi at polymtl dot ca>, Pedro Alves <palves at redhat dot com>
- Cc: gdb-patches at sourceware dot org
- Date: Wed, 18 Sep 2013 11:35:45 -0700
- Subject: Re: [PATCH] [PR gdb/13463] linux-nat: Stop lwp before xfer if running
- Authentication-results: sourceware.org; auth=none
- References: <5203f8bd dot SBEnufg91ES5VgBo%simon dot marchi at polymtl dot ca> <521CFEFC dot 5050008 at redhat dot com> <CAFXXi0mpTxwFosB8WDvWWZ8K5EcJyY3gx16dDB8PtuvrEefVHA at mail dot gmail dot com>
Simon Marchi writes:
> On 27 August 2013 15:33, Pedro Alves <palves@redhat.com> wrote:
> > On 08/08/2013 08:59 PM, simon.marchi@polymtl.ca wrote:
> >> (err, resending with setting From: correctly, sorry about that)
> >>
> >> Hi,
> >>
> >> This is an attempt at finding a solution for PR 13463:
> >> http://sourceware.org/bugzilla/show_bug.cgi?id=13463
> >>
> >> IMO, this is a serious bug (at least on x86 Linux). Inserting breakpoints
> >> while the inferior is running does not work. It can be triggered very easily:
> >>
> >> 1. Start a simple program in non-stop/async mode.
> >> 2. Put a breakpoint anywhere in the program.
> >>
> >> You should get an error like
> >>
> >> Cannot access memory at address 0x400504
> >>
> >> What happens is that GDB tries to ptrace read/write the inferior's memory
> >> while the process is not stopped, which is not supported.
> >>
> >> The obvious/naive solution is to stop the process and resume it whenever
> >> we want to do a memory transfer while it is executing.
> >
> > Yeah, gdbserver does this. Starts with prepare_to_access_memory. A
> > prepare -> do -> "unprepare" sequence is possibly more efficient,
> > as we can batch several "do"s over a single pause sequence. I've considered
> > before pushing that prepare/unprepare sequence all the way to gdb core,
> > in order to batch e.g., the whole of prologue analysis + breakpoint
> > insertion, or of shared library list reading, etc., but I thought all
> > the way through all that.
>
> I saw the gdbserver version, it seems to be handled quite cleanly.
> Should I add an equivalent to prepare_to_access_memory to gdb? This
> way other platforms could do apply the same fix if the bug applies to
> them. Either way, this gives me a sense of the code duplication
> between gdb and gdbserver. Scary!
>
> >> I gave it a shot,
> >> and it seems to work for me, but there is probably some cases it does not
> >> cover, maybe other things it breaks or some better way to do it. I ran a
> >> make check and gdb.sum was identical before and after (minus the time).
> >
> > Points I believe should be handled:
> >
> > - it should not resume threads that were meant to be stopped,
> > or were already stopped. target_resume blindly does that.
>
> I believe my fix handles already stopped threads correctly. If the
> thread is meant to be stopped, should I simply stop it and not resume
> it after, will it be in a valid state? What is the right way to check
> if a thread is meant to be stopped?
>
> > - it's quite possible the thread you just tried to stop
> > disappears/exit just as you tried to stop it, and then the
> > xfer fails.
> >
> > On the latter issue, and considering that there's no real
> > guarantee some other thread could be simultaneously poking
> > the same address gdbserver is (there's no ptrace atomic write),
> > gdbserver takes the easy route and always pauses all threads.
> > (the opposite end of the scale would be if gdb/gdbserver
> > force-cloned a thread in the inferior to use it as proxy to
> > peek/poke through.)
>
> I wouldn't consider it a critical bug if the read/write failed because
> the thread stopped right before we peek/poke it, as it does not
> "break" a feature. I suppose it could be addressed in a different
> patch.
>
> Thanks,
>
> Simon
>
> >
> >>
> >> What do you think about it?
> >>
> >> * linux-nat.c (linux_nat_xfer_partial): Interrupt during the
> >> memory transfer, if it is running.
> >>
> >> ---
> >> gdb/linux-nat.c | 18 ++++++++++++++++--
> >> 1 files changed, 16 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> >> index 45a6e5f..b8c0d1c 100644
> >> --- a/gdb/linux-nat.c
> >> +++ b/gdb/linux-nat.c
> >> @@ -294,8 +294,9 @@ static void linux_nat_async (void (*callback)
> >> void *context),
> >> void *context);
> >> static int kill_lwp (int lwpid, int signo);
> >> -
> >> +static int linux_nat_stop_lwp (struct lwp_info *lwp, void *data);
> >> static int stop_callback (struct lwp_info *lp, void *data);
> >> +static void cleanup_target_stop(void *arg);
> >>
> >> static void block_child_signals (sigset_t *prev_mask);
> >> static void restore_child_signals_mask (sigset_t *prev_mask);
> >> @@ -4205,11 +4206,24 @@ linux_nat_xfer_partial (struct target_ops *ops, enum target_object object,
> >> old_chain = save_inferior_ptid ();
> >>
> >> if (is_lwp (inferior_ptid))
> >> - inferior_ptid = pid_to_ptid (GET_LWP (inferior_ptid));
> >> + {
> >> + struct lwp_info* lwp;
> >> + inferior_ptid = pid_to_ptid (GET_LWP (inferior_ptid));
> >> +
> >> + lwp = find_lwp_pid (inferior_ptid);
> >> + if (target_async_permitted && lwp != NULL && !lwp->stopped)
> >> + {
> >> + make_cleanup (cleanup_target_stop, &lwp->ptid);
> >> + linux_nat_stop_lwp (lwp, NULL);
> >> + stop_wait_callback (lwp, NULL);
> >> + }
> >> + }
> >> +
> >>
> >> xfer = linux_ops->to_xfer_partial (ops, object, annex, readbuf, writebuf,
> >> offset, len);
> >> do_cleanups (old_chain);
> >> return xfer;
> >> }
> >>
> >
> > --
> > Pedro Alves
Hi.
Just curious what the status of this is.