This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



Re: Follow-fork-mode / detach-on-fork expected behavior?


On 04/24/2014 11:05 PM, Breazeal, Don wrote:
> Hi
> 
> I'm working on implementation of follow-fork in gdbserver.  My intent is to make it work just like it works in native GDB.  However, I am confused by what looks like inconsistent behavior in native GDB.  I'm hoping to get some feedback on my observations so that I know how to proceed.  I want to make sure things are working in native GDB before going any further with gdbserver.
> 
> Apologies for the length of this email.  The only way I can think of to explain my questions is by describing what I see in a test case.  I'm using a test case that uses 'fork' (not 'vfork') in all-stop mode (gdb.base/foll-fork).  Aside from the fork mode settings, the commands are:
> (gdb) set verbose   # to see the fork msgs
> (gdb) break main
> (gdb) run
> (gdb) next 2        # this executes past the fork call
> 
> The behavior is inconsistent when following the child, depending on the setting for detach-on-fork.  Below is the behavior I see in the four possible combinations of fork settings after the 'next 2' command is entered, along with my specific questions:
> 
> 1) follow parent / detach child (default settings)
>   -  prints msg about detaching after fork
>   -  stops after the next command in the parent
>   -  one inferior left
> 
> 2) follow parent / don't detach child
>   -  prints [New process] msg
>   -  prints symbol reading/loading msgs
>   -  stops in parent after next
>   -  two inferiors left, info inferiors shows pids of both
> 
> So far, so good, this is what I expect.
> 
> 3) follow child / detach parent
>   -  prints msg about attach to child after fork
>   -  prints [New process] msg
>   -  prints [Switching to process ] msg
>   -  stops in child after 'next' command
>   -  two inferiors left, info inferiors shows parent 'null'
> 
> This looks like there might be a problem:
>   Q1: shouldn't there only be one inferior?

Yeah.  With detach-on-fork enabled, I agree that that's
the expected behavior.  And I think the new process should
reuse the old inferior id.  The reason I didn't do things
that way to begin with is actually the vfork case.
For vfork, we need to hold on to the parent until the child
execs/exits.  In the old days, linux-nat.c itself would hold on to
the lwp, behind the core's back.  The core had no clue about
vfork-done.  Nowadays, we have that modelled with
TARGET_WAITKIND_VFORK_DONE.  We need that to be able to correctly
remove all child breakpoints from the parent at child exec/exit
time.
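
To make the vfork bookkeeping above concrete, here is a rough sketch
in C.  It is only a model: the event names, the struct and
handle_event are all hypothetical, not GDB's actual code.  It shows
the parent being kept around after the vfork, and breakpoints being
removed from it only once the vfork-done event arrives.

```c
#include <stdbool.h>

/* Hypothetical event kinds, loosely modelled on the wait kinds
   discussed above.  These are not GDB's real enumerators.  */
enum event_kind { EV_VFORKED, EV_VFORK_DONE };

/* Hypothetical per-parent bookkeeping.  */
struct vfork_parent {
    bool waiting_for_vfork_done; /* child still shares the parent's memory */
    bool breakpoints_inserted;   /* software breakpoints present in that memory */
    bool detached;               /* parent has been let go */
};

void handle_event(struct vfork_parent *p, enum event_kind ev,
                  bool detach_on_fork)
{
    switch (ev) {
    case EV_VFORKED:
        /* The child runs in the parent's address space, so the parent
           must be held on to until the child execs or exits.  */
        p->waiting_for_vfork_done = true;
        break;
    case EV_VFORK_DONE:
        if (!p->waiting_for_vfork_done)
            break;
        p->waiting_for_vfork_done = false;
        /* Breakpoints inserted while following the child also landed in
           the parent's memory; remove them before releasing it.  */
        p->breakpoints_inserted = false;
        if (detach_on_fork)
            p->detached = true;
        break;
    }
}
```

The point of the model is the ordering: the detach can only happen in
the EV_VFORK_DONE branch, after the breakpoints are gone.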

We could fix that by getting rid of the parent inferior from the
core, going back to having linux-nat.c itself keep track of the
parent behind the core's back, knowing that it needs to remove
breakpoints from the parent when a vfork_done arrives (as was
done before TARGET_WAITKIND_VFORK_DONE).  But how would we
do that in the remote/gdbserver case?  gdbserver has no clue
about software breakpoints planted in memory directly by GDB.

One way I've thought of before to handle this would
be to still leave the parent inferior in the inferior list,
so most of the core wouldn't need to change, but hide the
parent inferior from the user.  We could for example make
the parent inferior's number negative (like internal
breakpoint numbers are negative).
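
As a minimal sketch of the hiding idea, assuming a simple array of
inferiors: the struct and the function names below are made up for
illustration and do not correspond to GDB's inferior list.  An
inferior with a negative number stays in the list, so core code can
still find it, but user-facing listings skip it.

```c
#include <stddef.h>

/* Hypothetical inferior record; not GDB's struct inferior.  */
struct inferior_entry {
    int num; /* positive: shown to the user; negative: hidden */
    int pid;
};

/* User-facing code would filter on this predicate.  */
static int is_user_visible(const struct inferior_entry *inf)
{
    return inf->num > 0;
}

/* Hide an inferior by negating its number, leaving it in the list.  */
void hide_inferior(struct inferior_entry *inf)
{
    if (inf->num > 0)
        inf->num = -inf->num;
}

/* Count what a listing such as "info inferiors" would show.  */
size_t count_visible(const struct inferior_entry *list, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (is_user_visible(&list[i]))
            count++;
    return count;
}
```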

Sounds doable, but I've never tried it.  I recall at least
one complication: if the parent is hidden, and the user
decides "oops, I want to keep debugging the parent", and does
"attach PID", with PID being the parent's PID, ptrace would
of course complain, because GDB is already attached to the process.
My immediate reaction is that the core could look at PID and check
if it's already attached before passing the PID down to the target.
But, given that PID is just a string that is interpreted by the
target, we can't have the core interpret it as a number.  So we
need something else.  Maybe let the target return "I'm already
attached, and this is the PID in number form".  Easy for
native, requires extensions for remote.  Well, given we need
remote multi-process extensions to see this happen, perhaps
just ignore the issue and do the pid comparison as
a number anyway (if not fake).
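
The "compare as a number anyway" fallback could look roughly like the
sketch below.  parse_pid and already_attached are hypothetical helpers,
and the array of attached pids stands in for whatever list the core
would really consult.  An argument that is not a plain positive decimal
number is treated as "not attached", so it would still be passed down
to the target untouched.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Return 1 and store the value if ARG is entirely a positive decimal
   number, 0 otherwise.  Hypothetical helper.  */
int parse_pid(const char *arg, long *pid_out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || errno == ERANGE || value <= 0)
        return 0;
    *pid_out = value;
    return 1;
}

/* Return 1 if ARG names a pid that is already in ATTACHED.  */
int already_attached(const char *arg, const long *attached, size_t n)
{
    long pid;

    if (!parse_pid(arg, &pid))
        return 0; /* not a number: leave it for the target to interpret */
    for (size_t i = 0; i < n; i++)
        if (attached[i] == pid)
            return 1;
    return 0;
}
```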

>   Q2: should the child have stopped?

Yes.  That's what "follow" means.  That's how GDB has always
behaved, even before it learned about "info inferiors".
GDB 6.8 could follow forks, and had follow-fork/detach-on-fork
settings already, but couldn't run two forked processes
at once.  You'd have to choose which to debug with
"info forks" / "fork".

> The manual doesn't make this completely clear.
> 
> 4) follow child / don't detach parent
>   -  prints msg about attach to child after fork
>   -  prints [New process] msg
>   -  prints symbol reading/loading msgs
>   -  child runs to completion
>   -  two inferiors left, info inferiors shows child 'null'
> 
> Something seems wrong here.  
>   Q3: to be consistent, shouldn't the child process either have stopped after the 'next' command
> in both (3) and (4)

Yes.

> or run to completion in both cases?

No.
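
To summarize the expected behavior for the four combinations discussed
in this thread, here is a small sketch in C.  The struct and function
are hypothetical and only encode the answers above: 'next' should stop
in whichever process is being followed, and the user should see one
inferior with detach-on-fork on and two with it off.

```c
#include <stdbool.h>

/* Hypothetical summary type; field names are illustrative only.  */
struct fork_expectation {
    bool stops_in_child;    /* true: 'next' stops in the child;
                               false: it stops in the parent */
    int visible_inferiors;  /* inferiors the user should see afterwards */
};

struct fork_expectation expected_after_fork(bool follow_child,
                                            bool detach_on_fork)
{
    struct fork_expectation e;

    /* Following a process means the step finishes in that process,
       regardless of the detach-on-fork setting.  */
    e.stops_in_child = follow_child;
    /* Detaching the other process should leave a single inferior.  */
    e.visible_inferiors = detach_on_fork ? 1 : 2;
    return e;
}
```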

> 
> I'd appreciate it if someone could clarify the expected behavior for me, or if what I'm seeing is expected, explain the rationale.  If something needs to be fixed in the native implementation, I'll want to look at that before continuing with the remote case.
> Thanks

-- 
Pedro Alves

