This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



Re: Follow-fork-mode / detach-on-fork expected behavior?


On 5/16/2014 5:30 AM, Pedro Alves wrote:
> On 04/24/2014 11:05 PM, Breazeal, Don wrote:
>> Hi
>>
>> I'm working on implementation of follow-fork in gdbserver.  My intent is to make it work just like it works in native GDB.  However, I am confused by what looks like inconsistent behavior in native GDB.  I'm hoping to get some feedback on my observations so that I know how to proceed.  I want to make sure things are working in native GDB before going any further with gdbserver.
>>
>> Apologies for the length of this email.  The only way I can think of to explain my questions is by describing what I see in a test case.  I'm using a test case that uses 'fork' (not 'vfork') in all-stop mode (gdb.base/foll-fork).  Aside from the fork mode settings, the commands are:
>> (gdb) set verbose   # to see the fork msgs
>> (gdb) break main
>> (gdb) run
>> (gdb) next 2        # this executes past the fork call
>>
>> The behavior is inconsistent when following the child, depending on the setting for detach-on-fork.  Below is the behavior I see in the four possible combinations of fork settings after the 'next 2' command is entered, along with my specific questions:
>>
>> 1) follow parent / detach child (default settings)
>>   -  prints msg about detaching after fork
>>   -  stops after the next command in the parent
>>   -  one inferior left
>>
>> 2) follow parent / don't detach child
>>   -  prints [New process] msg
>>   -  prints symbol reading/loading msgs
>>   -  stops in parent after next
>>   -  two inferiors left, info inferiors shows pids of both
>>
>> So far, so good, this is what I expect.
>>
>> 3) follow child / detach parent
>>   -  prints msg about attach to child after fork
>>   -  prints [New process] msg
>>   -  prints [Switching to process ] msg
>>   -  stops in child after 'next' command
>>   -  two inferiors left, info inferiors shows parent 'null'
>>
>> This looks like there might be a problem:
>>   Q1: shouldn't there only be one inferior?
> 
> Yeah.  With detach-on-fork enabled, I agree that that's
> the expected behavior.  And I think the new process should

Oh...ugh.  I thought I had figured this out, but had reached the
opposite conclusion.  This was based on a couple of things:

 * Text in the GDB manual (Yao also pointed this out to me):
https://sourceware.org/gdb/onlinedocs/gdb/Inferiors-and-Programs.html
mentions that "Inferiors may be created before a process runs, and may
be retained after a process exits." and "After the successful completion
of a command such as detach, detach inferiors, kill or kill inferiors,
or after a normal process exit, the inferior is still valid and listed
with info inferiors, ready to be restarted."

It seemed to me like "detach-on-fork" should work like the detach
command and keep the inferior.  I guess the argument opposing this would
be that the parent and child inferiors would just be identical to each
other. However, if there was an exec in the followed process, this would
no longer be the case.

 * PR gdb/14808, which is a problem where the inferior of a detached
parent is modified when it shouldn't be.

I was proceeding under the assumptions that (1) a detached parent should
show up in the inferiors list so that it could be re-run later, and (2)
a detached child should not, since it had never really been attached
from a user perspective.

My plan has been to implement the remote case, then see if I could
consolidate some of the code into a common follow-fork layer.

Do I need to change course?  Would it make sense for me to proceed under
the assumptions above, and deal with the 'detached inferior' issue
later, or should that be resolved before implementing remote follow-fork?

Thanks
--Don


> reuse the old inferior id.  The reason I hadn't done things
> that way to begin with, is actually the vfork case.
> For vfork, we need to hold on to the parent until the child
> execs/exits.  In the old days, linux-nat.c itself would hold on to
> the lwp, behind the core's back.  The core had no clue about
> vfork-done.  Nowadays, we have that modelled with
> TARGET_WAITKIND_VFORK_DONE.  We need that to be able to correctly
> remove all child breakpoints from the parent at child exec/exit
> time.
> 
> We could fix that by getting rid of the parent inferior from the
> core, going back to having linux-nat.c itself keep track of the
> parent behind the core's back, knowing that it needs to remove
> breakpoints from the parent when a vfork_done arrives (like it used
> to be done before TARGET_WAITKIND_VFORK_DONE).  But, how to
> do that in the remote/gdbserver case?  gdbserver will have no clue
> about software breakpoints planted in memory directly by GDB.
> 
> One way I've thought of before to fix/handle this would
> be to still leave the parent inferior in the inferior list,
> so most of the core wouldn't need to change, but hide the
> parent inferior from the user.  We could for example make
> the parent's inferior's number be negative (like internal
> breakpoints are negative).
> 
> Sounds doable, but I've never tried it.  I recall at least
> one complication.  Like, if the parent is hidden, and the user
> decides "oops, I want to keep debugging the parent", and does
> "attach PID", with PID being the parent's PID.  ptrace would
> of course complain, because GDB is already attached to the process.
> My immediate reaction is that the core could look at PID and check
> if it's already attached before passing the PID down to the target.
> But, given that PID is just a string that is interpreted by the
> target, we can't have the core interpret it as a number.  So we
> need something else.  Maybe let the target return "I'm already
> attached, and this is the PID in number form".  Easy for
> native, requires extensions for remote.  Well, given we need
> remote multi-process extensions to see this happen, perhaps
> just ignore the issue and do the pid comparison as
> a number anyway (if not fake).
> 
>>   Q2: should the child have stopped?
> 
> Yes.  That's what "follow" means.  That's how GDB always
> behaved, even before it learned about "info inferiors".
> GDB 6.8 could follow forks, and had follow-fork/detach-on-fork
> settings already, but couldn't run two fork processes
> at once.  You'd have to choose which to debug with
> "info forks" / "fork".
> 
>> The manual doesn't make this completely clear.
>>
>> 4) follow child / don't detach parent
>>   -  prints msg about attach to child after fork
>>   -  prints [New process] msg
>>   -  prints symbol reading/loading msgs
>>   -  child runs to completion
>>   -  two inferiors left, info inferiors shows child 'null'
>>
>> Something seems wrong here.  
>>   Q3: to be consistent, shouldn't the child process either have stopped after the 'next' command
>> in both (3) and (4)
> 
> Yes.
> 
>> or run to completion in both cases?
> 
> No.
> 
>>
>> I'd appreciate it if someone could clarify the expected behavior for me, or if what I'm seeing is expected, explain the rationale.  If something needs to be fixed in the native implementation, I'll want to look at that before continuing with the remote case.
>> Thanks
> 
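
For reference, the answers to Q1-Q3 above can be summarized as a small table of expected outcomes after 'next 2'. This is only an illustrative sketch in Python, not GDB code: the names ForkOutcome and expected_after_fork are hypothetical. It assumes the behavior described as expected in this thread: the followed process is the one that stops, and with detach-on-fork enabled only one inferior remains. Whether a detached parent should stay listed in 'info inferiors' is the open question raised above and is not modelled here.

```python
from typing import NamedTuple


class ForkOutcome(NamedTuple):
    """Hypothetical record of what the user should see after 'next 2'."""
    stops_in: str   # which process stops after the 'next' command
    inferiors: int  # how many inferiors remain under GDB's control


def expected_after_fork(follow_fork_mode: str, detach_on_fork: bool) -> ForkOutcome:
    """Return the expected outcome for one combination of fork settings.

    Assumption (from the discussion, not from GDB sources): "follow" means
    the followed process stops, and detaching leaves a single inferior.
    """
    if follow_fork_mode not in ("parent", "child"):
        raise ValueError("follow-fork-mode must be 'parent' or 'child'")
    return ForkOutcome(
        stops_in=follow_fork_mode,
        inferiors=1 if detach_on_fork else 2,
    )


if __name__ == "__main__":
    # Print the four combinations in the same order as cases (1)-(4).
    for mode in ("parent", "child"):
        for detach in (True, False):
            outcome = expected_after_fork(mode, detach)
            print(f"follow {mode:6} detach-on-fork={'on' if detach else 'off':3} "
                  f"-> stops in {outcome.stops_in}, {outcome.inferiors} inferior(s)")
```

Cases (1) and (2) already match this table in the observations above; cases (3) and (4) are the ones where the observed native behavior differs.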


