Re: fork failure?

Dave Korn wrote:
> Charles Wilson wrote:
>> ModLoad: 75bd0000 75c7a000   C:\Windows\system32\msvcrt.dll
>   Say, what's that doing there?  Might like to check who's pulling it in, just
> in case something's gone all win32 on you that shouldn't be.

It appears to be pulled in by winsock2, which is on-demand loaded by
cygwin, so it doesn't show up in the explicit dependencies as reported
by cygcheck.  But that's all "behind the cygwin layer" -- the way I've
built gnupg2 and libassuan, they don't go behind cygwin's back to access
windows socket functions directly. They use cygwin functionality for that.

>> ModLoad: 6c1b0000 6c1b5000   C:\Windows\system32\avgrsstx.dll
>   Let's hope AVG hasn't gone (even further) over to the dark side.

Aw geez.  I tried running with AVG both enabled and disabled (but not
uninstalled).  There was a difference in the ProcMon output -- obviously
the disabled AVG makes fewer syscalls -- but the gpg-agent behavior was
the same.

I guess I'll try to uninstall AVG and see if that makes a difference.

>> which is just after the output window gets:
>> returning from fork: ischild=1, res=0
>> So, this is the right spot.  And $eip is 0x0.  That doesn't tell me much...
>   So, the dreaded jump-to-zero.  Always a tricky one, since by the time you
> get there you have no idea where you came there from.  Except that we suspect
> fork().  I'd set a breakpoint on the start of fork and another one on the ret
> at the end of it, (did you try mingw gdb yet? 

Not yet. Chris S. has recently released an updated mingw gdb based on
7.0, but I haven't installed or tested that one yet.

> it might be easier here than
> windbg since it'll understand the symbols, but if you can't get it to work
> then you can manually look up symbol addresses and set the breakpoints by hex
> address), 

Well, I did this in windbg (manually setting breakpoints).
Unfortunately, they appeared to have no effect -- after "g", it blew
right past them and into the exception.  Maybe I'll have better luck
with mingw-gdb.

First I'm going to rip out a lot of the debugging cruft from my cygwin
DLL, now that I know (part of) it was a wild goose chase.

> and then I'd restart the program, note the value of $esp and verify
> a sane-looking return address on entry to the function, let it run to the end
> of the function and find out if the stack pointer wasn't back at the same
> location or if the return address there had been corrupted.

Ah. Well...that won't actually work.  The *parent* is the only one of
the two that actually /enters/ the fork() function in the normal way,
and thus could be expected to have a reasonable return address (and hit
a breakpoint at the beginning of the function).

The child...not so much. It "enters" fork() by way of the longjmp, using
the jmp_buf set by the parent when IT was inside fork(), before the
parent (via a roundabout method) called CreateProcess to create the
child in the first place.
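
For anyone following along, here is a minimal single-process sketch of
that "enter without entering" effect. It is NOT cygwin's actual fork
code: fake_fork, jump_back, and the two counters are made-up names, and
the longjmp is issued from a helper called inside the same function so
that the jump stays well-defined C. The only point is that control lands
after the setjmp without passing the top of the function a second time,
which is why a breakpoint on the function's first instruction would see
one hit, not two.

```c
#include <setjmp.h>

/* All names here are hypothetical; this is not cygwin code. */
static jmp_buf saved_context;  /* stand-in for the jmp_buf the parent fills in */
static int entry_count;        /* arrivals through a normal call */
static int resume_count;       /* arrivals through longjmp */

/* Stand-in for the child's startup path jumping back into fork(). */
static void jump_back(void)
{
    longjmp(saved_context, 1);
}

int fake_fork(void)
{
    entry_count++;             /* a breakpoint here would fire once only */

    if (setjmp(saved_context) == 0) {
        /* First pass: context saved. Simulate the other side jumping
           back in; this call does not return. */
        jump_back();
    }

    /* Reached only through longjmp, never by falling through. */
    resume_count++;
    return entry_count;
}
```

A single call to fake_fork() leaves entry_count and resume_count both at
1: the function was entered once and resumed once.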

I suppose I could debug both the parent AND the child: since the forkee
should have exactly the same memory layout (and stack trace) as the
parent once both return from fork(), I could

  1) look at the parent's stack trace when it is inside fork(). Ditto
     its return address.
  2) after the child longjmp's back into fork() from dll_crt0_1,
     look at its stack trace and return address. (although I can't
     really catch it that early. I can only catch it in the debugger
     just after the CYGWIN_FORK_SLEEP...but at least I'm still
     back inside fork() at that point).

They ought to match in all respects, correct?
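
As a sanity check on that expectation, a rough single-process analogue
of the comparison: record the function's return address on normal
entry, then again after a longjmp has dropped control back into the same
frame, and compare the two. This assumes gcc or clang, since
__builtin_return_address is a compiler extension, and every name in it
(return_address_survives, resume_here, bounce) is invented for
illustration. It says nothing about what the real forkee sees; it only
shows what "matching" would look like when nothing has trampled the
stack.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical names throughout; requires gcc or clang. */
static jmp_buf resume_here;
static void *addr_on_entry;     /* return address seen on normal entry */
static void *addr_after_jump;   /* return address seen after the longjmp */

static void bounce(void)
{
    longjmp(resume_here, 1);
}

/* noinline so the function keeps its own frame and return address. */
__attribute__((noinline))
int return_address_survives(void)
{
    addr_on_entry = __builtin_return_address(0);

    if (setjmp(resume_here) == 0)
        bounce();               /* does not return; resumes at setjmp */

    addr_after_jump = __builtin_return_address(0);

    /* 1 if the frame still points back at the same caller. */
    return addr_on_entry != NULL && addr_on_entry == addr_after_jump;
}
```

If the two addresses ever differed in a setup like this, something
between the setjmp and the resume would have overwritten the frame,
which is the kind of corruption a watchpoint on that stack slot is meant
to catch.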

> The second of
> those could potentially be tracked down using a hardware breakpoint
> (watchpoint in gdb terminology), the first of those two would require reading
> the code to see why it's not popping and pushing in equal amounts.

But setjmp and longjmp are nasty black magic assembly generated by
winsup/cygwin/gendef... Ow! Stop! That hurts!


