How to make child of failed fork exit cleanly?
Tue May 3 18:49:00 GMT 2011
On 03/05/2011 11:46 AM, Ryan Johnson wrote:
> 2. When the child does exit, how to prevent finalizers from running
> for dlls which did not load properly?
> Context for the second question: exiting the child tends to trigger
> access violations, often in a pthread_mutex destructor call (la-la
> land). Some of these can be avoided by disabling stack dumping from
> api_fatal (see separate email about alloca and stack walking), but the
> others continue to mystify.
> Overall, AFAICT, the cygwin dll design assumes that all dlls have
> loaded properly, and a failed fork breaks that invariant. I worry that
> some "properly-loaded" dll accesses state of a "not-properly loaded"
The plot thickens... single-stepping through dll finalization, the crash
occurs because of a call to __gcc_deregister_frame, which is inserted
automatically by gcc (to deal with C++ exception handling unwind info?).
Single-stepping into the call is a descent into chaos, with the end
result that the process exits from a kernel32.dll call with an error
code that suggests an access violation occurred (0x000005a).
The cygwin dll in question is statically-linked, loaded at the desired
address, and depends only on cygwin1.dll, cyggcc_s-1.dll, and
cygstdc++-6.dll (all of which are still loaded, their finalizers did not
run yet). It had just executed its own global destructors. No global
initializers had run, because in_forkee was set.
Very strangely, when every child dies (including those automatically
respawned by Windows), the parent also seg faults when calling
__gcc_deregister_frame on the same dll! If even one child survives (even
if many had previously crashed), then no error arises. Even more
strangely, if I break into a first child which has a good layout (no
previous failures, current fork will succeed) and delay it long enough
that the parent times out, the parent still suffers the seg fault! What
shared state is there that could cause this to happen?
Disabling dll finalization completely when in_forkee==1 gets rid of the
above problem, but occasionally I'll get a new error in the child:
CloseHandle(pinfo_shared_handle<0x610031BF>) failed void pinfo::release():1040, Win32 error 6
110356 [main] fork 10556 fork: child -1 - died waiting for longjmp before initialization, retry 0, exit code 0x100, errno 11
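For reference, Win32 error 6 is ERROR_INVALID_HANDLE, and errno 11 is EAGAIN, which is what the caller of fork() sees when the child cannot be created. Below is a minimal sketch of the caller's side only, assuming plain POSIX calls; run_child is a hypothetical helper name, and nothing here reflects the code path inside cygwin1.dll that actually fails.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits immediately with `code`.
   Returns the child's exit status (0..255), or a negative value if
   fork/waitpid fails or the child did not exit normally. */
static int run_child(int code)
{
    pid_t pid = fork();
    if (pid < 0) {
        int err = errno;  /* EAGAIN when the child could not be created */
        fprintf(stderr, "fork failed: %s (errno %d)\n", strerror(err), err);
        return -err;
    }
    if (pid == 0)
        _exit(code);      /* child: leave without running atexit handlers */

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -errno;  /* real waitpid failure, not just a signal */
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit rather than exit so that handlers registered with atexit are skipped, which is the user-level analogue of what I am trying to achieve for the dll finalizers.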
Sometimes, when the child dies as above, the parent will again seg fault
while deregistering a dll (but not always).
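The workaround I tried (skipping dll finalization while in_forkee is set) amounts to something like the sketch below. This is an illustration under stated assumptions, not the actual Cygwin code: in_forkee_flag, struct module, and run_module_dtors are hypothetical stand-ins for in_forkee and per_module::run_dtors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Cygwin's in_forkee flag. */
static bool in_forkee_flag = false;

typedef void (*dtor_fn)(void);

/* Hypothetical per-module destructor table. */
struct module {
    const dtor_fn *dtors;
    size_t ndtors;
};

/* Sample destructor that only counts how often it was called. */
static int dtors_run = 0;
static void sample_dtor(void) { dtors_run++; }

/* Run the module's destructors in reverse order, unless a fork is
   still in progress. Returns the number of destructors that ran. */
static size_t run_module_dtors(const struct module *m)
{
    assert(m != NULL);
    if (in_forkee_flag)
        return 0;  /* initializers never ran, so skip finalizers too */

    size_t n = 0;
    for (size_t i = m->ndtors; i > 0; i--) {
        m->dtors[i - 1]();
        n++;
    }
    return n;
}
```

The guard keeps initialization and finalization symmetric: a module whose initializers were skipped does not get its finalizers called either.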
At this point I'm thoroughly confused. Does anyone have some
enlightenment to offer?
Gory details below...
Single-instruction stepping yields the following stack trace (sort of --
it doesn't reflect any one stack trace reported by gdb, because the
stack kept changing). Stack frames marked with '*' are those which I
suspect are due to a jump into la-la land; those marked with '+'
correspond to a longjmp call which unwound the stack back to _sigfe an
unknown number of times (at least twice).
*0x75a81136 in KERNEL32!GetPrivateProfileStructA () from
*0x6115e228 in WaitForSingleObject@8 () from /usr/bin/cygwin1.dll
*0x610d63e5 in muto::acquire (this=0x611700c0, ms=4294967295) at
*0x61077dbf in calloc (nmemb=1, size=44) at
*0x61003129 in operator new (s=44) at
*0x610ecece in pthread_mutex::init (mutex=0x67f0900c, attr=0x0,
initializer=0x14) at /home/Ryan/apps/cygwin-src/winsup/cygwin/thread.cc:2746
+0x610c68b5 in __sjfault () from /usr/bin/cygwin1.dll
+0x610eeb63 in pthread_mutex_lock (mutex=0x67f0900c) at
*0x610c6675 in _sigfe () from /usr/bin/cygwin1.dll
*0x610eeb00 in pthread_spinlock::init () at
*0x610c7dc7 in _sigfe_pthread_mutex_lock () from /usr/bin/cygwin1.dll
*0x67f08a40 in cyggcc_s-1!__gthread_mutex_unlock () from
0x67f054ad in cyggcc_s-1!__deregister_frame_info_bases () from
0x660010d9 in __gcc_deregister_frame () from
0x61021d1e in per_module::run_dtors (this=0x61251050) at
0x61161716 in dll::run_dtors (this=0x61251048) at
0x61022b36 in dll_list::detach (this=0x611e3440, retaddr=0x6600124d) at
#3 0x61022bea in cygwin_detach_dll () at
#4 0x610c6665 in _sigfe () from /usr/bin/cygwin1.dll
Very oddly, the parent process segfaults as well, in the same location
as the child, when it tries to exit. This only occurs when the child
crashes often enough that Windows fails to restart it. If the child crashes
once, but the next child succeeds, the parent does not fault:
#0 0x67f054bc in cyggcc_s-1!__deregister_frame_info_bases () from
#1 0x660010d9 in __gcc_deregister_frame () from
#2 0x61021d1e in per_module::run_dtors (this=0x61251050) at
#3 0x61161766 in dll::run_dtors (this=0x61251048) at
#4 0x61021d70 in dll_global_dtors () at
#5 0x611492b7 in __call_exitprocs (code=0, d=0x0) at
#6 0x6112152a in exit (code=0) at
#7 0x61005fcb in cygwin_exit (n=0) at
#8 0x610081c0 in _cygwin_exit_return () at
#9 0x61005b36 in _cygtls::call2 (this=0x28ce64, func=0x61007a50
<dll_crt0_1(void*)>, arg=0x0, buf=0x28cda4)
#10 0x61005bdb in _cygtls::call (func=0x61007a50 <dll_crt0_1(void*)>,
arg=0x0) at /home/Ryan/apps/cygwin-src/winsup/cygwin/cygtls.cc:62
#11 0x610079bf in _dll_crt0@0 () at
#12 0x004013c2 in cygwin_crt0 ()
#13 0x00401015 in mainCRTStartup ()