This is the mail archive of the
cygwin
mailing list for the Cygwin project.
Re: Debugging help for fork failure: resource temporarily unavailable
- From: Ryan Johnson <ryanjohn at ece dot cmu dot edu>
- To: Ryan Johnson <ryanjohn at ece dot cmu dot edu>
- Cc: Jon TURNEY <jon dot turney at dronecode dot org dot uk>, cygwin at cygwin dot com
- Date: Thu, 14 Apr 2011 10:20:36 -0400
- Subject: Re: Debugging help for fork failure: resource temporarily unavailable
- References: <4DA621E7.8040905@ece.cmu.edu>
On 2:59 PM, Ryan Johnson wrote:
I wrote a very simple program whose main() prints out the contents of
/proc/self/maps, forks, calls foo() and bar(), and finally (if the
parent) calls wait().
The trick is, foo() and bar() reside in cygfoo.dll and cygbar.dll
respectively, which I compiled to have the same base address: 0x66000000.
The running binary often, but not always, results in those annoying
"exception::handle: Exception: STATUS_ACCESS_VIOLATION" messages (the
process otherwise appears to complete normally most of the time).
However, once in a while the child fails to spawn, with no particular
error message to advertise that fact.
After downloading windbg, I was able to capture the stack traces of
access violations. It turns out there's no retry code at all; it's just
some access violations cause the exception handler to crash with a
variable number of additional violations.
At this point I'm officially in over my head. The output below is
hopefully useful to somebody who knows a lot more than I do about the
code which is involved here.
Help? Ideas?
Ryan
The first access violation occurred in la-la land:
WARNING: FrameIP not in any known module. Following frames may be wrong.
610203ea 0x5c0073
6111ccca cygwin1!dlerror+0x6aa
610050ab cygwin1!wordfree+0x4d4a
61007040 cygwin1!setprogname+0x36bb
61004c96 cygwin1!dll_crt0__FP11per_process+0x710
61004d3b cygwin1!setprogname+0x32a6
004013c2 cygwin1!setprogname+0x334b
00401015 image00000000_00400000+0x13c2
756633ca image00000000_00400000+0x1015
77619ed2 kernel32!BaseThreadInitThunk+0x12
77619ea5 ntdll32!RtlInitializeExceptionChain+0x63
00000000 ntdll32!RtlInitializeExceptionChain+0x36
Passing the above through gdb's disassembler:
0x5c0073
<_ZN10per_module9run_dtorsEv+24>: call *%eax
<exit+21>: call 0x61144970 <__call_exitprocs>
<cygwin_exit+22>: call 0x6111ccb0 <exit>
<_cygwin_exit_return+526>: call 0x61005090 <cygwin_exit>
<cygwin_crt0+28>: call *0x405120
<mainCRTStartup+16>: call 0x4013a0 <cygwin_crt0>
kernel32!BaseThreadInitThunk+0x12
... ntdll calls ...
Continuing with windbg from there gave the following:
cYgSiGw00f 11 0x2030 0x28CE8CHEAP[fork.exe]: HEAP: Free Heap block
2b3ac0 modified at 2b4000 after it was freed
ntdll!EtwpCreateEtwThread+0x1721
Continuing on again:
WARNING: Stack unwind information not available. Following frames may be
wrong.
660010d9 cyggcc_s_1!_deregister_frame_info_bases+0x48
610203ea cygbar+0x10d9
6111ccca cygwin1!dlerror+0x6aa
77638fba ntdll32!RtlQueryEnvironmentVariable+0x241
77638e5c ntdll32!LdrShutdownProcess+0x141
75667a25 ntdll32!RtlExitUserProcess+0x74
6109d683 kernel32!ExitProcess+0x15
61005360 cygwin1!cygwin32_posix_path_list_p+0xa2f3
610283e1 cygwin1!setprogname+0x3970
61029034 cygwin1!getenv+0x4611
77626ab9 cygwin1!getenv+0x5264
77626a8b ntdll32!RtlDosSearchPath_Ustr+0xada
775f0143 ntdll32!RtlDosSearchPath_Ustr+0xaac
6111ccca ntdll32!KiUserExceptionDispatcher+0xf
610050ab cygwin1!wordfree+0x4d4a
61007040 cygwin1!setprogname+0x36bb
(rest continues as in the first stack trace)
Again decoding with gdb:
<__gcc_deregister_frame+48>: movl $0x66003000,(%esp)
<_ZN10per_module9run_dtorsEv+24>: call *%eax
<exit+21>: call 0x61144970 <__call_exitprocs>
... ntdll calls...
kernel32!ExitProcess+0x15
<_ZN5pinfo4exitEm+270>: call 0x61159c90 <ExitProcess@4>
<_Z7do_exiti@4+587>: call 0x6109d570 <_ZN5pinfo4exitEm>
<_ZN7_cygtls11signal_exitEi+284>: call 0x61005110 <_Z7do_exiti@4>
<_ZN9exception6handleEP17_EXCEPTION_RECORDP15_exception_listP8_CONTEXTPv+1295>:
call 0x610282c0 <_ZN7_cygtls11signal_exitEi>
... ntdll calls ...
ntdll32!KiUserExceptionDispatcher+0xf
... below this is the stack trace from the original access violation ...
Continuing yet again gives a third access violation inside ntdll32, due
to a second call to KiUserExceptionDispatcher:
WARNING: Stack unwind information not available. Following frames may be
wrong.
0028c080 77626ab9 ntdll32!RtlUlonglongByteSwap+0x1605d
0028c0a4 77626a8b ntdll32!RtlDosSearchPath_Ustr+0xada
0028c154 775f0143 ntdll32!RtlDosSearchPath_Ustr+0xaac
0028c4bc 660010d9 ntdll32!KiUserExceptionDispatcher+0xf
0028c4cc 610203ea cygbar+0x10d9
... continues as before ...
At this point we're in an endless loop of access violations in the user
exception dispatcher leading to the exception dispatcher raising an
access violation... I figured this would lead to a stack overflow and
detached.
Surprisingly, the output of the cygwin app contained no mention of
access violations. However, the errors prevented the child process from
running global destructors in its dlls:
Child exiting
* * * fork.cpp fini
Parent after fork (child: 7944)
Parent exiting
* * * fork.cpp fini
* * * foo.cpp fini
* * * bar.cpp fini
Rerunning the process a few times turned up a different access
violation, this one definitely fork-related:
WARNING: Stack unwind information not available. Following frames may be
wrong.
0028c7dc 61020d38 cygwin1!dlerror+0x74b
0028c7fc 61004c96 cygwin1!dlfork+0x808
0028ce64 00000000 cygwin1!setprogname+0x32a6
Decoded:
<_ZN3dll4initEv+27>: mov %eax,(%edx)
<_Z13dll_dllcrt0_1Pv+307>: call 0x61020470 <_ZN3dll4initEv>
<_ZN7_cygtls5call2EPFmPvS0_ES0_S0_+52>: call *%ebp
Continuing twice more:
cYgSiGw00f 11 0x20B4 0x28CE8CModLoad: 00000000`76980000
00000000`76985000 C:\Windows\SysWOW64\psapi.dll
(1918.20b4): Invalid handle - code c0000008 (first chance)
ntdll!KiRaiseUserExceptionDispatcher+0x3a:
00000000`774512f7 8b8424c0000000 mov eax,dword ptr [rsp+0C0h]
ss:00000000`0008e310=c0000008
...
(1918.20b4): Invalid handle - code c0000008 (!!! second chance !!!)
ntdll32!ZwClose+0x12:
775ff9e2 83c404 add esp,4
Sure enough, cygwin noticed this time:
1 [main] fork 6424 exception::handle: Exception:
STATUS_ACCESS_VIOLATION
2997 [main] fork 6424 open_stackdumpfile: Dumping stack trace to
fork.exe.stackdump
1 [main] fork 9204 child_info::sync: wait failed, pid 6424, Win32
error 1812
221888192 [main] fork 9204 fork: child -1 - died waiting for longjmp
before initialization, retry 10, exit code 0x1000000, errno 11
Parent after fork (child: -1)
Parent exiting
* * * fork.cpp fini
* * * foo.cpp fini
* * * bar.cpp fini
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple