Debugging help for fork failure: resource temporarily unavailable

Ryan Johnson ryanjohn@ece.cmu.edu
Thu Apr 14 18:21:00 GMT 2011


On 2:59 PM, Ryan Johnson wrote:
> I wrote a very simple program whose main() prints out the contents of 
> /proc/self/maps, forks, calls foo() and bar(), and finally (if the 
> parent) calls wait().
>
> The trick is, foo() and bar() reside in cygfoo.dll and cygbar.dll 
> respectively, which I compiled to have the same base address: 0x66000000.
>
> The running binary often, but not always, results in those annoying 
> "exception::handle: Exception: STATUS_ACCESS_VIOLATION" messages (the 
> process otherwise appears to complete normally most of the time). 
> However, once in a while the child fails to spawn, with no particular 
> error message to advertise that fact.

After downloading windbg, I was able to capture the stack traces of 
access violations. It turns out there's no retry code at all; it's just 
some access violations cause the exception handler to crash with a 
variable number of additional violations.

At this point I'm officially in over my head. The output below is 
hopefully useful to somebody who knows a lot more than I do about the 
code which is involved here.

Help? Ideas?
Ryan


The first access violation occurred in la-la land:
WARNING: FrameIP not in any known module. Following frames may be wrong.
610203ea 0x5c0073
6111ccca cygwin1!dlerror+0x6aa
610050ab cygwin1!wordfree+0x4d4a
61007040 cygwin1!setprogname+0x36bb
61004c96 cygwin1!dll_crt0__FP11per_process+0x710
61004d3b cygwin1!setprogname+0x32a6
004013c2 cygwin1!setprogname+0x334b
00401015 image00000000_00400000+0x13c2
756633ca image00000000_00400000+0x1015
77619ed2 kernel32!BaseThreadInitThunk+0x12
77619ea5 ntdll32!RtlInitializeExceptionChain+0x63
00000000 ntdll32!RtlInitializeExceptionChain+0x36

Passing the above through gdb's disassembler:
0x5c0073
<_ZN10per_module9run_dtorsEv+24>:    call   *%eax
<exit+21>:   call   0x61144970 <__call_exitprocs>
<cygwin_exit+22>:    call   0x6111ccb0 <exit>
<_cygwin_exit_return+526>:   call   0x61005090 <cygwin_exit>
<cygwin_crt0+28>:    call   *0x405120
<mainCRTStartup+16>: call   0x4013a0 <cygwin_crt0>
kernel32!BaseThreadInitThunk+0x12
... ntdll calls ...

Continuing with windbg from there gave the following:
cYgSiGw00f 11 0x2030 0x28CE8CHEAP[fork.exe]: HEAP: Free Heap block 
2b3ac0 modified at 2b4000 after it was freed
ntdll!EtwpCreateEtwThread+0x1721

Continuing on again:
WARNING: Stack unwind information not available. Following frames may be 
wrong.
660010d9 cyggcc_s_1!_deregister_frame_info_bases+0x48
610203ea cygbar+0x10d9
6111ccca cygwin1!dlerror+0x6aa
77638fba ntdll32!RtlQueryEnvironmentVariable+0x241
77638e5c ntdll32!LdrShutdownProcess+0x141
75667a25 ntdll32!RtlExitUserProcess+0x74
6109d683 kernel32!ExitProcess+0x15
61005360 cygwin1!cygwin32_posix_path_list_p+0xa2f3
610283e1 cygwin1!setprogname+0x3970
61029034 cygwin1!getenv+0x4611
77626ab9 cygwin1!getenv+0x5264
77626a8b ntdll32!RtlDosSearchPath_Ustr+0xada
775f0143 ntdll32!RtlDosSearchPath_Ustr+0xaac
6111ccca ntdll32!KiUserExceptionDispatcher+0xf
610050ab cygwin1!wordfree+0x4d4a
61007040 cygwin1!setprogname+0x36bb
(rest continues as in the first stack trace)

Again decoding with gdb:
<__gcc_deregister_frame+48>: movl   $0x66003000,(%esp)
<_ZN10per_module9run_dtorsEv+24>:    call   *%eax
<exit+21>:   call   0x61144970 <__call_exitprocs>
... ntdll calls...
kernel32!ExitProcess+0x15
<_ZN5pinfo4exitEm+270>:      call   0x61159c90 <ExitProcess@4>
<_Z7do_exiti@4+587>: call   0x6109d570 <_ZN5pinfo4exitEm>
<_ZN7_cygtls11signal_exitEi+284>:    call   0x61005110 <_Z7do_exiti@4>
<_ZN9exception6handleEP17_EXCEPTION_RECORDP15_exception_listP8_CONTEXTPv+1295>:      
call   0x610282c0 <_ZN7_cygtls11signal_exitEi>
... ntdll calls ...
ntdll32!KiUserExceptionDispatcher+0xf
... below this is the stack trace from the original access violation ...

Continuing yet again gives a third access violation inside ntdll32, due 
to a second call to KiUserExceptionDispatcher:
WARNING: Stack unwind information not available. Following frames may be 
wrong.
0028c080 77626ab9 ntdll32!RtlUlonglongByteSwap+0x1605d
0028c0a4 77626a8b ntdll32!RtlDosSearchPath_Ustr+0xada
0028c154 775f0143 ntdll32!RtlDosSearchPath_Ustr+0xaac
0028c4bc 660010d9 ntdll32!KiUserExceptionDispatcher+0xf
0028c4cc 610203ea cygbar+0x10d9
... continues as before ...

At this point we're in an endless loop of access violations in the user 
exception dispatcher leading to the exception dispatcher raising an 
access violation... I figured this would lead to a stack overflow and 
detached.

Surprisingly, the output of the cygwin app contained no mention of 
access violations. However, the errors prevented the child process from 
running global destructors in its dlls:
Child exiting
* * * fork.cpp fini
Parent after fork (child: 7944)
Parent exiting
* * * fork.cpp fini
* * * foo.cpp fini
* * * bar.cpp fini


Rerunning the process a few times turned up a different access 
violation, this one definitely fork-related:
WARNING: Stack unwind information not available. Following frames may be 
wrong.
0028c7dc 61020d38 cygwin1!dlerror+0x74b
0028c7fc 61004c96 cygwin1!dlfork+0x808
0028ce64 00000000 cygwin1!setprogname+0x32a6

Decoded:
<_ZN3dll4initEv+27>: mov    %eax,(%edx)
<_Z13dll_dllcrt0_1Pv+307>:   call   0x61020470 <_ZN3dll4initEv>
<_ZN7_cygtls5call2EPFmPvS0_ES0_S0_+52>:      call   *%ebp

Continuing twice more:
cYgSiGw00f 11 0x20B4 0x28CE8CModLoad: 00000000`76980000 
00000000`76985000   C:\Windows\SysWOW64\psapi.dll
(1918.20b4): Invalid handle - code c0000008 (first chance)
ntdll!KiRaiseUserExceptionDispatcher+0x3a:
00000000`774512f7 8b8424c0000000  mov     eax,dword ptr [rsp+0C0h] 
ss:00000000`0008e310=c0000008
...
(1918.20b4): Invalid handle - code c0000008 (!!! second chance !!!)
ntdll32!ZwClose+0x12:
775ff9e2 83c404          add     esp,4

Sure enough, cygwin noticed this time:
       1 [main] fork 6424 exception::handle: Exception: 
STATUS_ACCESS_VIOLATION
    2997 [main] fork 6424 open_stackdumpfile: Dumping stack trace to 
fork.exe.stackdump
       1 [main] fork 9204 child_info::sync: wait failed, pid 6424, Win32 
error 1812
221888192 [main] fork 9204 fork: child -1 - died waiting for longjmp 
before initialization, retry 10, exit code 0x1000000, errno 11
Parent after fork (child: -1)
Parent exiting
* * * fork.cpp fini
* * * foo.cpp fini
* * * bar.cpp fini




--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple



More information about the Cygwin mailing list