This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: Re: Re: Debugging help for fork failure: resource temporarily unavailable
- From: Ryan Johnson <ryanjohn at ece dot cmu dot edu>
- Cc: Jon TURNEY <jon dot turney at dronecode dot org dot uk>, cygwin at cygwin dot com
- Date: Wed, 13 Apr 2011 18:21:27 -0400
- Subject: Re: Re: Re: Debugging help for fork failure: resource temporarily unavailable
- References: <4DA5EF8C.8060206@ece.cmu.edu>
On 2:59 PM, Ryan Johnson wrote:
On 2:59 PM, Jon TURNEY wrote:
I look forward to reading your patches :-)
I think it's still rather premature to be cooking up a patch,
unfortunately -- I'm not convinced I know yet where the real problem
lies. Without some data to back up my speculation (which seems hard to
come by), any patch I might write would have a high probability of
joining other accumulated band-aids such as reserve_upto().
Open questions (for my ignorant self, at least) include:
- Does Windows always load a given dll at the same address when its
base address is already occupied?
- Does fork() always load DLLs in the same order that the parent
loaded them? This would probably be helpful to know even in cases
where no error arises, because it's a necessary precursor to fork
failures, and the code seems to assume it's true.
- Is it ever possible for fork() to unload BLODA dlls?
- Do injected dlls arrive before or after statically-linked dlls? Or
can it be either one?
- At fork time, does cygwin mogrify some generic child process to look
like the parent, or is the child another "normal" run of the parent's
executable image followed by plastic surgery to make heap, stack, etc.
match? I had been assuming the former, but should probably ask.
Update: I wrote a very simple program whose main() prints out the
contents of /proc/self/maps, forks, calls foo() and bar(), and finally
(if the parent) calls wait().
The trick is, foo() and bar() reside in cygfoo.dll and cygbar.dll
respectively, which I compiled to have the same base address: 0x66000000.
Running the binary often, but not always, results in those annoying
"exception::handle: Exception: STATUS_ACCESS_VIOLATION" messages (the
process otherwise appears to complete normally most of the time).
However, once in a while the child fails to spawn, with no particular
error message to advertise that fact.
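For reference, the test program described above could be sketched roughly as follows. This is a reconstruction, not the actual fork.cpp: foo() and bar() are defined locally as stand-ins (in the experiment they live in cygfoo.dll and cygbar.dll, both built with base address 0x66000000), and the helper names dump_maps() and run_fork_test() are made up for illustration.

```cpp
#include <cstdio>
#include <fstream>
#include <iostream>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Stand-ins (assumption): in the experiment these are exported by
// cygfoo.dll and cygbar.dll, which share the same preferred base address.
void foo() { std::puts("foo called"); }
void bar() { std::puts("bar called"); }

// Print this process's memory mappings; returns the number of lines printed.
int dump_maps() {
    std::ifstream maps("/proc/self/maps");
    std::string line;
    int count = 0;
    while (std::getline(maps, line)) {
        std::cout << line << '\n';
        ++count;
    }
    return count;
}

// Dump the maps, fork, call foo() and bar() in both processes, and have the
// parent wait for the child.
// Returns 0 if the child exited cleanly, 1 if it did not, -1 if fork() or
// waitpid() failed.
int run_fork_test() {
    dump_maps();
    std::puts("Before fork");
    // Flush so buffered output is not duplicated into the child.
    std::cout.flush();
    std::fflush(stdout);

    pid_t pid = fork();
    if (pid < 0) {
        std::perror("fork");
        return -1;
    }

    foo();
    bar();

    if (pid == 0) {
        std::fflush(stdout);
        _exit(0);  // child: leave without running the parent's cleanup
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        std::perror("waitpid");
        return -1;
    }
    std::printf("Parent after fork (child: %d)\n", static_cast<int>(pid));
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

Calling run_fork_test() from main() gives the same overall shape as the experiment. To actually provoke the collision, foo() and bar() would have to be moved into two separate DLLs linked at the same base address.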
Running inside gdb (inside a plain cygwin window) gives the following
(I'm on Win7 x64, with all the latest packages as of yesterday afternoon):
$ gdb
GNU gdb 6.8.0.20080328-cvs (cygwin-special)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-cygwin".
(gdb) file fork
Reading symbols from /home/Ryan/experiments/fork-tests/fork...done.
(gdb) run
Starting program: /home/Ryan/experiments/fork-tests/fork
[New thread 8864.0x2120]
Error: dll starting at 0x77190000 not found.
Error: dll starting at 0x75650000 not found.
Error: dll starting at 0x77190000 not found.
Error: dll starting at 0x76d20000 not found.
[New thread 8864.0x2710]
+ + + bar.cpp init
+ + + foo.cpp init
+ + + fork.cpp init
00400000-00410000 rw-s 00401000 2C36:17C8 33776997205430206
/home/Ryan/experiments/fork-tests/fork.exe
775E0000-77760000 r-xs 00000000 2C36:17C8 281474976927378
/cygdrive/c/Windows/SysWOW64/ntdll.dll
75650000-75760000 r-xs 756632D3 2C36:17C8 281474976927037
/cygdrive/c/Windows/syswow64/kernel32.dll
75350000-75396000 r-xs 75357478 2C36:17C8 281474976925120
/cygdrive/c/Windows/syswow64/KERNELBASE.dll
66000000-66012000 rw-s 660011F0 2C36:17C8 3940649674730545
/home/Ryan/experiments/fork-tests/cygbar.dll
61000000-61450000 r-xs 6106F960 2C36:17C8 844424930325032
/usr/bin/cygwin1.dll
75A60000-75B00000 r-xs 75A749E5 2C36:17C8 281474976927159
/cygdrive/c/Windows/syswow64/ADVAPI32.DLL
75050000-750FC000 rw-s 7505A472 2C36:17C8 281474976749314
/cygdrive/c/Windows/syswow64/msvcrt.dll
76840000-76859000 r-xs 76844975 2C36:17C8 281474976749841
/cygdrive/c/Windows/SysWOW64/sechost.dll
76750000-76840000 r-xs 76760569 2C36:17C8 281474976924963
/cygdrive/c/Windows/syswow64/RPCRT4.dll
74CD0000-74D30000 r-xs 74CEA3B3 2C36:17C8 281474976924512
/cygdrive/c/Windows/syswow64/SspiCli.dll
74CC0000-74CCC000 r-xp 74CC10E1 2C36:17C8 281474976748415
/cygdrive/c/Windows/syswow64/CRYPTBASE.dll
67F00000-67F0F000 rw-s 67F08920 2C36:17C8 562949954003711
/usr/bin/cyggcc_s-1.dll
6C480000-6C545000 rw-s 6C485110 2C36:17C8 562949954003739
/usr/bin/cygstdc++-6.dll
002B0000-002C2000 rw-p 002B11F0 2C36:17C8 2533274791177101
/home/Ryan/experiments/fork-tests/cygfoo.dll
753A0000-754A0000 rw-p 753BB6ED 2C36:17C8 281474976926904
/cygdrive/c/Windows/system32/user32.dll
74FC0000-75050000 rw-p 74FD6343 2C36:17C8 281474976926610
/cygdrive/c/Windows/syswow64/GDI32.dll
754D0000-754DA000 rw-p 754D36A0 2C36:17C8 281474976749103
/cygdrive/c/Windows/syswow64/LPK.dll
757D0000-7586D000 rw-p 75803FD7 2C36:17C8 281474976927082
/cygdrive/c/Windows/syswow64/USP10.dll
754E0000-75540000 r-xp 754F158F 2C36:17C8 281474976924115
/cygdrive/c/Windows/system32/IMM32.DLL
74EF0000-74FBC000 rw-p 74EF168B 2C36:17C8 281474976749206
/cygdrive/c/Windows/syswow64/MSCTF.dll
76980000-76985000 rw-p 76981438 2C36:17C8 281474976749672
/cygdrive/c/Windows/system32/psapi.dll
Before fork
0 [main] fork 9472 exception::handle: Exception:
STATUS_ACCESS_VIOLATION
559 [main] fork 9472 open_stackdumpfile: Dumping stack trace to
fork.exe.stackdump
0 [main] fork 9132 exception::handle: Exception:
STATUS_ACCESS_VIOLATION
525 [main] fork 9132 open_stackdumpfile: Dumping stack trace to
fork.exe.stackdump
0 [main] fork 7812 exception::handle: Exception:
STATUS_ACCESS_VIOLATION
531 [main] fork 7812 open_stackdumpfile: Dumping stack trace to
fork.exe.stackdump
0 [main] fork 7648 exception::handle: Exception:
STATUS_ACCESS_VIOLATION
521 [main] fork 7648 open_stackdumpfile: Dumping stack trace to
fork.exe.stackdump
0 [main] fork 1960 exception::handle: Exception:
STATUS_ACCESS_VIOLATION
657 [main] fork 1960 open_stackdumpfile: Dumping stack trace to
fork.exe.stackdump
0 [main] fork 4480 exception::handle: Exception:
STATUS_ACCESS_VIOLATION
914 [main] fork 4480 open_stackdumpfile: Dumping stack trace to
fork.exe.stackdump
0 [main] fork 8864 fork: child -1 - died waiting for longjmp
before initialization, retry 0, exit code 0x600, errno 11
Parent after fork (child: -1)
Parent exiting
* * * fork.cpp fini
* * * foo.cpp fini
* * * bar.cpp fini
Program exited normally.
The above raises several interesting questions:
1. Why doesn't /proc/self/maps contain all the dlls gdb complains about?
x7565 is kernel32.dll, but there's no sign of x7719 or x76d2. I tried
nirsoft's 'InjectedDLL' but none of the dlls it finds have those bases,
and windbg doesn't report them either.
2. What determines which of the many bad things can happen at fork()
time? I've seen "resource temporarily unavailable", "died waiting for
longjmp", and now this "STATUS_ACCESS_VIOLATION" (which invariably
happens an even number of times but is usually not fatal).
3. What code is raising the access violation, and is there a way to make
gdb catch it?
(gdb) catch load
catch of library loads not yet implemented on this platform
(gdb) catch throw
Function "__cxa_throw" not defined.
(gdb) catch exception
Unable to insert catchpoint. Is this an Ada main program?
(gdb) catch signal SIGSEGV
Catch of signal not yet implemented
4. Strace shows that each pair of access violations corresponds to a
failed attempt at forking. I guess after three failures cygwin gives up
and triggers the waiting-for-longjmp error?
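Regarding question 1, checking whether a base address reported by gdb shows up in a maps listing can be automated with something like the sketch below. The function name maps_contain_base() is hypothetical, and the parser assumes only the "START-END ..." hex prefix visible in the /proc/self/maps output above. Lines that do not start with such a range (for example the wrapped path lines) are skipped.

```cpp
#include <cstdio>
#include <istream>
#include <sstream>
#include <string>

// Return true if some mapping in the listing starts exactly at `base`.
// Each mapping line is expected to begin with "START-END" in hexadecimal;
// any line that does not parse that way is ignored.
bool maps_contain_base(std::istream& maps, unsigned long long base) {
    std::string line;
    while (std::getline(maps, line)) {
        unsigned long long start = 0, end = 0;
        if (std::sscanf(line.c_str(), "%llx-%llx", &start, &end) == 2 &&
            start == base) {
            return true;
        }
    }
    return false;
}
```

Fed the listing above, this would find 0x75650000 (kernel32.dll) but not 0x77190000 or 0x76d20000, matching what I saw by eye. It can be given a std::ifstream on /proc/self/maps or a std::istringstream holding a saved dump.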
Unfortunately I haven't been able to reproduce the resource unavailable
flavor of error yet...
Thoughts?
Ryan