Segfault in _cygwin_dll_entry

Peter A. Castro doctor@fruitbat.org
Thu Feb 5 23:51:00 GMT 2004


On Thu, 5 Feb 2004, Igor Pechtchanski wrote:

> On Thu, 5 Feb 2004, Larry Hall wrote:
>
> > At 06:10 AM 2/5/2004, peda@sectra.se you wrote:
> > >Hello!
> > >
> > >I have been trying to get LibGGI to build and work on cygwin and
> > >I have the following problem:

Interestingly enough, this is the same problem I've been tracking down
with regard to rebasing zsh's DLLs.
What follows is some analysis I've been doing on zsh and rebase.

> > >When I run an application linked against the resulting cygggi-2.dll
> > >it segfaults in _cygwin_dll_entry@12 (according to gdb) before main
> > >is reached.
> > >
> > >LibGGI consists of three core libraries: ggi, gii and gg.
> > >
> > >Now, cygggi-2.dll is linked against both cyggii-0.dll and
> > >cyggg-0.dll. cyggii-0.dll is in turn linked against cyggg-0.dll. I
> > >don't know if there is any relevance to this dependency tree, but
> > >I include it since programs linked directly against cyggii-0.dll
> > >does _not_ segfault.
> > >
> > >I can not find any relevant difference between how cygggi-2.dll and
> > >cyggii-0.dll are built. None of them defines a _cygwin_dll_entry,
> > >which means that cygwin should provide one, if I understand things
> > >correctly?

If I understand Cygwin correctly, the Cygwin runtime is invoked by the
native Windows mechanism for loading executables and DLLs.
cygwin_dll_entry() is the function that sets up the Cygwin-specific
environment and attaches it to the current process/thread of the DLL.
It appears this mechanism either has a bug or makes an assumption about
how things work which sometimes isn't correct.

> > >strace on the failing application produces no output.
> > >The only other dlls these dlls depend on (directly) are cygwin1.dll
> > >and kernel32.dll (according to depends.exe provided with MSVC)
> > >
> > >How can I get more info on what is going on?
> >
> > I'd suggest 'cygcheck cygggi-2.dll'.  Make sure there's no MS CRT in
> > there anywhere.  If there is, you're on dangerous ground.  Comparing the
> > output of this with that of cygii-0.dll might be instructive too.

In the case of zsh, it's completely cygwin stuff, no MS stuff.

> > >Is it a known problem?
> >
> > No.  If nothing "obvious" turns up in your initial efforts to scope the
> > problem, you're probably going to be best off debugging into the Cygwin
> > DLL to see where it crashes.
>
> One obvious thing to check for is whether the application tries to
> dynamically load a Cygwin-dependent DLL (which may result in attempting to
> load cygwin1.dll dynamically, and that is *not supported*).

I have yet to fully understand just where the fault is, but I do know
this: the .bss segment used by cygwin_dll_entry is sometimes not where
the code thinks it is.

I found this while debugging the zsh rebase problem, and so my methods
are a little quirky :)

First, rebase libzsh-4.1.1.dll, start zsh.exe under gdb, then run it.
It'll break with a segfault occurring inside _cygwin_dll_entry@12.  The
specific instruction is at _cygwin_dll_entry@12+146:

(gdb) disassemble
0x6ff40951 <_cygwin_dll_entry@12+129>:  call   0x6ff41390 <cygwin_detach_dll>
0x6ff40956 <_cygwin_dll_entry@12+134>:  mov    $0xffffffff,%eax
0x6ff4095b <_cygwin_dll_entry@12+139>:  mov    %eax,0x7fd98610
0x6ff40960 <_cygwin_dll_entry@12+144>:  jmp    0x6ff408fb <_cygwin_dll_entry@12+43>
0x6ff40962 <_cygwin_dll_entry@12+146>:  mov    %ecx,0x7fd985e0
                                                    ~~~~~~~~~~
0x6ff40968 <_cygwin_dll_entry@12+152>:  mov    $0x1,%eax
0x6ff4096d <_cygwin_dll_entry@12+157>:  mov    %eax,0x7fd985f0
0x6ff40972 <_cygwin_dll_entry@12+162>:  mov    %edx,0x7fd98600
0x6ff40978 <_cygwin_dll_entry@12+168>:  movl   $0x7fd908a0,0x4(%esp,1)
0x6ff40980 <_cygwin_dll_entry@12+176>:  mov    %ecx,(%esp,1)
0x6ff40983 <_cygwin_dll_entry@12+179>:  call   0x6ff413a0 <cygwin_attach_dll>

So, what's up with 0x7fd985e0?  gdb can neither resolve it to a symbol
nor access the address (hence the segfault):

(gdb) info symbol 0x7fd985e0
No symbol matches 0x7fd985e0.
(gdb) x/x 0x7fd985e0
0x7fd985e0:     Cannot access memory at address 0x7fd985e0
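
The faulting mov stores to an absolute 32-bit address that is embedded in
the instruction itself, so it can only be right if those four bytes were
patched to match the address the DLL was actually loaded at.  As a
minimal sketch in Python, assuming the usual IA-32 encoding
"89 0d <addr32>" for this mov (the bytes below are reconstructed from the
disassembly, not dumped from the DLL, and absolute_operand is a name made
up for illustration):

```python
import struct

# Assumed encoding of "mov %ecx,0x7fd985e0": opcode 0x89, ModRM 0x0d
# (reg=ecx, absolute disp32), then the address in little-endian order.
insn = bytes([0x89, 0x0D, 0xE0, 0x85, 0xD9, 0x7F])

def absolute_operand(code):
    """Return the 32-bit absolute address stored in bytes 2..5."""
    (addr,) = struct.unpack_from("<I", code, 2)
    return addr

print(hex(absolute_operand(insn)))
```

The six-byte length agrees with the disassembly, where the next
instruction starts at +152.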

Ok, so restore the un-rebased libzsh-4.1.1.dll, start zsh under gdb, set
a breakpoint at main and run it.  It'll stop at the breakpoint, no
faults.  Now, get the address of _cygwin_dll_entry@12 and have a look at
the same section of code:

(gdb) info address _cygwin_dll_entry@12
Symbol "_cygwin_dll_entry@12" is at 0x600f08d0 in a file compiled without debugging.
(gdb) disassemble
0x600f0951 <_cygwin_dll_entry@12+129>:  call   0x600f1390 <cygwin_detach_dll>
0x600f0956 <_cygwin_dll_entry@12+134>:  mov    $0xffffffff,%eax
0x600f095b <_cygwin_dll_entry@12+139>:  mov    %eax,0x600f8610
0x600f0960 <_cygwin_dll_entry@12+144>:  jmp    0x600f08fb <_cygwin_dll_entry@12+43>
0x600f0962 <_cygwin_dll_entry@12+146>:  mov    %ecx,0x600f85e0
                                                    ~~~~~~~~~~
0x600f0968 <_cygwin_dll_entry@12+152>:  mov    $0x1,%eax
0x600f096d <_cygwin_dll_entry@12+157>:  mov    %eax,0x600f85f0
0x600f0972 <_cygwin_dll_entry@12+162>:  mov    %edx,0x600f8600
0x600f0978 <_cygwin_dll_entry@12+168>:  movl   $0x600f08a0,0x4(%esp,1)
0x600f0980 <_cygwin_dll_entry@12+176>:  mov    %ecx,(%esp,1)
0x600f0983 <_cygwin_dll_entry@12+179>:  call   0x600f13a0 <cygwin_attach_dll>

(gdb) info symbol 0x600f85e0
storedHandle in section .bss
(gdb) info address storedHandle
Symbol "storedHandle" is at 0x600f85e0 in a file compiled without debugging.
(gdb) x/x 0x600f85e0
0x600f85e0 <storedHandle>:      0x00000000

Ah!  So, in the un-rebased scenario storedHandle is in a .bss section.
So, rebase libzsh-4.1.1.dll again, start zsh under gdb, and let it run.
It'll break with a segfault, again occurring inside _cygwin_dll_entry@12.

So, just where is storedHandle?

(gdb) info address storedHandle
Symbol "storedHandle" is at 0x6ff485e0 in a file compiled without debugging.
(gdb) info symbol 0x6ff485e0
storedHandle in section .bss
(gdb) x/x 0x6ff485e0
0x6ff485e0 <storedHandle>:      0x00000000

Ah, but the code thinks storedHandle is at 0x7fd985e0 (which isn't
addressable)!  It turns out that 0x6ff485e0 sits at the same offset from
the entry code as this part of the .bss did in the non-rebased scenario
(0x600f85e0).  So, where did things get messed up?  Did Windows load the
section and pass a bogus section address to the DLL, is there a bug in
the fixup code, or did cygwin_dll_entry() resolve the handle to the
address incorrectly?

I've looked at the code for cygwin_dll_entry and it's straightforward
enough, so I just don't see where things could have gone wrong.  Is this
perhaps a quirk of the C++ environment, or have we perhaps found a
Windows bug?
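
A quick sanity check on the addresses above, as a minimal Python sketch.
The variable names and the fixup() helper are made up for illustration,
and the script only does arithmetic on the values gdb printed; it does
not look at the DLL.  It suggests the bad operand is the correct address
plus one extra copy of the rebase delta, which would be consistent with
the relocation fixup being applied twice, though it does not prove that
is what happened:

```python
# Addresses copied from the gdb sessions above.
unrebased_operand = 0x600f85e0   # operand of the mov at +146, un-rebased
rebased_symbol    = 0x6ff485e0   # where gdb says storedHandle is after rebase
rebased_operand   = 0x7fd985e0   # operand of the mov at +146, rebased (faults)
unrebased_entry   = 0x600f08d0   # _cygwin_dll_entry@12, un-rebased
rebased_entry     = 0x6ff40962 - 146  # entry recovered from the "+146" label

# How far the code moved between the two runs.
delta = rebased_entry - unrebased_entry

def fixup(addr, delta):
    """Add a load delta to a 32-bit absolute address (wraps at 2**32)."""
    return (addr + delta) & 0xFFFFFFFF

once = fixup(unrebased_operand, delta)
twice = fixup(once, delta)

print(f"delta         = {delta:#010x}")
print(f"applied once  = {once:#010x}  (matches symbol: {once == rebased_symbol})")
print(f"applied twice = {twice:#010x}  (matches faulting operand: {twice == rebased_operand})")
```

If that reading is right, the place to look would be whatever adjusts the
operand a second time after rebase has already rewritten it.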

> 	Igor

-- 
Peter A. Castro <doctor@fruitbat.org> or <Peter.Castro@oracle.com>
	"Cats are just autistic Dogs" -- Dr. Tony Attwood
