[bug found] Re: cygwin hang problem

Joe Buehler jbuehler@hekimian.com
Fri Jul 19 19:09:00 GMT 2002

OK, I think I see what the problem may be.  In the dll_func_load
code (assembly language), the dll linkage code is patched (rewritten)
once the address of the loaded dll function is known.  The problem
is that there is a race -- the new opcode and its argument
are written separately.  What happens is this:

1. the opcode byte of a mov instruction is overwritten with 0xe9, turning it into a jmp
2. another thread executes the jmp before step 3 has happened
3. the proper offset is written into the newly created jmp instruction
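
The three steps above can be traced byte by byte. The following is only a rough sketch in Python (a simulation, not the actual dll_func_load assembly). It uses the stub bytes from the core quoted later in this message; HYPOTHETICAL_TARGET is a made-up address standing in for the resolved dll function, which the post does not give.

```python
import struct

# Address and original bytes of the linkage stub (mov 0x610f00bf,%eax),
# taken from the core described in this message.
STUB_ADDR = 0x610F00B8
stub = bytearray([0xA1, 0xBF, 0x00, 0x0F, 0x61])

# Assumption for illustration only: the real function address is not in the post.
HYPOTHETICAL_TARGET = 0x77DD1234

snapshots = []

# Step 1: only the opcode byte changes; the operand is still the mov's address.
stub[0] = 0xE9
# Step 2: this is the state another thread can fetch and execute.
snapshots.append(bytes(stub))

# Step 3: the rel32 operand is written, measured from the end of the 5-byte jmp.
rel32 = (HYPOTHETICAL_TARGET - (STUB_ADDR + 5)) & 0xFFFFFFFF
stub[1:5] = struct.pack("<I", rel32)
snapshots.append(bytes(stub))

for label, snap in zip(("after step 1", "after step 3"), snapshots):
    print(label + ":", " ".join(f"0x{b:02x}" for b in snap))
```

The first snapshot is a complete, decodable jmp whose operand was never meant to be a relative displacement, which is the window in which the second thread goes wrong.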

Since the operand of the mov instruction is an absolute address (an offset
from the beginning of the segment), and the jmp interprets the same four
bytes as an EIP-relative offset, the net effect is that the jmp goes off in
the weeds.  This also explains the observed behavior of a jump to roughly
twice the address of the linkage data -- the mov instruction references
memory just a few bytes further down, so the address of the instruction and
the address in its operand are nearly equal, and the jmp adds the two together.

In the core that I am looking at, here is what is at win32_CopySid@12:

0x610f00b8: 0xa1 0xbf 0x00 0x0f 0x61 # mov 0x610f00bf,%eax

This becomes -- at just the wrong moment:

0x610f00b8: 0xe9 0xbf 0x00 0x0f 0x61 # jmp %eip+0x610f00bf
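
To check the "twice the value" observation against these bytes, here is a small Python sketch that decodes the torn instruction. It assumes the standard 32-bit x86 encoding of jmp rel32 (opcode 0xe9, signed little-endian displacement, relative to the address of the following instruction); the function name jmp_rel32_target is made up for this example.

```python
import struct

def jmp_rel32_target(addr, code):
    """Return the destination of a 5-byte jmp rel32 (opcode 0xe9) located at addr."""
    if code[0] != 0xE9:
        raise ValueError("not a jmp rel32")
    (rel32,) = struct.unpack("<i", code[1:5])  # signed little-endian displacement
    # The displacement is relative to the end of the instruction (addr + 5).
    return (addr + 5 + rel32) & 0xFFFFFFFF

torn = bytes([0xE9, 0xBF, 0x00, 0x0F, 0x61])  # the half-patched stub
data_addr = 0x610F00BF                        # operand of the original mov

target = jmp_rel32_target(0x610F00B8, torn)
print(hex(target))         # 0xc21e017c
print(hex(2 * data_addr))  # 0xc21e017e
```

The two values differ by only 2, because the stub sits 7 bytes before the data it references and the jmp itself is 5 bytes long.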

So the locking needs some changing in the dll linkage code.  There is in fact
a comment above dll_func_load that the code may not be thread safe!

Joe Buehler

