This is the mail archive of the cygwin mailing list for the Cygwin project.

RE: signal delivery problem (with pthreads)

> -----Original Message-----
> From: cygwin-owner On Behalf Of Valery A. Frolov
> Sent: 21 September 2004 22:52

> I've checked it and got the same bad result (crash) on 2000, 
> XP and Win98.
> I've installed cygwin bundle for compilation of sig_bug.c on XP, compiled
> sig_bug.c to sig_bug.exe and ran it. Nothing was changed - crash.
> So could someone who got the _successful_ run of sig_bug.exe with recently
> (>1.5.7-1) releases or snapshots of cygwin1.dll send it 
> (sig_bug.exe) to my personal e-mail? 

  Well, here you go; source as well, just in case you have more than one
version of your testcase lying around, so you know exactly what I was

[Actually, on second thoughts, I'm going to send this post to the list, so
I'll send you the files separately.  There have been further developments
while I was writing this post that I thought the list should be informed
of.]
  A funny thing happened to me on the way to email this to you, however:  I
thought I'd try running it again, and this time it crashed for the first
time!  However it still works fine almost all the time when run directly
from the command line, but I've noticed that when I run it in a loop with 

for i in 1 2 3 4 5 6 7 8 9 10 ; do ./sb.exe ; done

it crashes more often than not!  This is interesting, and suggests an
interaction with process spawn/forking.  The contents of the .stackdump file
are fairly interesting too:

Exception: STATUS_ACCESS_VIOLATION at eip=0085F030
eax=0000001F ebx=FFFFFFFF ecx=77E75A65 edx=0000001F esi=41516044
ebp=00860000 esp=0085F004 program=C:\artimi.src\davek\test\pthread\sb.exe,
pid 2352, thread unknown (0x208)
cs=001B ds=0023 es=0023 fs=0038 gs=0000 ss=0023
Stack trace:
Frame     Function  Args
00860000  0085F030  (00010000, BC5D1B48, 00000000, 0001000C)
End of stack trace

  This is very strange indeed.  The eip is in between ebp and esp; in other
words, we're executing on the stack!  And look at the value in ecx:  that
happens to be the ret instruction at the end of KERNEL32!IsBadWritePtr.
_Very_ interesting.  Hmm, now I've finally got it to crash under insight!
Although the stack (or perhaps only the stack pointer) has been somewhat
trashed, I can see enough of it...

(gdb) info registers
eax            0x1f	31
ecx            0x77e75a65	2011650661
edx            0x1f	31
ebx            0xffffffff	-1
esp            0xd5f004	0xd5f004
ebp            0xd60000	0xd60000
esi            0x415162eb	1095852779
edi            0x4	4
eip            0xd5f030	0xd5f030
eflags         0x10216	66070
cs             0x1b	27
ss             0x23	35
ds             0x23	35
es             0x23	35
fs             0x38	56
gs             0x0	0

(gdb) info frame
Stack level 0, frame at 0xd5f008:
 eip = 0xd5f030; saved eip 0x0
 Arglist at 0xd5f000, args: 
 Locals at 0xd5f000, Previous frame's sp is 0xd5f008
 Saved registers:
  eip at 0xd5f004

(gdb) frame
#0  0x00d5f030 in ?? ()

(gdb) x/64xw $esp-0x80
0xd5ef84:	0x00d5f208	0x61117310	0x00401090	0x00d5efcc
0xd5ef94:	0x000907d4	0x00160003	0x00d5efbc	0x610e2b47
0xd5efa4:	0x61117310	0x00401090	0x00d5efc8	0x00d60000
0xd5efb4:	0x00000000	0x0000001f	0x00d60000	0x00401163
0xd5efc4:	0x00401090	0x00000001	0x0000000f	0xfffefeff
0xd5efd4:	0x20000000	0x00d5f00c	0x00d5f00c	0x6108e0bc
0xd5efe4:	0x0000001e	0x00000000	0x00000004	0xffffffff
0xd5eff4:	0x415162eb	0x00000004	0x00d60000	0x00d5f030

0xd5f004:	0x00000000	0x00000246	0x00d5f048	0x00000000
0xd5f014:	0x00d5ef84	0x00d5ef84	0x00d5ef84	0x00d5f030
0xd5f024:	0x00000002	0x00000000	0x00000000	0x00000003
0xd5f034:	0x00000000	0x00d5f048	0x0a053c30	0x0a053c88
0xd5f044:	0x0a053c30	0x00d5f058	0x004011ae	0x00000003
0xd5f054:	0x0a053c30	0x00d5f098	0x610a97da	0x00000000
0xd5f064:	0xffffffff	0x00000000	0x00000000	0x00000001
0xd5f074:	0x00000000	0x00000000	0x00000000	0x00000000

  Right, what interesting values do we find there?  0x61117310 is in the
cygwin dll's data area, somewhere above reent_data, and what do we find

(gdb) x/xw 0x61117310
0x61117310 <reent_data+720>:	0x0a053cb8

(gdb) x/s 0x0a053cb8
0xa053cb8:	 "select was interrupted 1 times\n"

  Ok!  It's printf's output buffer!  That ties in with that 0x610e2b47 which
is the return address from the call within printf to vfprintf.

  Next, that 0x00401090 gets there because of this code in my_sleep:

0x00401153 <my_sleep+147>:	mov    %esi,0x4(%esp,1)
0x00401157 <my_sleep+151>:	movl   $0x401090,(%esp,1)
0x0040115e <my_sleep+158>:	call   0x401710 <printf>
0x00401163 <my_sleep+163>:	jmp    0x40114b <my_sleep+139>

(gdb) x/s 0x401090
0x401090 <sig_hnd+48>:	 "select was interrupted %d times\n"

which also explains the 0x00401163.

  So, we can conclude that the SEGV occurred due to some kind of
stack/register trashage that occurred in a call to vfprintf a couple of
stack layers below the printf in my_sleep.  This looks to me to be most
likely another instance of the known
cygwin-vs-threaded-stdio-race-condition(s) that we all hoped Tom Pfaff had
fixed recently....

>And please, specify your arguments for gcc, 
> version of gcc,
> version of cygwin1.dll and version of OS, for example:

  The compiler command was

  gcc -g -O2 sig_bug.c -o sb.exe

or perhaps -O0; I can't actually remember, but the rest of the line is
definitely exactly what I typed.  My version of the DLL is, as I said
before, homebrewed from CVS as of 20040714.  My OS is XP Pro SP1, and uname
output shows...

dk@mace ~> uname -srvmpio
CYGWIN_NT-5.1 1.5.11(0.116/4/2) 2004-07-14 17:31 i686 unknown unknown Cygwin

Can't think of a witty .sigline today....