This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Snapshot 20040225: make hangs/errors out


On Thu, Mar 04, 2004 at 10:59:48AM -0500, Christopher Faylor wrote:
>On Wed, Mar 03, 2004 at 09:14:28PM -0500, Christopher Faylor wrote:
>>On Wed, Mar 03, 2004 at 06:16:55PM -0500, Rolf Campbell wrote:
>>>Christopher Faylor wrote:
>>>>>>No, but I'll try to catch one.  (I removed the strace from my script.)
>>>>>
>>>>>Ok, caught two already.  (Produced with attached script + Makefile)
>>>>
>>>>Not much there, unfortunately.
>>>>
>>>>Out of curiosity, can you duplicate this problem with the snapshot?  I
>>>>see that this is your own build, probably built with
>>>>--enable-debugging.
>>>>
>>>>I've been diligently testing things with the snapshot rather than my
>>>>own build because I was trying to debug what was in the subject.
>>>>Snapshots aren't built with --enable-debugging.  If this is just an
>>>>artifact from building with --enable-debugging, then I'm not too
>>>>worried.
>>>
>>>Ok, I've been running the script with the '25 snapshot all day, with 44
>>>failures.  All the same type of failures I was seeing with the cvs
>>>(with --enable-debugging).  Unfortunately, the ethernet card on my home
>>>machine broke, so for now I'll upload one of the strace files to a
>>>Geocities site.  Nothing looks suspicious to me in the strace; maybe
>>>it's a bug in make?  http://www.geocities.com/endlisnis/Temp/freeze.zip
>>
>>Thanks.  Unfortunately, I don't see anything more here than in the other
>>strace output.
>>
>>I did manage to duplicate this after 1437 repetitions or so.  My strace
>>didn't show anything either, unfortunately, but now maybe I can slowly
>>get to the bottom of the problem.
>
>Weird.  Now that I've managed to duplicate it, I can do so at will.  I
>guess that's good news.
>
>I see what is causing the symptom but not what is causing the problem.
>I spent a sleepless night modelling multi-threaded signal interrupts
>in my head but I'm still not any closer to understanding the problem.
>
>The problem is that malloc allocates some memory, puts the address of
>the memory in the eax register, and then returns.  In the meantime, two
>signals have come in, so rather than return immediately, malloc returns
>to the signal handler and then the signal handler is called again.  In
>some cases, this causes the eax register to become zero and so make
>(rightly) complains.  In theory, this shouldn't happen since the eax
>register should have been saved on the stack.
>
>Nope.  Typing an explanation doesn't help me figure this out.  Bummer.

I think I may have figured this out.

It wasn't the eax register being zeroed.  It was actually the test for
zero returning improper values due to being interrupted by a signal.
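
The message does not show the actual change, so the following is only a
generic illustration of the class of problem, not the Cygwin fix.  One
common way to keep a short check from being disturbed by a signal is to
block that signal for the duration of the check and let it be delivered
afterwards.  The sketch below assumes a POSIX-style Python (where
signal.pthread_sigmask is available) and is run from the main thread; the
names run_with_signal_deferred and demo are hypothetical.

```python
import signal


def run_with_signal_deferred(signum, critical):
    """Run critical() with signum blocked and return its result.

    A signal that arrives while critical() runs stays pending and is
    delivered only when the previous mask is restored.
    """
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        return critical()
    finally:
        # Restoring the mask lets any pending signal through.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


def demo():
    """Record the order of events when a signal arrives mid-check."""
    events = []
    previous = signal.signal(
        signal.SIGUSR1, lambda signum, frame: events.append("handler")
    )
    try:
        def critical():
            events.append("check-start")
            # Simulate a signal arriving in the middle of the check.
            signal.raise_signal(signal.SIGUSR1)
            events.append("check-end")
            return 42

        result = run_with_signal_deferred(signal.SIGUSR1, critical)
    finally:
        # Put back whatever handler was installed before the demo.
        signal.signal(signal.SIGUSR1, previous)
    return result, events
```

Calling demo() returns the result of the check together with the recorded
event order, which shows whether the handler ran inside or after the check.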

I made a fix last night that allowed me to run this for 2500+
iterations.  Of course, I have managed to do that before without error,
so that doesn't mean much, I guess.  Backing the change out resulted in
a 'virtual memory exhausted' error in less than a hundred iterations,
however.  Odd that I can duplicate it so readily now.  I think my
computer was previously trying to shield me from the pain of debugging
this problem.
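
For anyone who wants to repeat this kind of test, here is a minimal sketch
of a repetition harness.  It is not the script attached earlier in the
thread; the function name first_failure and the optional timeout (used to
treat a hang as a failure) are assumptions made for illustration.

```python
import subprocess
import sys


def first_failure(cmd, max_iterations, timeout=None):
    """Run cmd up to max_iterations times.

    Returns the 1-based iteration on which cmd first exited with a
    non-zero status or exceeded timeout (in seconds), or None if every
    run succeeded.
    """
    for iteration in range(1, max_iterations + 1):
        try:
            completed = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            # The command hung; count that as a failure too.
            return iteration
        if completed.returncode != 0:
            return iteration
    return None
```

For example, first_failure(["make"], 2500, timeout=60) would run make up to
2500 times in the current directory and report the first iteration that
failed or hung, if any.  As noted above, a clean run proves little on its
own; comparing against a run with the change backed out is more telling.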

There is a new snapshot up now with my fix in it.  Please try it.

cgf


