(call-process ...) hangs in emacs

Ken Brown kbrown@cornell.edu
Thu Aug 7 11:52:00 GMT 2014


Hi Corinna,

On 8/5/2014 2:40 PM, Corinna Vinschen wrote:
> Hi Ken,
>
> On Aug  5 13:55, Ken Brown wrote:
>> On 8/5/2014 9:58 AM, Corinna Vinschen wrote:
>>> On Aug  5 08:21, Ken Brown wrote:
>>>> === modified file 'src/gmalloc.c'
>>>> --- src/gmalloc.c       2014-03-04 19:02:49 +0000
>>>> +++ src/gmalloc.c       2014-08-05 01:35:38 +0000
>>>> @@ -490,8 +490,8 @@
>>>>   }
>>>>
>>>>   #ifdef USE_PTHREAD
>>>> -pthread_mutex_t _malloc_mutex = PTHREAD_MUTEX_INITIALIZER;
>>>> -pthread_mutex_t _aligned_blocks_mutex = PTHREAD_MUTEX_INITIALIZER;
>>>> +pthread_mutex_t _malloc_mutex;
>>>> +pthread_mutex_t _aligned_blocks_mutex;
>>>>   int _malloc_thread_enabled_p;
>>>>
>>>>   static void
>>>> @@ -526,8 +526,11 @@
>>>>        initialized mutexes when they are used first.  To avoid such a
>>>>        situation, we initialize mutexes here while their use is
>>>>        disabled in malloc etc.  */
>>>> -  pthread_mutex_init (&_malloc_mutex, NULL);
>>>> -  pthread_mutex_init (&_aligned_blocks_mutex, NULL);
>>>> +  pthread_mutexattr_t attr1, attr2;
>>>> +  pthread_mutexattr_settype (&attr1, PTHREAD_MUTEX_NORMAL);
>>>> +  pthread_mutexattr_settype (&attr2, PTHREAD_MUTEX_NORMAL);
>>>> +  pthread_mutex_init (&_malloc_mutex, &attr1);
>>>> +  pthread_mutex_init (&_aligned_blocks_mutex, &attr2);
>>>>     pthread_atfork (malloc_atfork_handler_prepare,
>>>>                    malloc_atfork_handler_parent,
>>>>                    malloc_atfork_handler_child);
>>>>
>>>>
>>>> The first hunk avoids the double initialization, but I don't understand why
>>>> the second hunk does anything.  Since PTHREAD_MUTEX_NORMAL is now the
>>>> default, shouldn't calling pthread_mutex_init with NULL second argument be
>>>> equivalent to my calls to pthread_mutexattr_settype?  Does this indicate a
>>>> Cygwin bug, or am I misunderstanding something?
>>>
>>> AFAICS you're missing something.  Your pthread_mutexattr_t attr1, attr2
>>> are not initialized.  They contain some random values, thus they are not
>>> good objects.  The calls to pthread_mutexattr_settype as well as the
>>> calls to pthread_mutex_init will fail with EINVAL, but you won't see it
>>> due to missing error handling, and you end up without mutexes at all.
>>> If you call pthread_mutexattr_init before calling
>>> pthread_mutexattr_settype the situation shoul;d be the same as before.
>>
>> Thanks for catching my mistake.  Your earlier suggestion about explicitly
>> setting the pthread_mutexes to be ERRORCHECK mutexes seems to fix the
>> problem (as long as I remember to call pthread_mutexattr_init).  The revised
>> patch is attached.  I went back to using both the static and dynamic
>> initializations as in the original code, since you said that's harmless.
>
> I'm glad to read that, but I'm still a little bit concerned.  If your
> code works with ERRORCHECK mutexes but hangs with NORMAL mutexes, you
> *might* miss an error case.
>
> I'd suggest to tweak the pthread_mutex_lock/unlock calls and log the
> threads calling it.  It looks like the same thread calls malloc from
> malloc for some reason and it might be interesting to learn how that
> happens and if it's really ok in this scenario, because it seems to
> be unexpected by the code.

I think I found the problem with NORMAL mutexes.  emacs calls 
pthread_atfork after initializing the mutexes, and the resulting 
'prepare' handler locks the mutexes.  (The parent and child handlers 
unlock them.)  So when emacs calls fork, the mutexes are locked, and 
shortly thereafter the Cygwin DLL calls calloc, leading to a deadlock. 
Here's a gdb backtrace showing the sequence of calls:

#0  malloc (size=size@entry=40) at gmalloc.c:919
#1  0x0053fc28 in calloc (nmemb=1, size=40) at gmalloc.c:1510
#2  0x61082074 in calloc (nmemb=1, size=40)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/malloc_wrapper.cc:100
#3  0x61003177 in operator new (s=s@entry=40)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/cxx.cc:23
#4  0x610fc9d3 in pthread_mutex::init (mutex=0x61187d34 <reent_data+852>,
     attr=0x0, initializer=0x12)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/thread.cc:3118
#5  0x610fcc13 in pthread_mutex_lock (mutex=0x61187d34 <reent_data+852>)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/thread.cc:3170
#6  0x611319d8 in __fp_lock (ptr=0x61187cd0 <reent_data+752>)
     at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/findfp.c:287
#7  0x61154f75 in _fwalk (ptr=0x28d544,
     function=function@entry=0x611319c0 <__fp_lock>)
     at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/fwalk.c:50
#8  0x61131dea in __fp_lock_all ()
     at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/findfp.c:307
#9  0x610fa45e in pthread::atforkprepare ()
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/thread.cc:2031
#10 0x61076292 in lock_pthread (this=<synthetic pointer>)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/sigproc.h:137
#11 hold_everything (x=<synthetic pointer>, this=<synthetic pointer>)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/sigproc.h:169
#12 fork () at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/fork.cc:582

Is there a better way to deal with this issue than using ERRORCHECK mutexes?

Ken

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple



More information about the Cygwin mailing list