malloc crash

Mark Geisert
Tue Oct 26 08:30:13 GMT 2021

Replying to myself to correct something I wrote...

Mark Geisert wrote:
> Takashi Yano wrote:
>> On Mon, 25 Oct 2021 16:36:50 -0700
>> Mark Geisert wrote:
>>> Ken Brown wrote:
>>>> On 10/25/2021 5:29 PM, Mark Geisert wrote:
>>>>> Corinna Vinschen wrote:
>>>>>> On Oct 25 08:35, Ken Brown wrote:
>>>>>>> On 10/25/2021 4:59 AM, Corinna Vinschen wrote:
>>>>>>>> Has the thread already been started at this point?
>>>>>>> Yes, here's the backtrace of that thread:
>>>>>>> Thread 5 (Thread 9692.0x7c4c):
>>>>>>> #0  0x00000001801934f9 in sys_alloc (m=0x18036f860 <_gm_>, nb=1040) at
>>>>>>> ../../../../temp/winsup/cygwin/
>>>>>>> #1  0x0000000180196b96 in dlmalloc (bytes=1024) at
>>>>>>> ../../../../temp/winsup/cygwin/
>>>>>>> #2  0x00000001801993e1 in dlrealloc (oldmem=0x0, bytes=1024) at
>>>>>>> ../../../../temp/winsup/cygwin/
>>>>>>> #3  0x00000001800e8eed in realloc (p=0x0, size=1024) at
>>>>>>> ../../../../temp/winsup/cygwin/
>>>>>> Er... huh?  So both threads are in a malloc function?  This shouldn't
>>>>>> have happened, given the clunky muto guarding malloc calls.  This is
>>>>>> really strange.  Why's the muto not working here?
>>>>> Is it possible both threads have executed malloc_init()?
>>>>> If so, the second one would reinit the muto.
>>>> Or does the fifo_reader thread call a malloc function before the main thread has
>>>> called malloc_init()?  This would presumably cause __malloc_lock() to fail, but
>>>> there's no error check.
>>> If there's a global constructor involved, that is known to happen.  Constructors
>>> are run from dll_crt0_0(), before malloc_init() is called from dll_crt0_1().  See
>>> for the details.
>> So how about moving the malloc_init() call from dll_crt0_1() to dll_crt0_0()
>> so that malloc() can be called in fixup_after_fork/exec()?
> It appears simple, but this is a touchy area of code.  The _0 and _1 are two 
> separate phases of process startup.  I'd want to hear Corinna's thoughts on this.
> I'd also like to verify somehow that this is the scenario Ken is hitting.
> When I was researching different mallocs for Cygwin I hit the constructor snag
> repeatedly.  I did try delaying the constructor-running until after malloc_init().
> More problems.  I did not try moving malloc_init() to before the constructor run.

Apologies; this was many months ago.  What I did try was moving the malloc_init()
call to before the constructor chain runs, as Takashi suggested.  That is what gave
me more problems.  I don't recall what they were, but I reverted that attempt.

The "future malloc" build of Cygwin I'm running doesn't exhibit Ken's issue, as 
far as I can tell.  It has a specific fix to avoid the scenario I've been talking 
about here, but I don't want to take us down that path unless we're sure Ken's 
hitting that same scenario.
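
To make the ordering hazard discussed above concrete, here is a minimal sketch in C
of one way a lock can be made safe to take before any explicit init call has run,
and safe against a second init.  It is not Cygwin's code and not necessarily the
fix in the "future malloc" build.  alloc_init(), alloc_lock_acquire(),
guarded_alloc() and the other names are hypothetical, and pthread_once() merely
stands in for whatever mechanism the real startup code would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from Cygwin. */
static pthread_mutex_t alloc_lock;
static pthread_once_t  alloc_once = PTHREAD_ONCE_INIT;
static int             alloc_init_count;  /* times the real init ran */

/* Runs at most once per process, no matter how many callers race here. */
static void alloc_init_impl(void)
{
    int rc = pthread_mutex_init(&alloc_lock, NULL);
    assert(rc == 0);  /* in this sketch a failed init is treated as fatal */
    (void)rc;
    alloc_init_count++;
}

/* Safe to call from any thread and any startup phase, any number of times.
   A second call cannot re-initialize a lock that is already in use. */
void alloc_init(void)
{
    pthread_once(&alloc_once, alloc_init_impl);
}

/* Returns 0 on success.  Early callers (e.g. a global constructor) trigger
   the init themselves instead of locking an uninitialized object. */
int alloc_lock_acquire(void)
{
    alloc_init();
    return pthread_mutex_lock(&alloc_lock);
}

int alloc_lock_release(void)
{
    return pthread_mutex_unlock(&alloc_lock);
}

/* Allocation wrapper that checks the lock result rather than assuming it. */
void *guarded_alloc(size_t n)
{
    void *p;

    if (alloc_lock_acquire() != 0)
        return NULL;
    p = malloc(n);
    alloc_lock_release();
    return p;
}

/* Exposed only so the behaviour can be checked. */
int alloc_init_runs(void)
{
    return alloc_init_count;
}
```

With this shape, calling guarded_alloc() before alloc_init() and then calling
alloc_init() again later still leaves exactly one initialization.  Whether the
same idea is workable inside the dll_crt0_0()/dll_crt0_1() startup phases is a
separate question that this sketch does not answer.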

