malloc crash

Takashi Yano takashi.yano@nifty.ne.jp
Tue Oct 26 08:52:29 GMT 2021


On Tue, 26 Oct 2021 01:30:13 -0700
Mark Geisert wrote:
> Replying to myself to correct something I wrote...
> 
> Mark Geisert wrote:
> > Takashi Yano wrote:
> >> On Mon, 25 Oct 2021 16:36:50 -0700
> >> Mark Geisert wrote:
> >>> Ken Brown wrote:
> >>>> On 10/25/2021 5:29 PM, Mark Geisert wrote:
> >>>>> Corinna Vinschen wrote:
> >>>>>> Er... huh?  So both threads are in a malloc function?  This shouldn't
> >>>>>> have happened, given the clunky muto guarding malloc calls.  This is
> >>>>>> really strange.  Why's the muto not working here?
> >>>>>
> >>>>> Is it possible both threads have executed malloc_init()?
> >>>>> If so, the second one would reinit the muto.
> >>>>
> >>>> Or does the fifo_reader thread call a malloc function before the main thread has
> >>>> called malloc_init()?  This would presumably cause __malloc_lock() to fail, but
> >>>> there's no error check.
> >>>
> >>> If there's a global constructor involved, that is known to happen.  Constructors
> >>> are run from dll_crt0_0(), before malloc_init() is called from dll_crt0_1().  See
> >>> dcrt0.cc for the details.
> >>
> >> So how about moving the malloc_init() call from dll_crt0_1() to dll_crt0_0()
> >> so that malloc() can be called in fixup_after_fork/exec()?
> > 
> > It appears simple, but this is a touchy area of code.  The _0 and _1 are two 
> > separate phases of process startup.  I'd want to hear Corinna's thoughts on this.
> > 
> > I'd also like to verify somehow that this is the scenario Ken is hitting.
> > 
> > When I was researching different mallocs for Cygwin I hit the constructor snag 
> > repeatedly.  I did try delaying the constructor-running until after malloc_init().
> > More problems.  I did not try moving malloc_init() to before the constructor run.
> 
> Apologies; this was many months ago.  What I did try was moving the malloc_init() 
> to before running the constructor chain, as Takashi suggested.  That is what gave 
> me more problems.  I don't recall what they were, but I reverted that attempt.
> 
> The "future malloc" build of Cygwin I'm running doesn't exhibit Ken's issue, as 
> far as I can tell.  It has a specific fix to avoid the scenario I've been talking 
> about here, but I don't want to take us down that path unless we're sure Ken's 
> hitting that same scenario.

I tried the following patch and confirmed that the issue has
disappeared. I have not noticed any other problems so far
with this patch.

diff --git a/winsup/cygwin/dcrt0.cc b/winsup/cygwin/dcrt0.cc
index 6f4723bb0..0d541ec14 100644
--- a/winsup/cygwin/dcrt0.cc
+++ b/winsup/cygwin/dcrt0.cc
@@ -773,6 +773,10 @@ dll_crt0_0 ()
   do_global_ctors (&__CTOR_LIST__, 1);
   cygthread::init ();
 
+  /* malloc_init() has been moved from dll_crt0_1() to here so that
+     malloc() can be called in fixup_after_exec(). */
+  malloc_init ();
+
   if (!child_proc_info)
     {
       setup_cygheap ();
@@ -857,7 +861,7 @@ dll_crt0_1 (void *)
      on a functioning malloc and it's possible that the user's program may
      have overridden malloc.  We only know about that at this stage,
      unfortunately. */
-  malloc_init ();
+  /* malloc_init() has been moved to dll_crt0_0(). */
   user_shared->initialize ();
 
 #ifdef CYGHEAP_DEBUG


Where is the "constructor chain" you mentioned?
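
For reference, the ordering problem discussed in this thread can be
sketched as a toy model. This is only an illustration, not Cygwin code:
toy_lock, toy_malloc_init(), toy_malloc(), early_startup_work() and
run_startup() are all hypothetical names, and the lock is a plain flag
rather than a real muto. The assumption is that some code in the first
startup phase (a constructor, or fixup_after_exec()) calls malloc()
before the lock has been initialized.

```cpp
#include <cassert>

// Hypothetical stand-in for the lock guarding malloc; not Cygwin's muto.
struct toy_lock {
  bool initialized = false;
  bool held = false;
  // Fails (returns false) when used before initialization.
  bool acquire () { if (!initialized) return false; held = true; return true; }
  void release () { assert (held); held = false; }
};

static toy_lock mallock;
static int unguarded_calls = 0;  // malloc calls that ran without the lock

// Hypothetical counterpart of malloc_init(): only sets up the lock.
void toy_malloc_init () { mallock.initialized = true; }

// Hypothetical malloc: like the code discussed above, it does not treat
// a failed lock as an error; here we merely count such calls.
void toy_malloc ()
{
  if (!mallock.acquire ())
    {
      ++unguarded_calls;
      return;
    }
  mallock.release ();
}

// Assumed early caller in the first phase (e.g. a constructor or fixup code).
void early_startup_work () { toy_malloc (); }

// Runs both phases; returns how many malloc calls were unguarded.
int run_startup (bool init_in_first_phase)
{
  mallock = toy_lock{};
  unguarded_calls = 0;

  // First phase (the role dll_crt0_0() plays in this model).
  if (init_in_first_phase)
    toy_malloc_init ();
  early_startup_work ();

  // Second phase (the role dll_crt0_1() plays in this model).
  if (!init_in_first_phase)
    toy_malloc_init ();
  toy_malloc ();

  return unguarded_calls;
}
```

In this model, run_startup(false) reports one unguarded call (the early
one), while run_startup(true) reports none. Whether this is really the
scenario Ken is hitting is still unverified.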

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

