malloc crash
Takashi Yano
takashi.yano@nifty.ne.jp
Tue Oct 26 08:52:29 GMT 2021
On Tue, 26 Oct 2021 01:30:13 -0700
Mark Geisert wrote:
> Replying to myself to correct something I wrote...
>
> Mark Geisert wrote:
> > Takashi Yano wrote:
> >> On Mon, 25 Oct 2021 16:36:50 -0700
> >> Mark Geisert wrote:
> >>> Ken Brown wrote:
> >>>> On 10/25/2021 5:29 PM, Mark Geisert wrote:
> >>>>> Corinna Vinschen wrote:
> >>>>>> Er... huh? So both threads are in a malloc function? This shouldn't
> >>>>>> have happened, given the clunky muto guarding malloc calls. This is
> >>>>>> really strange. Why's the muto not working here?
> >>>>>
> >>>>> Is it possible both threads have executed malloc_init()?
> >>>>> If so, the second one would reinit the muto.
> >>>>
> >>>> Or does the fifo_reader thread call a malloc function before the main thread has
> >>>> called malloc_init()? This would presumably cause __malloc_lock() to fail, but
> >>>> there's no error check.
> >>>
> >>> If there's a global constructor involved, that is known to happen. Constructors
> >>> are run from dll_crt0_0(), before malloc_init() is called from dll_crt0_1(). See
> >>> dcrt0.cc for the details.
> >>
> >> So how about moving the malloc_init() call from dll_crt0_1() to dll_crt0_0()
> >> so that malloc() can be called in fixup_after_fork/exec()?
> >
> > It appears simple, but this is a touchy area of code. The _0 and _1 are two
> > separate phases of process startup. I'd want to hear Corinna's thoughts on this.
> >
> > I'd also like to verify somehow that this is the scenario Ken is hitting.
> >
> > When I was researching different mallocs for Cygwin I hit the constructor snag
> > repeatedly. I did try delaying the constructor-running until after malloc_init().
> > More problems. I did not try moving malloc_init() to before the constructor run.
>
> Apologies; this was many months ago. What I did try was moving the malloc_init()
> to before running the constructor chain, as Takashi suggested. That is what gave
> me more problems. I don't recall what they were, but I reverted that attempt.
>
> The "future malloc" build of Cygwin I'm running doesn't exhibit Ken's issue, as
> far as I can tell. It has a specific fix to avoid the scenario I've been talking
> about here, but I don't want to take us down that path unless we're sure Ken's
> hitting that same scenario.

I tried the following patch and confirmed that the issue has
disappeared. I have not noticed any other problems so far
with this patch.

diff --git a/winsup/cygwin/dcrt0.cc b/winsup/cygwin/dcrt0.cc
index 6f4723bb0..0d541ec14 100644
--- a/winsup/cygwin/dcrt0.cc
+++ b/winsup/cygwin/dcrt0.cc
@@ -773,6 +773,10 @@ dll_crt0_0 ()
   do_global_ctors (&__CTOR_LIST__, 1);
   cygthread::init ();
 
+  /* malloc_init() has been moved from dll_crt0_1() to here so that
+     malloc() can be called in fixup_after_exec(). */
+  malloc_init ();
+
   if (!child_proc_info)
     {
       setup_cygheap ();
@@ -857,7 +861,7 @@ dll_crt0_1 (void *)
      on a functioning malloc and it's possible that the user's program may
      have overridden malloc. We only know about that at this stage,
      unfortunately. */
-  malloc_init ();
+  /* malloc_init() has been moved to dll_crt0_0(). */
   user_shared->initialize ();
 
 #ifdef CYGHEAP_DEBUG
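
For illustration only, the ordering change in the patch can be pictured with a
small standalone toy model. This is a sketch, not Cygwin code: ToyAllocator,
Step, and the helper functions are hypothetical names, and it assumes (as the
thread suspects but has not verified) that some code calls malloc() after the
global constructors run in dll_crt0_0() but before the original malloc_init()
call in dll_crt0_1().

```cpp
#include <cassert>
#include <vector>

// Toy model only: none of these names are Cygwin APIs.
struct ToyAllocator {
  bool lock_ready = false;  // stands in for the lock that guards malloc
  int unguarded_calls = 0;  // allocations made before the lock existed

  void init() { lock_ready = true; }  // plays the role of malloc_init()
  void allocate() {                   // plays the role of malloc()
    if (!lock_ready) ++unguarded_calls;
  }
};

// Startup steps, named after the roles discussed in the thread.
enum class Step { GlobalCtors, MallocInit, EarlyAllocatingCode, UserSharedInit };

// Run the steps in order and report how many allocations ran unguarded.
// Assumption: only EarlyAllocatingCode allocates in this model.
int unguarded_allocations(const std::vector<Step>& order) {
  ToyAllocator a;
  for (Step s : order) {
    switch (s) {
      case Step::MallocInit: a.init(); break;
      case Step::EarlyAllocatingCode: a.allocate(); break;
      case Step::GlobalCtors:
      case Step::UserSharedInit: break;  // no allocation in this model
    }
  }
  return a.unguarded_calls;
}

// Assumed order before the patch: the init happens late, in phase _1.
std::vector<Step> order_before_patch() {
  return {Step::GlobalCtors, Step::EarlyAllocatingCode,
          Step::MallocInit, Step::UserSharedInit};
}

// Assumed order with the patch: the init happens right after the ctors.
std::vector<Step> order_after_patch() {
  return {Step::GlobalCtors, Step::MallocInit,
          Step::EarlyAllocatingCode, Step::UserSharedInit};
}

// Usage: the early allocation is unguarded only in the pre-patch order.
void check_model() {
  assert(unguarded_allocations(order_before_patch()) == 1);
  assert(unguarded_allocations(order_after_patch()) == 0);
}
```

Note that in this model, as in the patch, the init step still comes after the
global constructors; it only moves ahead of the later allocating code.
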
Where is the "constructor chain" you mentioned?
--
Takashi Yano <takashi.yano@nifty.ne.jp>