This is the mail archive of the
libc-ports@sources.redhat.com
mailing list for the libc-ports project.
Re: [RFC][BZ #1874] Fix assertion triggered by thread/fork interaction
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Mike Frysinger <vapier at gentoo dot org>
- Cc: libc-alpha at sourceware dot org, libc-ports at sourceware dot org
- Date: Sat, 11 Jan 2014 13:07:37 +0100
- Subject: Re: [RFC][BZ #1874] Fix assertion triggered by thread/fork interaction
- References: <20131009200534 dot GA4300 at domone dot podge> <201401021718 dot 23793 dot vapier at gentoo dot org> <20140102235440 dot GB5026 at domone> <201401022107 dot 03048 dot vapier at gentoo dot org>
On Thu, Jan 02, 2014 at 09:07:02PM -0500, Mike Frysinger wrote:
> On Thursday 02 January 2014 18:54:40 Ondřej Bílka wrote:
> > On Thu, Jan 02, 2014 at 05:18:22PM -0500, Mike Frysinger wrote:
> > > On Wednesday 09 October 2013 16:05:34 Ondřej Bílka wrote:
> > > > Details:
> > > >
> > > > If a thread happens to hold dl_load_lock and have r_state set to RT_ADD
> > > > or RT_DELETE at the time another thread calls fork(), then the child
> > > > exit code from fork (in nptl/sysdeps/unix/sysv/linux/fork.c in our
> > > > case) re-initializes dl_load_lock but does not restore r_state to
> > > > RT_CONSISTENT. If the child subsequently requires ld.so functionality
> > > > before calling exec(), then the assertion will fire.
> > > >
> > > > The patch acquires dl_load_lock on entry to fork() and releases it on
> > > > exit from the parent path. The child path is initialized as currently
> > > > done. This is essentially pthread_atfork, but forced to be first
> > > > because the acquisition of dl_load_lock must happen before
> > > > malloc_atfork is active to avoid a deadlock.
> > > > "
> > >
> > > doesn't seem right that we grab the lock and then just reset it in the
> > > child ? seems like you should just unlock it rather than reset it in the
> > > child.
> >
> > That part looks ok as without locking you could get inconsistent linker
> > structures.
>
> that's not what i meant. rather than make the call to reset in the child as
> the code in the tree does now, if you grab the lock early, why not use the
> normal unlock call in the child ? struct state would be fine.
>
possible.
> > > i'm also wary of code that already grabs a lot of locks trying to grab
> > > even more. the code paths that already grab the IO locks ... can they
> > > possibly grab this one too ? like a custom format handler that triggers
> > > loading of libs ?
> >
> > Do you have a better solution? I sent this to decide what to do with
> > this bug. I would not be surprised if we decided that it is invalid as
> > threads with fork cause problems in general.
>
> i think we want to support it without it being completely terrible.
>
> maybe the answer here is to reset all the dl state like we do with the lock ?
> is there some init func we could call ? i'm really not familiar with glibc
> ldso internals.
How would you handle dlclose() on a handle obtained before fork()? I still do not see
a clean solution.
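
For reference, the locking scheme discussed above can be sketched outside
of glibc. This is only an illustration under stated assumptions, not the
proposed patch: loader_lock and loader_state are hypothetical stand-ins for
dl_load_lock and r_state, and the handlers are registered through the public
pthread_atfork() rather than being forced to run first inside fork(). The
child handler here uses a plain unlock, as suggested earlier in the thread;
the patch under discussion re-initializes the lock in the child instead.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for the loader's r_state and dl_load_lock. */
enum loader_state_t { CONSISTENT = 0, ADD = 1, DELETE = 2 };
static pthread_mutex_t loader_lock = PTHREAD_MUTEX_INITIALIZER;
static enum loader_state_t loader_state = CONSISTENT;

static sem_t worker_busy;
static pthread_once_t atfork_once = PTHREAD_ONCE_INIT;

/* fork() handlers: take the lock before forking, drop it on both sides. */
static void before_fork(void)     { pthread_mutex_lock(&loader_lock); }
static void in_parent(void)       { pthread_mutex_unlock(&loader_lock); }
static void in_child(void)        { pthread_mutex_unlock(&loader_lock); }
static void register_atfork(void) { pthread_atfork(before_fork, in_parent, in_child); }

/* Simulates a thread that is in the middle of modifying loader state. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&loader_lock);
    loader_state = ADD;
    sem_post(&worker_busy);          /* signal: update now in progress */
    usleep(100000);                  /* keep the state inconsistent for a while */
    loader_state = CONSISTENT;
    pthread_mutex_unlock(&loader_lock);
    return NULL;
}

/* Forks while the worker is mid-update; returns the state the child saw,
   or -1 on failure. */
int child_state_after_fork(void)
{
    pthread_t t;
    int status = 0;
    int result = -1;

    sem_init(&worker_busy, 0, 0);
    pthread_once(&atfork_once, register_atfork);

    int rc = pthread_create(&t, NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;

    sem_wait(&worker_busy);          /* worker holds the lock, state is ADD */

    pid_t pid = fork();              /* before_fork() blocks until the worker unlocks */
    if (pid == 0)
        _exit((int)loader_state);    /* child reports the state it inherited */

    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    pthread_join(t, NULL);
    sem_destroy(&worker_busy);
    return result;
}
```

Calling child_state_after_fork() from main() should return CONSISTENT,
because fork() cannot proceed until the worker has finished its update.
Without the handlers the child could inherit the ADD state, which is the
situation that trips the assertion. The sketch does not address the lock
ordering concern raised above, nor handles obtained before fork().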