This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PR18457] Don't require rtld lock to compute DTV addr for static TLS


On Wed, Jun 03, 2015 at 01:51:32PM -0300, Alexandre Oliva wrote:
> It sure isn't good enough to enable anyone to do whatever they like in a
> module finalizer, including blocking waiting for other threads that
> interact, explicitly or implicitly, with the dynamic loader before the
> finalizer is unblocked.  Most of these issues are preexisting, and
> explicitly not addressed by the patch.
> 
> It is good enough, however, to fix the regression, namely, that
> ___tls_get_addr for an IE variable used to be able to complete even if
> the rtld lock was taken by another thread, while now it's slower than
> needed and it may deadlock.  The patch is intended to fix this
> performance and deadlock regression, not any of the preexisting
> problems.

Here's the core of my lack of understanding I guess - calling
__tls_get_addr for an IE variable.  Does that ever happen?  An IE
variable resolution does not result in a call to __tls_get_addr
AFAICT.  Maybe you mean a variable that could be relaxed as an IE but
was compiled to use GD?

> I very much doubt it would have completed before the DTV race fix.
> ___tls_get_addr would find the l_tls_offset for tst-join7mod.so to be
> undecided, and would attempt to take the lock to make it forced dynamic.
> Deadlock.

I haven't tested that, but I did test with master and it works just
fine.  This is because __tls_get_addr doesn't get called at all when
the variables are set as IE.  And for that reason, I don't see why it
shouldn't work otherwise.

> It's not like accessing TLS variables is special WRT interacting
> implicitly with the dynamic loader.  It's just the tip of the iceberg.
> Any lazily-bound function called for the first time will do so as well.
> libdl functions will also interact with the dynamic loader, obviously,
> and there are various cases in which localization libraries deeply
> hidden in library internals will dlopen modules to translate an error
> message or somesuch.  There are so many things that can go wrong if a
> module initializer or finalizer is called with the rtld lock held that
> I'm convinced the solution is not to go case after case after case
> trying to somehow make it lock-free, but rather to release the rtld lock
> before handing control over to user-supplied code (init and fini).  This
> would require, before releasing the lock, placing the modules in
> intermediate states and making a list of things without the lock and
> after taking the lock again, and then, after we're done with the
> lockless list and take the lock back, checking that the things to do are
> still appropriate.  We might even require additional corner-case states,
> such as a module that was to be finalized and released, but that got
> dlopened again by some other concurrent finalizer.  Tricky, but doable.
> 
> The alternative, IMHO, is not to go after each case that takes the rtld
> lock and somehow avoid that, but rather to document what can and cannot
> be done in module finalizers.  We might want to rule out synchronization
> with other threads, or just document that they are called while holding
> a lock that various infrastructure bits have to take at unanticipated
> times, so any such synchronization is prone to deadlocks.

Synchronization with other threads in a module finalizer seems like a
common use case.  For example, a module waiting for worker threads to
exit before being deleted.  The reproducer in fact is such a case.
The fact that the module doesn't have its own TLS is just sheer luck.

> It's not clear to me that we want to make libc.so variables IE.  If we
> do, there are a number of others that could benefit from this treatment.

These are static variables that are only referenced inside libc.so.
Also, libc.so will always be loaded before the executable is running.
I don't see why they shouldn't be IE.

> >    That has to go and we need
> >    to figure out another way to wait for another dlopen to complete.
> 
> I don't think so, but I won't try to stop you ;-)

:)

Siddhesh

Attachment: pgpRq3cZ0_cev.pgp
Description: PGP signature


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]