This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: TLS redux

From: "Carlos O'Donell" <carlos at redhat dot com>
To: Roland McGrath <roland at hack dot frob dot com>, "GNU C. Library" <libc-alpha at sourceware dot org>
Date: Wed, 15 Jan 2014 12:45:04 -0500
Subject: Re: TLS redux
Authentication-results: sourceware.org; auth=none
References: <20140115022335 dot EB13174430 at topped-with-meat dot com>
On 01/14/2014 09:23 PM, Roland McGrath wrote:
> I've finally caught up on the long threads about TLS issues.
> (The good news is that this was a sizable fraction of all of my
> libc-related backlog, so I'm much less behind than I was before!)
> 
> Other people have discussed many of the issues that I would have
> raised if I'd participated all along, but not all of them.  I won't
> summarize the whole discussion, but just mention the things I think
> it's important not to overlook.  I don't really have anything to say
> about most of the implementation details.  Only the last point or two
> are issues about the changes being considered for 2.19.
> 
> * Lazy allocation is an explicit feature of the TLS ABI, not an
>   incidental detail.  The wisdom of the feature can be debated, but
>   the compatibility requirements are clear.
> 
>   It's a regression if this scenario stops working:
>   1. Start a thousand threads
>   2. dlopen a module containing __thread char buf[100 << 20];
>   3. Start another thousand threads
>   4. Call into the module on one thread so it uses its buf.
>   5. Start a third thousand threads
>   Now you should have 3000 threads but not 3000*100M memory use.
>   (Here I mean address space reservation, regardless of consumption
>   of machine resources, VM overcommit, etc.)
> 
>   At least in the case of an existing binary dlopen caller (which
>   could actually be either in an executable or in a DSO) and an
>   existing binary module loaded by that dlopen, such a regression is
>   an ABI break and cannot be tolerated.

I'm with Joseph on this one. We have to look for a way to detect this
kind of failure at dlopen time rather than when we first try to access
a TLS variable. There just has to be some kind of static analysis we
can do to determine that this will eventually fail, e.g. some kind of
internally tracked reservation system.

We still have GNU2 TLS to look at so this problem won't be going
away any time soon.
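
As a rough illustration of what an "internally tracked reservation
system" could mean, here is a minimal sketch in C. Everything in it
(struct tls_budget, tls_budget_reserve, the idea of a fixed limit) is
hypothetical and is not an existing glibc interface. It only shows the
arithmetic: a dlopen-time check multiplies the module's TLS size by the
number of threads and refuses the load if that cannot be promised.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping; not a glibc interface. */
struct tls_budget {
    uint64_t limit;     /* address space dynamic TLS may reserve in total */
    uint64_t reserved;  /* bytes already promised to loaded modules */
};

/* Try to reserve module_size bytes for each of nthreads threads.
   Returns 0 on success, -1 if the reservation cannot be honoured.
   On failure the budget is left unchanged. */
int tls_budget_reserve(struct tls_budget *b, uint64_t nthreads,
                       uint64_t module_size)
{
    assert(b->reserved <= b->limit);

    /* Reject requests whose total size would overflow 64 bits. */
    if (module_size != 0 && nthreads > UINT64_MAX / module_size)
        return -1;

    uint64_t need = nthreads * module_size;
    if (need > b->limit - b->reserved)
        return -1;

    b->reserved += need;
    return 0;
}
```

With the numbers from the quoted scenario, 3000 threads times a
100 << 20 byte buffer is 314572800000 bytes, so a check like this
would reject the load under a 64 GiB limit while a lazy scheme would
only ever pay for the one thread that touches buf.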

> * It's inherently impossible to both allocate lazily and have dynamic
>   TLS access that cannot fail.  Either you preallocate the memory
>   (eager use of address space, if not necessarily actual storage) or
>   attempting to allocate it later might fail.  Hence it must be an
>   explicit choice between the two.  That choice might be at the
>   granularity of the whole implementation, as in musl, or all the way
>   down to the granularity of an individual TLS-containing module or
>   individual module-loading call.  Since glibc has a compatibility
>   requirement to support lazy allocation, the only possibilities for
>   the contrary choice are at smaller granularities.

Agreed.
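
To make the trade-off concrete, here is a toy model in C. It is a
sketch only: the names (toy_module, toy_load_eager, toy_load_lazy,
toy_access) and the fixed-size pool are invented for illustration, and
"threads" are just array indices. The point is where the failure shows
up: the eager loader fails in the load call, the lazy loader always
succeeds and the failure moves to the first access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NTHREADS 4

/* Toy allocator with a fixed budget so that failure can be provoked. */
static size_t pool_left;

void toy_set_pool(size_t bytes) { pool_left = bytes; }

static void *pool_alloc(size_t n)
{
    if (n > pool_left)
        return NULL;
    void *p = calloc(1, n);
    if (p != NULL)
        pool_left -= n;
    return p;
}

static void pool_free(void *p, size_t n)
{
    free(p);
    pool_left += n;
}

struct toy_module {
    size_t tls_size;
    void *block[NTHREADS];  /* per-thread TLS block, NULL until allocated */
};

/* Lazy: loading never allocates, so it never fails. */
int toy_load_lazy(struct toy_module *m, size_t tls_size)
{
    m->tls_size = tls_size;
    for (int t = 0; t < NTHREADS; t++)
        m->block[t] = NULL;
    return 0;
}

/* Eager: allocate every thread's block now; undo and fail if any is missing. */
int toy_load_eager(struct toy_module *m, size_t tls_size)
{
    toy_load_lazy(m, tls_size);
    for (int t = 0; t < NTHREADS; t++) {
        m->block[t] = pool_alloc(tls_size);
        if (m->block[t] == NULL) {
            while (t-- > 0) {
                pool_free(m->block[t], tls_size);
                m->block[t] = NULL;
            }
            return -1;
        }
    }
    return 0;
}

/* Access path: allocates on first use if needed, so it can return NULL
   for a lazily loaded module. */
void *toy_access(struct toy_module *m, int thread)
{
    assert(thread >= 0 && thread < NTHREADS);
    if (m->block[thread] == NULL)
        m->block[thread] = pool_alloc(m->tls_size);
    return m->block[thread];
}
```

With a 300-byte pool and a 100-byte TLS block, toy_load_eager fails
immediately, while toy_load_lazy succeeds and the fourth thread's
toy_access is the call that comes back empty.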

> * Eager allocation could be a new option, and could even be a new
>   default.  (What the default should be is a separate debate that does
>   not need to begin now.)
> ** e.g. A new DF_1_* flag and -z option for a DSO to request it.
> *** Could be made default for newly-built DSOs.
> ** New dlopen flag bits to request it.
> *** Could be made default for newly-built dlopen callers (i.e. new
>     symbol version of dlopen).

Sounds like a good idea.

> * In implementing eager allocation when multiple threads already
>   exist, it is theoretically possible to do all or almost all of it
>   asynchronously (i.e. all work done inside the dlopen call on the
>   thread that called it).  It's trickiest, or perhaps impossible, to
>   do the final step if the DTV needs to be expanded, from another
>   thread.  But there is not really any good reason to do a lot from
>   other threads.  Rich Felker described the most sensible
>   implementation strategy: do all the allocation in dlopen, but only
>   actually install those new pointers synchronously in each thread,
>   inside __tls_get_addr.

I agree.
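
A single-threaded toy model of that strategy, as I understand it, might
look like the following. This is a sketch under loud assumptions: the
types and functions (toy_loader, toy_dlopen, toy_tls_get_addr) are made
up, the DTV is a fixed-size array so the expansion problem is sidestepped
entirely, and there is no locking or memory ordering, which a real
implementation would need. It only shows the split of work: all
allocation happens in the load call, and each thread's own access does
nothing but install a pointer that already exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_THREADS 8
#define MAX_MODULES 4

struct toy_thread {
    void *dtv[MAX_MODULES];  /* pointers this thread has installed itself */
};

struct toy_loader {
    int nthreads;
    struct toy_thread thread[MAX_THREADS];
    void *prealloc[MAX_MODULES][MAX_THREADS];  /* filled by toy_dlopen */
};

/* Allocate a TLS block for every existing thread.  All allocation, and
   therefore every possible failure, happens here on the calling thread. */
int toy_dlopen(struct toy_loader *ld, int module, size_t tls_size)
{
    void *tmp[MAX_THREADS];

    assert(module >= 0 && module < MAX_MODULES);
    assert(ld->nthreads >= 0 && ld->nthreads <= MAX_THREADS);

    for (int t = 0; t < ld->nthreads; t++) {
        tmp[t] = calloc(1, tls_size);
        if (tmp[t] == NULL) {
            while (t-- > 0)
                free(tmp[t]);
            return -1;
        }
    }
    for (int t = 0; t < ld->nthreads; t++)
        ld->prealloc[module][t] = tmp[t];
    return 0;
}

/* Access path: the owning thread installs its preallocated pointer into
   its own DTV slot.  No allocation happens here. */
void *toy_tls_get_addr(struct toy_loader *ld, int tid, int module)
{
    assert(tid >= 0 && tid < ld->nthreads);
    assert(module >= 0 && module < MAX_MODULES);

    struct toy_thread *self = &ld->thread[tid];
    if (self->dtv[module] == NULL)
        self->dtv[module] = ld->prealloc[module][tid];
    return self->dtv[module];
}
```

After toy_dlopen returns, no thread's dtv has been touched by anyone
else; a thread's slot is filled in only when that thread calls
toy_tls_get_addr itself.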

> * The main request for async-signal-safe TLS use is satisfied by "fail
>   safe" semantics that preserve lazy allocation semantics: if the
>   memory is really not available, then you crash gracefully in
>   __tls_get_addr.  (That is, as much grace as abort, as opposed to the
>   full range of "undefined behavior" or anything like deadlock.)

Right, I think "fail safe" is always in all of our minds. We don't
want any of this to degenerate into undefined behaviour (like our
current thread cancellation implementation).

> * How to find all memory containing direct application data is a de
>   facto part of our ABI.  By "direct" I mean objects that the
>   application touches itself.  That includes __thread variables just
>   as it includes global, static, and auto variables.  It excludes
>   library-maintained caches and the like, but includes any user data
>   that the public API implies the library holds onto, such as pointers
>   stored by <search.h> functions.
> 
>   This is a distinct issue from the general subject of "using an
>   alternate allocation mechanism for memory" that Carlos mentioned.
>   If libc changes how and where it stores its own internal data, that
>   does not impinge on anything that is a de facto part of the ABI.  If
>   libc changes how and where it stores application TLS data or other
>   things in the aforementioned category, that is another thing entirely.
> 
>   I mentioned ASan as just one example of the kinds of things that
>   might care about these aspects of the de facto ABI.  Things like
>   ASan and conservative GC implementations are the obvious examples.
>   But the fundamentals of conservatism dictate that we not make a
>   priori assumptions about what our users are doing and what matters
>   to them.  As with all somewhat fuzzy aspects of the ABI, there will
>   be a pragmatic balancing test between "I was using that, you can't
>   break it!" and, "You were broken to have been relying on that."  But
>   we must consider it explicitly, discuss it pragmatically, and be
>   circumspect about changes, especially the subtle ones.  The change
>   at issue here is especially subtle in that it could be a silent time
>   bomb that does not affect anybody in practice (or that nobody
>   realizes explains strange new flakiness they experience) for
>   multiple release cycles.  For example, if before the change a
>   __thread variable (in a dynamic TLS module) sometimes was the only
>   root holding a GC'able pointer and the GC noticed it there, but
>   after the change the GC doesn't see that root.  If this bug is
>   introduced tomorrow, it could be a long time before the confluence
>   of when collections happen, whether other objects hold (or appear to
>   hold) the same pointer, and the effects of reclamation, add up to
>   make someone experience a failure they notice.
> 
>   How to find threads' stacks and static TLS areas is already
>   underspecified (improving that situation is a subject for another
>   discussion).  But even for that, we would be quite circumspect about
>   making a change that could break methods existing programs are using
>   to acquire that information.
> 
>   Today, dynamic TLS areas are allocated using the public malloc
>   interface.  Programs or GC libraries or whatnot can today supply a
>   malloc replacement, scan the static data+bss area, scan each
>   thread's stack and static TLS area, and reliably discover every byte
>   of user data in any TLS area for any thread.  We never documented an
>   explicit guarantee that this works, but it does and to the extent
>   that anything extant relies on it (whether or not its maintainers
>   realize they do!), it is part of the de facto ABI.
> 
>   I don't have a firm conclusion about what guarantees of this sort
>   we should or should not be offering or preserving.  But this change
>   affects that part of the de facto ABI and as far as I noticed nobody
>   has discussed it at all.  That fails the conservatism test instantly.

I'll be honest: I had not considered this a part of the ABI at all,
which is why it wasn't part of the discussion.
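
To illustrate the GC case described above, here is a small sketch in C
of why the allocation path matters to a conservative scanner. It is not
how any particular collector works: tracked_malloc stands in for a
user-supplied malloc replacement, scan_finds stands in for the root
scan, and both names are invented here. A pointer stored in a block
that came through the tracked allocator is found; the same pointer
stored in a block obtained some other way is invisible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_BLOCKS 64

struct block {
    void *base;
    size_t size;
};

static struct block blocks[MAX_BLOCKS];
static size_t nblocks;

/* Stand-in for a replacement malloc: allocate zeroed memory and
   remember where it is so that it can be scanned later. */
void *tracked_malloc(size_t size)
{
    if (nblocks == MAX_BLOCKS)
        return NULL;
    void *p = calloc(1, size);
    if (p != NULL) {
        blocks[nblocks].base = p;
        blocks[nblocks].size = size;
        nblocks++;
    }
    return p;
}

/* Conservative scan: return 1 if any pointer-sized, pointer-aligned
   word inside a tracked block equals target, else 0. */
int scan_finds(const void *target)
{
    assert(target != NULL);  /* zeroed words would match NULL trivially */

    for (size_t i = 0; i < nblocks; i++) {
        const char *base = blocks[i].base;
        for (size_t off = 0; off + sizeof(void *) <= blocks[i].size;
             off += sizeof(void *)) {
            void *word;
            memcpy(&word, base + off, sizeof word);
            if (word == target)
                return 1;
        }
    }
    return 0;
}
```

If a dynamic TLS block is the only place holding a pointer and that
block stops coming from the allocator the scanner knows about,
scan_finds goes from 1 to 0 without anything else in the program
changing, which is the silent failure mode being described.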

It will take *years*, I think, for all of us new core maintainers to
come to understand each other's definitions of "conservatism" and
to align what we mean by that towards a common goal.

I appreciate you writing this up; otherwise I would have no idea what
you're thinking about when you say "conservatism" (a single word
that describes a wide range of ideas you have come to learn
over your experience developing glibc).

> I have no great quarrel with the thoroughness or conservatism of the
> vetting of the implementation details or first-order ABI issues of
> what's gone in.  (I am not entirely sanguine about all that, but close
> enough that I've decided not to participate in the detailed review.)
> But the mere fact that in a few months and >100 messages of discussion,
> I'm the first to raise these subtleties (that I really thought would
> have been fairly obvious to people here) gives me great pause about
> the whole endeavor.

I didn't see it like that. The >100 messages were more about
making sure the QoI was high, and that everyone understood the
limitations of the implementation. There were also discussions about
doing immediate eager allocations, but I didn't consider any of
those serious suggestions given the present implementation.

> Similarly, Carlos expressed an attitude that I'll summarize as, "So we
> break ASan for a release or three and fix it later, no big deal."
> That is fundamentally anti-conservative IMHO.  Indeed, ASan is not
> part of glibc.  If it were, we'd be able to achieve complete
> confidence about all its issues very quickly.  ASan is an example of
> the wide variety of things users are doing with glibc, that we have an
> obligation never to break silently or inadvertently.

Thanks for that feedback. I'll continue to calibrate my definition
of "conservative."

> Dynamic TLS access not being async-signal-safe has been the status quo
> since the inception of the TLS features.  Leaving that as it is for
> another release is just obviously acceptably conservative.
> Contrarily, breaking other kinds of subtle interaction with TLS
> features that have worked in practice heretofore is not conservative
> at all.

I disagree with you here and agree with Rich. The fact that
dynamic TLS access was not AS-safe was a timebomb waiting to hit
any of our users. It still remains a problem for GNU2 TLS, but that
isn't the default on most targets yet. Google hitting this problem
wasn't surprising given the work they do.

> As I said, I'm not specifying any conclusions.  I'm fairly confident
> we can find a middle road that is appropriately conservative while
> offering improvement for the pain point.  But we have yet to even
> begin discussing what IMHO should be considered a major obstacle to
> making this change while keeping with our conservative principles.

I agree.

Cheers,
Carlos.
References:
- TLS redux
  - From: Roland McGrath