This is the mail archive of the libc-alpha at sourceware dot org
mailing list for the glibc project.
Re: powerpc __tls_get_addr call optimization
- From: Alan Modra <amodra at gmail dot com>
- To: Carlos O'Donell <carlos at redhat dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Fri, 20 Mar 2015 18:25:02 +1030
- Subject: Re: powerpc __tls_get_addr call optimization
- References: <20150318061145 dot GE24573 at bubble dot grove dot modra dot org> <5509B0D4 dot 2020903 at redhat dot com> <20150319025631 dot GC28603 at bubble dot grove dot modra dot org> <550B94FC dot 3070903 at redhat dot com>
On Thu, Mar 19, 2015 at 11:33:16PM -0400, Carlos O'Donell wrote:
> On 03/18/2015 10:56 PM, Alan Modra wrote:
> > On Wed, Mar 18, 2015 at 01:07:32PM -0400, Carlos O'Donell wrote:
> >> On 03/18/2015 02:11 AM, Alan Modra wrote:
> >>> Now that Alex's fixes for static TLS have gone in, I figure it's worth
> >>> revisiting an old patch of mine.
> >>> https://sourceware.org/ml/libc-alpha/2009-03/msg00053.html
> >> I'm not against this patch, but it certainly seems like you would be
> >> better served by just implementing tls descriptors?
> > I think this is one better than tls descriptors, because powerpc
> > avoids the indirect function call used by tls descriptors.
> You mean to say it is "faster" than tls descriptors, but at the same
To be honest, there isn't much difference in the optimized case where
static TLS is available. It boils down to an indirect call to a
function that loads one value vs. a direct call to a stub that loads
two values and compares one against zero. I think what I've
implemented is slightly better for PowerPC, but whether that would
carry over to other architectures is debatable.
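To make that comparison concrete, here is a rough model of the two fast paths in portable C. It is only a sketch: the struct layouts, every toy_* name, and the convention that a zero module word means "static TLS" are assumptions made for illustration, not the glibc or linker stub code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the two-word GOT entry handed to __tls_get_addr. */
struct toy_tls_index {
    uintptr_t module;   /* assumed: 0 means offset is thread-pointer relative */
    uintptr_t offset;
};

/* Toy TLS descriptor: a resolver function pointer plus its argument. */
struct toy_tlsdesc {
    ptrdiff_t (*entry)(const struct toy_tlsdesc *);
    ptrdiff_t arg;
};

char toy_thread_block[64];    /* pretend static TLS area of one thread */
char toy_dynamic_block[64];   /* pretend dynamically allocated TLS block */
int toy_slow_calls;           /* counts trips through the slow path */

/* Slow path: stands in for the real __tls_get_addr lookup. */
void *toy_slow_get_addr(const struct toy_tls_index *ti)
{
    toy_slow_calls++;
    return toy_dynamic_block + ti->offset;
}

/* Stub style: direct call, two loads, one compare against zero. */
void *toy_stub_get_addr(const struct toy_tls_index *ti, char *tp)
{
    assert(ti != NULL);
    uintptr_t module = ti->module;
    uintptr_t offset = ti->offset;
    if (module == 0)
        return tp + offset;          /* optimized static TLS case */
    return toy_slow_get_addr(ti);    /* fall back to the full lookup */
}

/* Descriptor style: the resolver for the static case loads one value. */
ptrdiff_t toy_desc_static(const struct toy_tlsdesc *d)
{
    return d->arg;
}

/* Descriptor style: reached through an indirect call. */
void *toy_desc_get_addr(const struct toy_tlsdesc *d, char *tp)
{
    return tp + d->entry(d);
}
```

Both routes give the same address when static TLS is available; the difference is only in how the call is made and how many words are loaded.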
> time "harder" to maintain because it's a custom implementation that
> anyone debugging glibc has to learn about. That's not a bad thing,
> I just want us all to acknowledge the tradeoff.
Well, yes, but the PowerPC implementation is all in dl-machine.h, and
looks very similar to x86_64 in use of CHECK_STATIC_TLS,
TRY_STATIC_TLS and modification of the tls_index entry. PowerPC
doesn't have the complication and potential failure of allocating
extended descriptors. We also don't need to pass extra flags to gcc
to enable the optimization.
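For anyone unfamiliar with the pattern, the relocation-time decision can be pictured roughly as below. This is a toy model, not the dl-machine.h code: all sketch_* names are hypothetical, the "try static TLS" step is reduced to a simple space check, and the offset biasing that real targets apply is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy two-word entry that the relocation fills in. */
struct sketch_tls_index {
    size_t module;   /* assumed: 0 marks the static TLS fast path */
    size_t offset;
};

/* Toy view of a loaded module's TLS segment. */
struct sketch_module {
    size_t modid;           /* module ID, always >= 1 */
    size_t block_size;      /* size of the module's TLS segment */
    bool has_static_slot;   /* already placed in static TLS? */
    size_t static_offset;   /* thread-pointer-relative offset if so */
};

/* Toy accounting of the remaining static TLS space. */
struct sketch_static_area {
    size_t used;
    size_t capacity;
};

/* Loose analogue of a TRY_STATIC_TLS-style attempt: succeed only if
   the module already has, or can still get, room in static TLS. */
bool sketch_try_static(struct sketch_static_area *area,
                       struct sketch_module *m)
{
    if (m->has_static_slot)
        return true;
    if (area->capacity - area->used < m->block_size)
        return false;
    m->static_offset = area->used;
    area->used += m->block_size;
    m->has_static_slot = true;
    return true;
}

/* Fill the entry: fast-path form when static TLS was obtained,
   otherwise the ordinary module ID and offset pair. */
void sketch_relocate(struct sketch_static_area *area,
                     struct sketch_module *m, size_t sym_value,
                     struct sketch_tls_index *ti)
{
    assert(m->modid != 0);
    if (sketch_try_static(area, m)) {
        ti->module = 0;
        ti->offset = m->static_offset + sym_value;
    } else {
        ti->module = m->modid;
        ti->offset = sym_value;
    }
}
```

The point of the sketch is that failure to get static TLS is not an error here: the entry simply keeps its normal form and the call goes the slow way.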
> The present goal for glibc and the toolchain in general has been
> to move to TLS descriptors, and thus provide a way for the dozen or
> so packages in the distribution to stop doing this:
> mesa (src/mapi/u_current.h):
> extern __thread struct mapi_table *u_current_table
>     __attribute__((tls_model("initial-exec")));
> They would instead use TLS descriptors, and the above markings would
> be removed and the access would be as fast as possible without needing
> to specify the IE model.
> These packages are sometimes linked with applications, and sometimes
> arbitrarily dlopened.
> Would this present optimization you propose for power support this
> use case?
Sure. This is exactly the use case the powerpc optimization tackles,
shared libraries using general dynamic or local dynamic TLS access.
Like TLS descriptors, it can also handle general dynamic or local
dynamic TLS access in an executable, but these will normally be
optimized to IE or LE by GNU ld.
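As a small illustration of the source-level difference under discussion, the sketch below declares one thread-local variable with the compiler's default model and one forced to initial-exec with the GCC tls_model attribute. The variable and function names are made up for the example, and which model the default resolves to depends on compiler flags such as -fPIC.

```c
#include <assert.h>

/* Default model: when built as position-independent code for a shared
   library this is typically accessed via general dynamic TLS. */
__thread int sketch_counter_default;

/* Explicit marking of the kind some packages carry today: forces the
   initial-exec model regardless of how the object is built. */
__thread int sketch_counter_ie __attribute__((tls_model("initial-exec")));

/* Increment and return the default-model counter for this thread. */
int sketch_bump_default(void)
{
    return ++sketch_counter_default;
}

/* Increment and return the initial-exec counter for this thread. */
int sketch_bump_ie(void)
{
    return ++sketch_counter_ie;
}

/* Small self-check: the marked variable behaves like any other
   per-thread int; only the generated access sequence differs. */
void sketch_self_check(void)
{
    int before = sketch_counter_ie;
    assert(sketch_bump_ie() == before + 1);
}
```

The source semantics are identical in both cases; the attribute only changes the access sequence the compiler emits, which is why a dlopened library using it depends on static TLS space being available.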
> Would it use static TLS for the above access if it could and fall
> back gracefully if it can't?
> What I want to make sure is that Power isn't left behind when we
> eventually transition everyone else to TLS Descriptors and remove
> the above markings from source programs.
Other architectures left behind by the PowerPC implementation might
like to transition from TLS descriptors. Just kidding. :)
-- 
Alan Modra
Australia Development Lab, IBM