This is the mail archive of the
mailing list for the glibc project.
Re: powerpc __tls_get_addr call optimization
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: Alan Modra <amodra at gmail dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Fri, 20 Mar 2015 09:54:30 -0400
- Subject: Re: powerpc __tls_get_addr call optimization
- References: <20150318061145 dot GE24573 at bubble dot grove dot modra dot org> <5509B0D4 dot 2020903 at redhat dot com> <20150319025631 dot GC28603 at bubble dot grove dot modra dot org> <550B94FC dot 3070903 at redhat dot com> <20150320075502 dot GC26234 at bubble dot grove dot modra dot org>
On 03/20/2015 03:55 AM, Alan Modra wrote:
> On Thu, Mar 19, 2015 at 11:33:16PM -0400, Carlos O'Donell wrote:
>> On 03/18/2015 10:56 PM, Alan Modra wrote:
>>> On Wed, Mar 18, 2015 at 01:07:32PM -0400, Carlos O'Donell wrote:
>>>> On 03/18/2015 02:11 AM, Alan Modra wrote:
>>>>> Now that Alex's fixes for static TLS have gone in, I figure it's worth
>>>>> revisiting an old patch of mine.
>>>> I'm not against this patch, but it certainly seems like you would be
>>>> better served by just implementing tls descriptors?
>>> I think this is one better than tls descriptors, because powerpc
>>> avoids the indirect function call used by tls descriptors.
>> You mean to say it is "faster" than tls descriptors, but at the same
> To be honest, there isn't much difference in the optimized case where
> static TLS is available. It boils down to an indirect call to a
> function that loads one value vs. a direct call to a stub that loads
> two values and compares one against zero. I think what I've
> implemented is slightly better for PowerPC, but whether that would
> carry over to other architectures is debatable.
I agree that what you have implemented is faster for power.
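
The stub fast path described above (load two values, compare one against zero, otherwise fall back) can be modelled in plain C. This is only a sketch under stated assumptions, not the glibc or binutils code: the names tls_index_model, tls_get_addr_stub and slow_tls_get_addr are hypothetical, and the choice that a zero first word means "the second word is an offset from the thread pointer" is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two-word entry that describes one TLS variable.
   Assumption of this sketch: the dynamic linker stores 0 in `module` when the
   variable lives in static TLS, and `offset` is then relative to the thread
   pointer. */
typedef struct {
    uintptr_t module;
    uintptr_t offset;
} tls_index_model;

/* Counts how often the slow path is taken, so the behaviour can be observed. */
static unsigned slow_calls;

/* Stand-in for the general lookup: find the module's block, add the offset. */
static char *slow_tls_get_addr(const tls_index_model *ti,
                               char *const *module_blocks)
{
    slow_calls++;
    return module_blocks[ti->module] + ti->offset;
}

/* Model of the stub: load both words, compare one against zero, and either
   return thread pointer + offset or degenerate to the general lookup. */
static char *tls_get_addr_stub(const tls_index_model *ti,
                               char *thread_pointer,
                               char *const *module_blocks)
{
    uintptr_t module = ti->module;   /* first load */
    uintptr_t offset = ti->offset;   /* second load */

    if (module == 0)                 /* static TLS was available */
        return thread_pointer + offset;

    return slow_tls_get_addr(ti, module_blocks);
}
```

With an entry of {0, 16} the stub returns thread_pointer + 16 without touching the slow path; with {1, 8} it falls through to the general lookup, which is the graceful fallback asked about later in this thread.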
>> time "harder" to maintain because it's a custom implementation that
>> anyone debugging glibc has to learn about. That's not a bad thing,
>> I just want us all to acknowledge the tradeoff.
> Well, yes, but the PowerPC implementation is all in dl-machine.h, and
> looks very similar to x86_64 in use of CHECK_STATIC_TLS,
> TRY_STATIC_TLS and modification of the tls_index entry. PowerPC
> doesn't have the complication and potential failure of allocating
> extended descriptors. We also don't need to pass extra flags to gcc
> to enable the optimization.
I also agree that your present implementation mirrors TLS DESC in
the implementation and reuse of CHECK_STATIC_TLS/TRY_STATIC_TLS,
and I like that aspect of the change.
>> The present goal for glibc and the toolchain in general has been
>> to move to TLS descriptors, and thus provide a way for the dozen or
>> so packages in the distribution to stop doing this:
>> mesa (src/mapi/u_current.h):
>> extern __thread struct mapi_table *u_current_table
>> They would instead use TLS descriptors, and the above markings would
>> be removed and the access would be as fast as possible without needing
>> to specify the IE model.
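
For readers unfamiliar with this kind of marking, here is a small sketch of how a TLS access model is pinned in source with GCC's tls_model attribute. It assumes the marking under discussion is the initial-exec form of that attribute; the variable and function names are hypothetical and are not taken from mesa.

```c
/* Hypothetical thread-local counters; only the attribute syntax matters. */

/* No marking: the compiler picks the access model itself. In
   position-independent code for a shared library this is normally one of
   the dynamic models, which go through __tls_get_addr or a descriptor. */
__thread int plain_counter;

/* Marked: forces the initial-exec model, which needs static TLS space
   to be available for the object that defines the variable. */
__thread int marked_counter __attribute__((tls_model("initial-exec")));

/* Both accessors behave identically from the caller's point of view;
   only the generated access sequence differs. */
int bump_plain(void)  { return ++plain_counter; }
int bump_marked(void) { return ++marked_counter; }
```

Compiling both accessors with -fPIC -O2 -S and comparing the assembly shows the difference in access sequence; the exact output depends on target and compiler version.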
>> These packages are sometimes linked with applications, and sometimes
>> arbitrarily dlopened.
>> Would this present optimization you propose for power support this
>> use case?
> Sure. This is exactly the use case the powerpc optimization tackles,
> shared libraries using general dynamic or local dynamic TLS access.
> Like TLS descriptors, it can also handle general dynamic or local
> dynamic TLS access in an executable, but these will normally be
> optimized to IE or LE by GNU ld.
Perfect, just making sure we were on the same page. I figured, after
reading the binutils patch, that this mostly operates like TLS DESC, but
slightly optimized for power.
>> Would it use static TLS for the above access if it could and fall
>> back gracefully if it can't?
Good. I expected that it would simply degenerate to a call to
__tls_get_addr if it can't get static tls space.
>> What I want to make sure is that Power isn't left behind when we
>> eventually transition everyone else to TLS Descriptors and remove
>> the above markings from source programs.
> Other architectures left behind by the PowerPC implementation might
> like to transition from TLS descriptors. Just kidding. :)
Given your answers above I'm happy to see this go into glibc.
The patch itself looks fine to me; the real magic is in binutils
with yet another super-secret stub that has no debug information
and must be recognized from memory by the person doing the debugging :}