This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: Questions about powerpc __tls_get_addr optimization

From: Alan Modra <amodra at gmail dot com>
To: Rich Felker <dalias at libc dot org>
Cc: libc-alpha at sourceware dot org
Date: Fri, 12 Oct 2018 11:38:14 +1030
Subject: Re: Questions about powerpc __tls_get_addr optimization
References: <20181011212654.GA20668@brightrain.aerifal.cx>

On Thu, Oct 11, 2018 at 05:26:54PM -0400, Rich Felker wrote:
> Alan, can you (or anyone) shed some light on why the DT_PPC_OPT flag
> is needed for the dynamic linker to be able to apply your
> __tls_get_addr optimization? Assuming the dynamic linker implements
> the real function __tls_get_addr, it could do the same modid==0 check
> itself, without needing assistance from ld.

I think you are correct.
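
A minimal sketch of that idea, for illustration only: the resolver itself tests the module word and takes the fast path when it is zero. The struct layout, the names `tls_index_sketch`, `tls_get_addr_sketch` and `slow_path_lookup`, and the convention that a zero module word means "the offset word is already thread-pointer relative" are assumptions made for this example, not glibc's actual code.

```c
#include <stddef.h>

/* Hypothetical two-word GOT entry handed to the resolver. */
typedef struct {
    unsigned long module;  /* DTPMOD slot: module ID, or 0 if the loader optimized the entry */
    ptrdiff_t     offset;  /* DTPREL slot: offset in the module's block, or a
                              thread-pointer-relative offset when module == 0 */
} tls_index_sketch;

/* Stand-in for the real slow path (DTV lookup, lazy allocation). For the
   sketch, every dynamically allocated module shares this one dummy block. */
static char demo_dynamic_block[64];

static void *slow_path_lookup(const tls_index_sketch *ti)
{
    return demo_dynamic_block + ti->offset;
}

/* The zero check lives in the function itself, so no linker-provided
   stub is required to benefit from the fast path. */
void *tls_get_addr_sketch(const tls_index_sketch *ti, char *thread_pointer)
{
    if (ti->module == 0)
        return thread_pointer + ti->offset;  /* fast path: static TLS */
    return slow_path_lookup(ti);             /* ordinary path */
}
```

The fast path still costs a function call and a compare, which a check done in the PLT stub would avoid.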

> My concern with doing this would be that there's no relocation on the
> second (offset) slot when the local-dynamic model is used, and I
> thought ld would be doing something special to account for this when
> the optimization is used, but apparently your code in the glibc
> dynamic linker just ignores this and fills in both slots when
> processing the R_PPC64_DTPMOD64 relocation.

Right, the slots are filled in for local dynamic at that point.
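
To make the slot handling concrete, here is a simplified sketch of a loader applying the two relocation kinds to a GOT pair. All names (`sketch_tls_reloc`, `sketch_module`, `sketch_apply`) are hypothetical, the TLS bias constants a real PowerPC64 loader would apply are omitted, and the DTPREL handling is an assumption made for illustration rather than a copy of glibc.

```c
#include <stdint.h>

/* Hypothetical, simplified relocation kinds for this sketch only. */
enum sketch_tls_reloc { SKETCH_DTPMOD, SKETCH_DTPREL };

/* Hypothetical description of the module that defines the TLS symbol. */
typedef struct {
    uint64_t modid;          /* module ID used by the ordinary path */
    int      in_static_tls;  /* nonzero if the block sits at a fixed tp offset */
    int64_t  tp_offset;      /* offset of the module's TLS block from tp */
} sketch_module;

/* Apply one relocation. `slot` points at the GOT word being relocated;
   for SKETCH_DTPREL, slot[-1] is the module word of the same pair. */
void sketch_apply(enum sketch_tls_reloc kind, uint64_t *slot,
                  const sketch_module *m, int64_t sym_value)
{
    switch (kind) {
    case SKETCH_DTPMOD:
        if (m->in_static_tls) {
            slot[0] = 0;                       /* marks the entry as optimized */
            slot[1] = (uint64_t)m->tp_offset;  /* covers local dynamic, where no
                                                  DTPREL reloc exists for slot[1] */
        } else {
            slot[0] = m->modid;
        }
        break;
    case SKETCH_DTPREL:
        if (slot[-1] == 0)                     /* module word says "optimized" */
            slot[0] = (uint64_t)(m->tp_offset + sym_value);
        else
            slot[0] = (uint64_t)sym_value;
        break;
    }
}
```

In this sketch, processing the DTPREL relocation before its DTPMOD partner would let the DTPMOD handler overwrite the second word with the module base, which is why the ordering matters.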

> Is this valid, i.e. is it
> valid to assume that a corresponding R_PPC64_DTPREL64 relocation will
> come after the R_PPC64_DTPMOD64 relocation, if there is one? Is this
> assumption valid for other targets as well?

Yes, it is a reasonable assumption if you are using a sane assembler
and linker, and you're not deliberately trying to create out-of-order
relocations.  The assembler will normally emit the DTPREL reloc after
the DTPMOD one by virtue of the DTPREL word appearing after the DTPMOD
word.  Linkers generally emit dynamic relocations in the same order as
source relocations, or sort in a manner that guarantees they are
ordered sensibly.  In the case of GNU ld, -z combreloc will always
place the DTPREL reloc after the corresponding DTPMOD reloc because
they have the same symbol and r_offset for the DTPREL is greater than
r_offset for the DTPMOD.  This should be true for other targets too.
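
The ordering argument can be illustrated with a toy sort keyed on symbol index and then r_offset. This is a simplified sketch, not GNU ld's actual sorting code: the record type `sketch_rela`, the kind tags, and `sketch_sort_relocs` are hypothetical names, and any other keys a real linker sorts on are left out.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tags standing in for the real relocation types. */
enum sketch_rela_kind { SKETCH_KIND_DTPMOD, SKETCH_KIND_DTPREL, SKETCH_KIND_OTHER };

/* Hypothetical, simplified dynamic relocation record. */
typedef struct {
    uint64_t r_offset;            /* address of the word to relocate */
    uint32_t sym;                 /* symbol index */
    enum sketch_rela_kind kind;   /* relocation kind */
} sketch_rela;

/* Order by symbol index first, then by r_offset. */
static int sketch_rela_cmp(const void *pa, const void *pb)
{
    const sketch_rela *a = pa;
    const sketch_rela *b = pb;

    if (a->sym != b->sym)
        return a->sym < b->sym ? -1 : 1;
    if (a->r_offset != b->r_offset)
        return a->r_offset < b->r_offset ? -1 : 1;
    return 0;
}

void sketch_sort_relocs(sketch_rela *relocs, size_t count)
{
    qsort(relocs, count, sizeof *relocs, sketch_rela_cmp);
}
```

Because the DTPREL word sits immediately after the DTPMOD word of the same GOT pair and both relocations name the same symbol, a sort on these keys leaves the DTPMOD entry ahead of its DTPREL partner regardless of input order.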

> Are there other reasons I'm missing that DT_PPC_OPT is needed in order
> for it to be valid for the dynamic linker to use this technique? Or is
> it just that you only wanted to implement the zero check in the PLT
> stub and not repeat it in the __tls_get_addr function?

I can't remember for sure what I was thinking at the time I
implemented the feature for PowerPC.  It's quite possible I didn't
even think of the possibility that __tls_get_addr could check the
modid.  But if I had, I may have rejected the idea as costing a little
extra at run time.

Incidentally, one of the gains in this optimization comes from
avoiding the __tls_get_addr call (the second and subsequent times) and
the inevitable load-hit-store on the r2 save for a short-duration
function.

-- 
Alan Modra
Australia Development Lab, IBM

Follow-Ups:
- Re: Questions about powerpc __tls_get_addr optimization
  - From: Rich Felker

References:
- Questions about powerpc __tls_get_addr optimization
  - From: Rich Felker
