ELF TLS technical details
Alan Modra
amodra@bigpond.net.au
Fri Mar 31 23:19:00 GMT 2006
On Wed, Mar 29, 2006 at 05:31:13PM -0600, Steve Munroe wrote:
> The 32-bit TLS design was similar. Alan, Paul, do you have the PowerPC
> 32-bit TLS documentation?
Attached.
--
Alan Modra
IBM OzLabs - Linux Technology Centre
-------------- next part --------------
PowerPC Specific Thread Local Storage ABI
For insertion in http://people.redhat.com/drepper/tls.pdf
3.4.x PowerPC32 Specific
-------------------------
The PowerPC32 TLS ABI is similar to the PowerPC64 model. The thread-local
storage data structures follow variant I. The TCB is 8 bytes, with the
first 4 bytes containing the pointer to the dynamic thread vector.
tlsoffset calculations and definition of __tls_get_addr are identical to
PowerPC64. r2 is the thread pointer, and points 0x7000 past the end of the
thread control block. Dynamic thread vector pointers point 0x8000 past the
start of each TLS block. (*) This allows the first 64K of each block to
be addressed from a dtv pointer using fewer machine instructions. The tp
offset allows for efficient addressing of the TCB and up to 4K-8 of other
thread library information.
(*) For implementation reasons the actual value stored in dtv may point to
the start of a block, however values returned by accessor functions will be
offset by 0x8000.
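
The offsets described above can be checked with a little arithmetic. The
following C sketch is illustrative only: the macro and function names are
invented here, and addresses are modelled as plain 32-bit integers. Only the
0x7000 and 0x8000 biases and the 8-byte TCB size come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the text; the macro names are made up for this sketch. */
#define TP_BIAS   0x7000u   /* r2 points this far past the end of the TCB */
#define DTV_BIAS  0x8000u   /* dtv pointers point this far past a block start */
#define TCB_SIZE  8u

/* Thread pointer (r2) for a TCB that ends at address tcb_end. */
static uint32_t thread_pointer(uint32_t tcb_end)
{
    assert(tcb_end <= UINT32_MAX - TP_BIAS);   /* no 32-bit wraparound */
    return tcb_end + TP_BIAS;
}

/* Lowest and highest addresses reachable from base with one signed
   16-bit displacement, as used by addi, lwz and similar instructions. */
static uint32_t window_low(uint32_t base)  { return base - 0x8000u; }
static uint32_t window_high(uint32_t base) { return base + 0x7fffu; }
```

For a TCB ending at 0x10000, thread_pointer() gives 0x17000 and the window
runs from 0xF000 to 0x1EFFF: 4K below the end of the TCB (4K-8 below its
start) and 60K above it. Applied to a dtv pointer, the same window covers
exactly the first 64K of the block.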
4.1.x PowerPC32 General Dynamic TLS Model
------------------------------------------
The PowerPC32 general dynamic access model is similar to that for PowerPC64.
The __tls_get_addr function is called with one parameter which is a pointer
to an object of type tls_index. In the following code it is assumed that
register r31 points to the GOT. Different registers may well be used.
  Code sequence             Reloc               Sym
  addi 3,31,x@got@tlsgd     R_PPC_GOT_TLSGD16   x
  bl __tls_get_addr         R_PPC_REL24         __tls_get_addr

  GOT[n]                    R_PPC_DTPMOD32      x
  GOT[n+1]                  R_PPC_DTPREL32      x
The relocation specifier @got@tlsgd causes the linker to create an object
of type tls_index in the GOT. The address of this object is loaded into
the first argument register with the addi instruction, then a standard
function call is made.
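
To make the role of the two GOT words concrete, here is a toy model in C.
It is a sketch under assumptions, not the glibc implementation: the field
names ti_module and ti_offset are assumed from the general TLS document,
toy_tls_get_addr is a hypothetical stand-in for __tls_get_addr, addresses
are modelled as 32-bit integers, and the dtv is taken to hold already
biased block pointers.

```c
#include <assert.h>
#include <stdint.h>

#define DTV_BIAS 0x8000u   /* from the text: dtv pointers are biased by 0x8000 */

/* The object the linker builds in the GOT for x@got@tlsgd.
   Field names are an assumption; this text does not spell them out. */
typedef struct {
    uint32_t ti_module;   /* GOT[n]:   filled in via R_PPC_DTPMOD32 */
    uint32_t ti_offset;   /* GOT[n+1]: filled in via R_PPC_DTPREL32 */
} tls_index;

/* Toy lookup: dtv[m] is the biased base of module m's TLS block.
   Unsigned arithmetic wraps, so a "negative" ti_offset works as intended. */
static uint32_t toy_tls_get_addr(const uint32_t *dtv, uint32_t dtv_len,
                                 const tls_index *ti)
{
    assert(ti->ti_module < dtv_len);
    return dtv[ti->ti_module] + ti->ti_offset;
}
```

With a block at 0x30000000 and x at byte 0x24 of that block, ti_offset holds
0x24 - 0x8000 and the lookup returns 0x30000024, the address of x itself.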
4.2.x PowerPC32 Local Dynamic TLS Model
----------------------------------------
This is similar to other architectures. Two different sequences may be
used, depending on the size of the offset to the variable.
  Code sequence             Reloc               Sym
  addi 3,31,x1@got@tlsld    R_PPC_GOT_TLSLD16   x1
  bl __tls_get_addr         R_PPC_REL24         __tls_get_addr
  ..
  addi 9,3,x1@dtprel        R_PPC_DTPREL16      x1
  ..
  addis 9,3,x2@dtprel@ha    R_PPC_DTPREL16_HA   x2
  addi 9,9,x2@dtprel@l      R_PPC_DTPREL16_LO   x2

  GOT[n]                    R_PPC_DTPMOD32      x1
  GOT[n+1]                  0
@got@tlsld in the first instruction causes the linker to generate a
tls_index object in the GOT with a fixed 0 offset. The code shown assumes
that x1 is in the first 64K of the thread storage block, while x2 isn't.
If we wanted to load the values of x1 and x2 instead of the address, then
we could access int variables with
  ..
  lwz 0,x1@dtprel(3)        R_PPC_DTPREL16      x1
  ..
  addis 9,3,x2@dtprel@ha    R_PPC_DTPREL16_HA   x2
  lwz 0,x2@dtprel@l(9)      R_PPC_DTPREL16_LO   x2
4.3.x PowerPC32 Initial Exec TLS Model
---------------------------------------
  Code sequence             Reloc               Sym
  lwz 9,x@got@tprel(31)     R_PPC_GOT_TPREL16   x
  add 9,9,x@tls             R_PPC_TLS           x

  GOT[n]                    R_PPC_TPREL32       x
@got@tprel in the first instruction causes the linker to generate a GOT
entry with a relocation that the dynamic linker will replace with the
offset for x relative to the thread pointer. x@tls tells the assembler to
use an r2 form of the instruction (i.e. add 9,9,2 in this case), and tag the
instruction with a reloc that indicates it belongs to a TLS sequence. This
may be later used by the linker when optimizing TLS code.
To read the contents of the variable instead of calculating its address,
the "add 9,9,x@tls" instruction might be replaced with "lwzx 0,9,x@tls".
4.4.x PowerPC32 Local Exec TLS Model
-------------------------------------
Two different sequences may be used, depending on the size of the offset to
the variable. The first one handles offsets within 60K of the start of the
TLS block (remember that r2 points 28K past the end of the TCB, which is
immediately prior to the first TLS block).
  Code sequence             Reloc               Sym
  addi 9,2,x1@tprel         R_PPC_TPREL16       x1
  ..
  addis 9,2,x2@tprel@ha     R_PPC_TPREL16_HA    x2
  addi 9,9,x2@tprel@l       R_PPC_TPREL16_LO    x2
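
The choice between the two sequences comes down to whether the tp-relative
displacement fits in a signed 16-bit field. A minimal C sketch of that
decision follows. The function names are invented for illustration, and it
assumes the first TLS block begins immediately after the TCB with no
alignment padding.

```c
#include <assert.h>
#include <stdint.h>

#define TP_BIAS 0x7000   /* from the text: r2 is 0x7000 past the end of the TCB */

/* tp-relative displacement of a variable `off` bytes into the first TLS
   block, assuming the block starts right where the TCB ends. */
static int32_t tprel(int32_t off)
{
    assert(off >= 0);
    return off - TP_BIAS;
}

/* Range check for a signed 16-bit field, as used by R_PPC_TPREL16. */
static int fits_half16(int32_t v)
{
    return v >= -0x8000 && v <= 0x7fff;
}

/* Number of instructions the local exec sequence needs for this offset:
   1 for the addi form, 2 for the addis/addi form. */
static int local_exec_insns(int32_t off)
{
    return fits_half16(tprel(off)) ? 1 : 2;
}
```

Under these assumptions a variable at offset 0 has displacement -0x7000 and
uses the single addi, and the switch to the two-instruction form happens at
offset 0xF000 (60K), where the displacement reaches 0x8000.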
5.x PowerPC32 Linker Optimizations
-----------------------------------
The linker transformations for PowerPC32 are quite straightforward, since
all the relevant code sequences are two instructions long.
5.x.1 General Dynamic To Initial Exec
--------------------------------------
  Code sequence             Reloc               Sym
  addi 3,31,x@got@tlsgd     R_PPC_GOT_TLSGD16   x
  bl __tls_get_addr         R_PPC_REL24         __tls_get_addr

  GOT[n]                    R_PPC_DTPMOD32      x
  GOT[n+1]                  R_PPC_DTPREL32      x

is replaced by

  lwz 3,x@got@tprel(31)     R_PPC_GOT_TPREL16   x
  add 3,3,2

  GOT[n]                    R_PPC_TPREL32       x
The linker relies on this sequence being emitted without intervening
instructions. A register other than r31 may be used as the GOT pointer.
5.x.2 General Dynamic To Local Exec
------------------------------------
  Code sequence             Reloc               Sym
  addi 3,31,x@got@tlsgd     R_PPC_GOT_TLSGD16   x
  bl __tls_get_addr         R_PPC_REL24         __tls_get_addr

  GOT[n]                    R_PPC_DTPMOD32      x
  GOT[n+1]                  R_PPC_DTPREL32      x

is replaced by

  addis 3,2,x@tprel@ha      R_PPC_TPREL16_HA    x
  addi 3,3,x@tprel@l        R_PPC_TPREL16_LO    x
The linker relies on this sequence being emitted without intervening
instructions. A register other than r31 may be used as the GOT pointer.
5.x.3 Local Dynamic to Local Exec
----------------------------------
In this case, the function call is replaced with an equivalent code
sequence. As shown, following dtprel sequences are left unchanged.
  Code sequence             Reloc               Sym
  addi 3,31,x1@got@tlsld    R_PPC_GOT_TLSLD16   x1
  bl __tls_get_addr         R_PPC_REL24         __tls_get_addr
  ..
  addi 9,3,x1@dtprel        R_PPC_DTPREL16      x1
  ..
  addis 9,3,x2@dtprel@ha    R_PPC_DTPREL16_HA   x2
  addi 9,9,x2@dtprel@l      R_PPC_DTPREL16_LO   x2

  GOT[n]                    R_PPC_DTPMOD32      x1
  GOT[n+1]                  0

is replaced by

  addis 3,2,L@tprel@ha      R_PPC_TPREL16_HA    linker generated local sym
  addi 3,3,L@tprel@l        R_PPC_TPREL16_LO    linker generated local sym
  ..
  addi 9,3,x1@dtprel        R_PPC_DTPREL16      x1
  ..
  addis 9,3,x2@dtprel@ha    R_PPC_DTPREL16_HA   x2
  addi 9,9,x2@dtprel@l      R_PPC_DTPREL16_LO   x2
The "linker generated local sym" points to the start of the thread storage
block plus 0x7000. In practice, a section symbol with a suitable offset
will be used. The linker relies on code for the __tls_get_addr call being
emitted without intervening instructions. A register other than r31 may
be used as the GOT pointer.
5.x.4 Initial Exec To Local Exec
---------------------------------
  Code sequence             Reloc               Sym
  lwz 9,x@got@tprel(31)     R_PPC_GOT_TPREL16   x
  add 9,9,x@tls             R_PPC_TLS           x

  GOT[n]                    R_PPC_TPREL32       x

is replaced by

  addis 9,2,x@tprel@ha      R_PPC_TPREL16_HA    x
  addi 9,9,x@tprel@l        R_PPC_TPREL16_LO    x
Other sizes and types of thread-local variables may use any of the X-FORM
indexed loads or stores. The "lwz" and "add" instructions in this case may
have intervening code inserted by the compiler.
An example showing access to the contents of a variable:
  Code sequence             Reloc               Sym
  lwz 9,x@got@tprel(31)     R_PPC_GOT_TPREL16   x
  lbzx 10,9,x@tls           R_PPC_TLS           x
  addi 10,10,1
  stbx 10,9,x@tls           R_PPC_TLS           x

  GOT[n]                    R_PPC_TPREL32       x

is replaced by

  addis 9,2,x@tprel@ha      R_PPC_TPREL16_HA    x
  lbz 10,x@tprel@l(9)       R_PPC_TPREL16_LO    x
  addi 10,10,1
  stb 10,x@tprel@l(9)       R_PPC_TPREL16_LO    x
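
The rewrite of the instruction tagged with R_PPC_TLS amounts to turning an
X-form instruction that uses r2 as its index register into the matching
D-form instruction with a 16-bit displacement. The C sketch below shows one
way this could look. It is not the code of any particular linker: the
function names are invented, only a handful of opcodes are covered, and it
handles only the case where r2 sits in the RB field. The opcode numbers are
those of the PowerPC instruction set.

```c
#include <assert.h>
#include <stdint.h>

/* Map an extended opcode under primary opcode 31 to the primary opcode of
   its D-form counterpart. Returns 0 for anything this sketch does not know. */
static uint32_t dform_opcode(uint32_t xo)
{
    switch (xo) {
    case 266: return 14;   /* add  -> addi */
    case 23:  return 32;   /* lwzx -> lwz  */
    case 87:  return 34;   /* lbzx -> lbz  */
    case 151: return 36;   /* stwx -> stw  */
    case 215: return 38;   /* stbx -> stb  */
    case 279: return 40;   /* lhzx -> lhz  */
    case 343: return 42;   /* lhax -> lha  */
    case 407: return 44;   /* sthx -> sth  */
    default:  return 0;
    }
}

/* Rewrite "op RT,RA,2" as "op RT,0(RA)". The zero displacement is where
   the R_PPC_TPREL16_LO value would be applied. Returns 0 if the
   instruction is not one this sketch can convert. */
static uint32_t ie_to_le_insn(uint32_t insn)
{
    uint32_t primary = insn >> 26;
    uint32_t rt = (insn >> 21) & 0x1f;
    uint32_t ra = (insn >> 16) & 0x1f;
    uint32_t rb = (insn >> 11) & 0x1f;
    uint32_t xo = (insn >> 1) & 0x3ff;
    uint32_t op = dform_opcode(xo);

    /* RA == 0 means a literal zero in D-form, so it cannot be converted. */
    if (primary != 31 || rb != 2 || ra == 0 || op == 0 || (insn & 1))
        return 0;
    return (op << 26) | (rt << 21) | (ra << 16);
}

/* The tagged instructions from the sequences above, encoded with r2 as RB. */
static void check_examples(void)
{
    assert(ie_to_le_insn(0x7D291214u) == 0x39290000u);  /* add 9,9,2   -> addi 9,9,0  */
    assert(ie_to_le_insn(0x7D4910AEu) == 0x89490000u);  /* lbzx 10,9,2 -> lbz 10,0(9) */
    assert(ie_to_le_insn(0x7D4911AEu) == 0x99490000u);  /* stbx 10,9,2 -> stb 10,0(9) */
}
```

A complete implementation would also have to accept r2 in the RA field and
cover the remaining indexed loads and stores.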
6.x New PowerPC32 ELF Definitions
----------------------------------
  Reloc Name              Value   Field     Expression
  R_PPC_TLS               67      none      (sym+add)@tls
  R_PPC_DTPMOD32          68      word32    (sym+add)@dtpmod
  R_PPC_TPREL16           69      half16*   (sym+add)@tprel
  R_PPC_TPREL16_LO        70      half16    (sym+add)@tprel@l
  R_PPC_TPREL16_HI        71      half16    (sym+add)@tprel@h
  R_PPC_TPREL16_HA        72      half16    (sym+add)@tprel@ha
  R_PPC_TPREL32           73      word32    (sym+add)@tprel
  R_PPC_DTPREL16          74      half16*   (sym+add)@dtprel
  R_PPC_DTPREL16_LO       75      half16    (sym+add)@dtprel@l
  R_PPC_DTPREL16_HI       76      half16    (sym+add)@dtprel@h
  R_PPC_DTPREL16_HA       77      half16    (sym+add)@dtprel@ha
  R_PPC_DTPREL32          78      word32    (sym+add)@dtprel
  R_PPC_GOT_TLSGD16       79      half16*   (sym+add)@got@tlsgd
  R_PPC_GOT_TLSGD16_LO    80      half16    (sym+add)@got@tlsgd@l
  R_PPC_GOT_TLSGD16_HI    81      half16    (sym+add)@got@tlsgd@h
  R_PPC_GOT_TLSGD16_HA    82      half16    (sym+add)@got@tlsgd@ha
  R_PPC_GOT_TLSLD16       83      half16*   (sym+add)@got@tlsld
  R_PPC_GOT_TLSLD16_LO    84      half16    (sym+add)@got@tlsld@l
  R_PPC_GOT_TLSLD16_HI    85      half16    (sym+add)@got@tlsld@h
  R_PPC_GOT_TLSLD16_HA    86      half16    (sym+add)@got@tlsld@ha
  R_PPC_GOT_TPREL16       87      half16*   (sym+add)@got@tprel
  R_PPC_GOT_TPREL16_LO    88      half16    (sym+add)@got@tprel@l
  R_PPC_GOT_TPREL16_HI    89      half16    (sym+add)@got@tprel@h
  R_PPC_GOT_TPREL16_HA    90      half16    (sym+add)@got@tprel@ha

(sym+add)@tls
  Merely causes the R_PPC_TLS marker reloc to be emitted.

(sym+add)@dtpmod
  Computes the load module index of the load module that contains the
  definition of sym. The addend, if present, is ignored.

(sym+add)@dtprel
  Computes a dtv-relative displacement, the difference between the value
  of sym+add and the base address of the thread-local storage block that
  contains the definition of sym, minus 0x8000. The minus 0x8000 is because
  dtv elements point to the start of the storage block plus 0x8000.

(sym+add)@tprel
  Computes a tp-relative displacement, the difference between the value of
  sym+add and the value of the thread pointer (r2).

(sym+add)@got@tlsgd
  Allocates two contiguous entries in the GOT to hold a tls_index structure,
  with values (sym+add)@dtpmod and (sym+add)@dtprel, and computes the offset
  of the first entry within the GOT.

(sym+add)@got@tlsld
  Allocates two contiguous entries in the GOT to hold a tls_index structure,
  with values (sym+add)@dtpmod and zero, and computes the offset of the first
  entry within the GOT.

(sym+add)@got@tprel
  Allocates an entry in the GOT with value (sym+add)@tprel, and computes the
  offset of the entry within the GOT.

@l, @h
  These modifiers affect the value computed, returning the low 16 bits or the
  high 16 bits of a 32 bit value.

@ha
  This modifier is like the corresponding @h modifier, except it adjusts for
  @l being treated as a signed number.

Relocations not using these modifiers (those flagged with `*' above) will
trigger a relocation failure if the value computed does not fit in the
field specified.
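
The difference between @h and @ha is easiest to see with numbers. The C
sketch below uses invented helper names (part_l, part_h, part_ha,
addis_addi) to model the three modifiers and the effect of an addis/addi
pair on a zero base register. It is an illustration of the definitions
above, not code from any assembler or linker.

```c
#include <assert.h>
#include <stdint.h>

/* @l: low 16 bits of the value. */
static uint16_t part_l(uint32_t v)  { return (uint16_t)(v & 0xffffu); }

/* @h: high 16 bits of the value. */
static uint16_t part_h(uint32_t v)  { return (uint16_t)(v >> 16); }

/* @ha: high 16 bits, bumped by one when @l will be read as negative. */
static uint16_t part_ha(uint32_t v) { return (uint16_t)((v + 0x8000u) >> 16); }

/* Effect of "addis rD,rA,hi ; addi rD,rD,lo" with rA holding zero:
   hi is shifted left 16 bits, lo is sign-extended before the add. */
static uint32_t addis_addi(uint16_t hi, uint16_t lo)
{
    uint32_t low = (lo & 0x8000u) ? ((uint32_t)lo | 0xffff0000u) : (uint32_t)lo;
    return ((uint32_t)hi << 16) + low;
}

/* An @ha/@l pair must rebuild the original 32-bit value. */
static void check_split(uint32_t v)
{
    assert(addis_addi(part_ha(v), part_l(v)) == v);
}
```

For the value 0x18000, @l is 0x8000, @h is 1 and @ha is 2. Pairing @h with
@l in an addis/addi sequence would produce 0x8000, because the addi
subtracts 0x8000; pairing @ha with @l recovers 0x18000.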