[PATCH][BZ #16214] Fix TLS access on S390 with -march=z10
Andreas Krebbel
krebbel@linux.vnet.ibm.com
Wed Nov 27 20:58:00 GMT 2013
Hi Carlos,
On 27/11/13 07:03, Carlos O'Donell wrote:
> Yes, _dl_sym and _dl_vsym use this e.g. ->do_sym->_dl_tls_symaddr.
>
> An asm with clobbers is the only way of making this work reliably.
>
> The asm could set r12 to GOT, call __tls_get_offset, restore r12
> (or list it clobbered along with all the registers __tls_get_offset
> touches, and __tls_get_addr), save result, then the rest can be
> in C and does the math required to compute the result.
>
> You have to avoid the compiler using r12 for something else in the
> middle and the asm is the only way to avoid that.
When building with -fpic, r12 is a fixed register that the compiler cannot allocate freely, so this
is not supposed to happen.
...
>>> #ifdef PIC
>>> # define TLS_IE(x) \
>>> - ({ unsigned long __offset; \
>>> + ({ unsigned long __offset, __save; \
>>> asm ("bras %0,1f\n" \
>>> "0:\t.quad " #x "@gotntpoff\n" \
>>> - "1:\tlg %0,0(%0)\n\t" \
>>> - "lg %0,0(%0,%%r12):tls_load:" #x \
>>> - : "=&a" (__offset) : : "cc" ); \
>>> + "1:\tlgr %1,%%r12\n\t" \
>>> + "larl %%r12,_GLOBAL_OFFSET_TABLE_\n\t" \
>>> + "lg %0,0(%0)\n\t" \
>>> + "lg %0,0(%0,%%r12):tls_load:" #x "\n\t" \
>>> + "lgr %%r12,%1" \
>>> + : "=&a" (__offset), "=&a" (__save) : : "cc" ); \
>>> (int *) (__builtin_thread_pointer() + __offset); })
>>> #else
>>> # define TLS_IE(x) \
>>>
>>
>> For this code to work the GOT pointer is not required to be in r12. Can't you just set it up in a
>> compiler chosen reg? Something like this:
>
> Why? This code attempts to emulate exactly what the compiler is going to do
> and should be representative of the instruction sequences you'd see being
> generated for TLS accesses. These macros are only ever used in testing and
> never anywhere else so their speed is not important.
>
> I suggest leaving the longer sequences that Siddhesh has here that save/restore
> r12 and follow the ABI.
All this happens inside the macro, so I don't think the ABI is relevant here. All that is needed is
to add the GOT pointer to an offset. Since we cannot do the add with a single instruction, we need a
scratch register. This can be r12 or any other GPR (except r0, of course). I would prefer the
shorter variant because, well ... it's shorter :)
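
For illustration only, here is a minimal sketch of what such a shorter variant might look like. It is not necessarily the code proposed earlier in this thread, and the names TLS_IE_SHORT, __got, tls_demo_var and read_tls_demo are made up for the example. It assumes GCC (or a compatible compiler); the asm path is only taken on s390x when building PIC, and every other target falls back to a compiler-generated access so the snippet still builds.

```c
/* Hypothetical thread-local variable, present only for this illustration.
   The initial-exec model asks the compiler for the same kind of access
   that TLS_IE emulates.  */
__thread int tls_demo_var __attribute__ ((tls_model ("initial-exec"))) = 42;

#if defined (__s390x__) && defined (__PIC__)
/* Sketch of a shorter TLS_IE: the GOT address is loaded into a scratch
   register picked by the compiler (%1) instead of r12, so nothing has to
   be saved or restored.  The "a" constraint selects an address register,
   which excludes r0.  */
# define TLS_IE_SHORT(x) \
  ({ unsigned long __offset, __got;                                \
     __asm__ ("bras %0,1f\n"                                       \
              "0:\t.quad " #x "@gotntpoff\n"                       \
              "1:\tlarl %1,_GLOBAL_OFFSET_TABLE_\n\t"              \
              "lg %0,0(%0)\n\t"                                    \
              "lg %0,0(%0,%1):tls_load:" #x "\n"                   \
              : "=&a" (__offset), "=&a" (__got) : : "cc");         \
     (int *) (__builtin_thread_pointer () + __offset); })
#else
/* Everywhere else, let the compiler generate the TLS access itself.  */
# define TLS_IE_SHORT(x) (&(x))
#endif

/* Read the thread-local variable through the macro.  */
int
read_tls_demo (void)
{
  return *TLS_IE_SHORT (tls_demo_var);
}
```

Compared with the save/restore sequence quoted above, this drops the two lgr instructions and the __save temporary; whether that is preferable to mirroring the compiler's use of r12 is exactly the point under discussion.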
Bye,
-Andreas-