tlsdesc (default on aarch64) and the powerpc tls optimization allow dsos with general dynamic tls access to be more efficient if static tls (with a fixed tp offset) is used instead of dynamic tls via dtv indirections.

however, for dynamically loaded dsos glibc also uses the static tls surplus area for this optimization (i.e. when no static tls was allocated for the dso specifically). this means the surplus tls runs out more quickly than on other targets, where it is only used for dynamically loaded dsos with initial-exec tls access.

since there are libraries that rely on initial-exec tls, glibc should reserve the surplus for that usage only.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91938
(this bug depends on the resolution of that gcc bug)

i.e. TRY_STATIC_TLS should be replaced with HAVE_STATIC_TLS in tls reloc processing code. (initial-exec tls can continue to use CHECK_STATIC_TLS, but the macros may need documentation changes)
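For readers unfamiliar with the two access models discussed above, here is a minimal sketch of how a library might declare each kind of variable. It assumes GCC or Clang (`__thread` and the `tls_model` attribute are compiler extensions); the variable and function names are hypothetical and not taken from glibc or any affected library.

```c
/* Sketch only: names are hypothetical, and __thread / tls_model are
   GCC/Clang extensions.  */

/* No model requested.  When this file is compiled with -fPIC for a shared
   library, the compiler defaults to the general dynamic model, which is the
   access that TLSDESC can opportunistically resolve to static TLS.  */
__thread int gd_counter;

/* Explicitly requests the initial-exec model.  A dlopen'ed library with
   such a variable needs room in the static TLS surplus, otherwise loading
   it fails.  */
__thread int ie_counter __attribute__((tls_model("initial-exec")));

/* Each thread sees its own copy, starting from zero.  */
int bump_gd(void)
{
  return ++gd_counter;
}

int bump_ie(void)
{
  return ++ie_counter;
}
```

Both variables behave identically from the program's point of view; the difference is only in how the address is computed and, for a dynamically loaded dso, whether static TLS space is required or merely nice to have.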
(In reply to Szabolcs Nagy from comment #0)
> i.e. TRY_STATIC_TLS should be replaced with HAVE_STATIC_TLS
> in tls reloc processing code.

That does sound quite reasonable. Shared libraries loaded at startup will still have the benefit of the tlsdesc optimisation.
posted a patch for discussion https://sourceware.org/ml/libc-alpha/2020-01/msg00099.html
We got bug reports about a similar issue in Debian [1] (which affects PyQt5 and PySide packages on aarch64).

The latest version (v5) of the patch from [2] works, thanks!

[1]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=964141
[2]: https://sourceware.org/pipermail/libc-alpha/2020-June/115284.html
The master branch has been updated by Szabolcs Nagy <nsz@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ffb17e7ba3a5ba9632cee97330b325072fbe41dd

commit ffb17e7ba3a5ba9632cee97330b325072fbe41dd
Author: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date:   Wed Jun 10 13:40:40 2020 +0100

    rtld: Avoid using up static TLS surplus for optimizations [BZ #25051]

    On some targets static TLS surplus area can be used opportunistically
    for dynamically loaded modules such that the TLS access then becomes
    faster (TLSDESC and powerpc TLS optimization). However we don't want
    all surplus TLS to be used for this optimization because dynamically
    loaded modules with initial-exec model TLS can only use surplus TLS.

    The new contract for surplus static TLS use is:

    - libc.so can have up to 192 bytes of IE TLS,
    - other system libraries together can have up to 144 bytes of IE TLS.
    - Some "optional" static TLS is available for opportunistic use.

    The optional TLS is now tunable: rtld.optional_static_tls, so users
    can directly affect the allocated static TLS size. (Note that module
    unloading with dlclose does not reclaim static TLS. After the optional
    TLS runs out, TLS access is no longer optimized to use static TLS.)

    The default setting of rtld.optional_static_tls is 512 so the surplus
    TLS is 3*192 + 4*144 + 512 = 1664 by default, the same as before.

    Fixes BZ #25051.

    Tested on aarch64-linux-gnu and x86_64-linux-gnu.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
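The surplus arithmetic and the "optional" budget described in the commit message can be illustrated with a small sketch. This is not the glibc implementation: the identifiers below are hypothetical, the multipliers 3 and 4 are copied as-is from the commit message, and alignment is ignored for simplicity.

```c
#include <stddef.h>

/* Byte counts quoted in the commit message; the names are hypothetical.  */
enum {
  LIBC_IE_TLS_BYTES = 192,          /* IE TLS budget for libc.so */
  OTHER_IE_TLS_BYTES = 144,         /* IE TLS budget for other system libraries */
  DEFAULT_OPTIONAL_TLS_BYTES = 512  /* default of rtld.optional_static_tls */
};

/* Surplus as written in the commit message: 3*192 + 4*144 + optional.  */
static size_t static_tls_surplus(size_t optional_tls)
{
  return 3 * (size_t) LIBC_IE_TLS_BYTES
         + 4 * (size_t) OTHER_IE_TLS_BYTES
         + optional_tls;
}

/* Toy model of the optional part of the surplus.  Space is never handed
   back, mirroring the note that dlclose does not reclaim static TLS.  */
struct optional_tls_budget {
  size_t remaining;
};

/* Returns 1 if the opportunistic request fits in the optional budget and
   consumes it; returns 0 otherwise, meaning the module would fall back to
   dynamic TLS instead of failing to load.  */
static int try_optional_static_tls(struct optional_tls_budget *budget,
                                   size_t size)
{
  if (size > budget->remaining)
    return 0;
  budget->remaining -= size;
  return 1;
}
```

With the default of 512 optional bytes, `static_tls_surplus(DEFAULT_OPTIONAL_TLS_BYTES)` evaluates to 576 + 576 + 512 = 1664, matching the figure in the commit message. In this model an opportunistic request that does not fit simply returns 0, while the IE portions of the surplus stay untouched for modules that cannot work without static TLS.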
fixed for 2.32.