[PATCH] Avoid using up static TLS surplus for optimizations [BZ #25051]

Szabolcs Nagy szabolcs.nagy@arm.com
Fri Jun 19 16:44:16 GMT 2020


The 06/18/2020 15:50, Carlos O'Donell wrote:
> On 5/22/20 10:49 AM, Szabolcs Nagy wrote:
> > On some targets static TLS surplus area can be used opportunistically
> > for dynamically loaded modules such that the TLS access then becomes
> > faster (TLSDESC and powerpc TLS optimization). However we don't want
> > all surplus TLS to be used for this optimization because dynamically
> > loaded modules with initial-exec model TLS can only use surplus TLS.
> > 
> > Currently TLS_STATIC_SURPLUS is 1664 bytes which is not even enough to
> > cover the IE TLS needed for libc.so when DL_NNS (== 16) namespaces are
> > created, so to allow reliable use of dlmopen as well as dynamic TLS
> > optimizations a new contract is specified for use of static TLS:
> > 
> > - libc.so can have up to 192 bytes of IE TLS,
> > - other system libraries together can have up to 144 bytes of IE TLS.
> > - By default 512 bytes of "optional" static TLS is available for
> >   opportunistic use.
> > - By default at most 4 dlmopen namespaces are supported.
> > 
> > So the surplus TLS requirement is 3*192 + 4*144 + 512 = 1664 bytes
> > with dynamic linking (i.e. the same as before: the externally visible
> > behaviour is not changed, other than limiting static TLS use for
> > optimizations on affected targets.)
> > 
> > The optional TLS available for opportunistic use is now tunable
> > (dl.optional_static_tls), so users can directly affect the allocated
> > static TLS size. (Note that module unloading with dlclose does not
> > reclaim static TLS. After the optional TLS runs out, TLS access
> > is no longer optimized.)
> > 
> > Since users may need more dlmopen namespaces (5 .. 16) the maximum
> > supported number is now a tunable (dl.nns), when it is increased the
> > static TLS allocation increases according to the contract. If users
> > use more namespaces than the tunable setting, static TLS may run out.
> > Or if users dynamically load libraries with IE TLS beyond what's
> > allowed by the contract, static TLS may run out. These conditions are
> > not checked or enforced, but the user's responsibility.
> > 
> > Static linking used fixed 2048 bytes surplus TLS, this is changed
> > so the same contract is used as for dynamic linking.  However with
> > static linking DL_NNS == 1 so dl.nns tunable is forced to 1, so by
> > default the surplus TLS is reduced to 144 + 512 = 656 bytes. This
> > change is not expected to cause problems.
> > 
> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.
> 
> Tested on x86_64 and i686 and passes without regression.
> 
> I know you're splitting this patch into two patches, but I went ahead
> and reviewed this anyway because it's in my queue. You can still split
> the patches if you want and I can review that again quickly and ACK.
> I wanted to get the bulk of the review done.

thanks. i did not make much progress with the spliting
but i will try to do it on monday, i think that's useful.

> 
> Subsequent followup after committing this:
> - We need to fix tst-manyaudit.
> - We should be able to count how many spaces we need based on LD_AUDIT
>   or DT_AUDIT and enable up to that amount.
> 
> Post a v2 and I think we're almost done.
> 
> See my recommendations below.
> 
> Tested-by: Carlos O'Donell <carlos@redhat.com>
...
> > --- a/csu/libc-start.c
> > +++ b/csu/libc-start.c
> > @@ -190,6 +190,8 @@ LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
> >  
> >    __tunables_init (__environ);
> >  
> > +  _dl_static_tls_tunables_init ();
> 
> OK. We're starting to get an interesting set of initialization functions in LIBC_START_MAIN.

this init has to be after __tunables_init
and before _dl_determine_tlsoffset in the
shared case and before/in __libc_tls_setup
in the static case.

previously i moved it as early as possible
but now i think it should be together with
other tls setup code where the initialized
values first used.

and i will move the declaration to
sysdep/generic/ldsodefs.h since other
tls related declarations live there.

> > @@ -317,7 +318,10 @@ modules-names = testobj1 testobj2 testobj3 testobj4 testobj5 testobj6 \
> >  		tst-dlopenfailmod1 tst-dlopenfaillinkmod tst-dlopenfailmod2 \
> >  		tst-dlopenfailmod3 tst-ldconfig-ld-mod \
> >  		tst-filterobj-flt tst-filterobj-aux tst-filterobj-filtee \
> > -		tst-auditlogmod-1 tst-auditlogmod-2 tst-auditlogmod-3
> > +		tst-auditlogmod-1 tst-auditlogmod-2 tst-auditlogmod-3 \
> > +		tst-tls-ie-mod0 tst-tls-ie-mod1 tst-tls-ie-mod2 \
> > +		tst-tls-ie-mod3 tst-tls-ie-mod4 tst-tls-ie-mod5
> 
> OK. Five new modules. Tests the new limit at 4 namespaces.

actually i use dlopen, not dlmopen, so it's
one namespace and only used more modules so
the test remains effective in case the
tls optimization logic varies a bit.

but it makes sense to have another test
that uses 3 dlmopen (then the ie tls modules
need smaller tls size so they fit with the
3 additional libc.so)

> > +      default: 4
> > +    }
> > +    optional_static_tls {
> > +      type: SIZE_T
> > +      default: 512
> 
> OK, though minval: 0 might be a good limit. No maximum. SXID_ERASE is good.

i will add the min.

> > +  mods[4] = load_and_access ("tst-tls-ie-mod4.so", "access4");
> > +  mods[5] = load_and_access ("tst-tls-ie-mod5.so", "access5");
> 
> Why can't we load a 6th module and test that it fails?
> 
> Does it terminate the process or just return a dlopen failure?

hm it should not crash, but report the error,
so i think that's a reasonable check to do.

and i will do the other suggested wording
changes in the comments and documentation.



More information about the Libc-alpha mailing list