This is the mail archive of the
mailing list for the glibc project.
Re: static TLS exhausted on ppc64le
On Mon, Sep 30, 2019 at 03:22:12PM -0400, Carlos O'Donell wrote:
> On 9/30/19 2:41 PM, Szabolcs Nagy wrote:
> > On 30/09/2019 19:07, Carlos O'Donell wrote:
> >> On 9/30/19 1:50 PM, Rich Felker wrote:
> >>> I see. That's a shame, because if you have excess static TLS reserved,
> >>> using it for tlsdesc is actually really nice -- it makes the accesses
> >>> just as fast as initial-exec, but opportunistically, and falls back
> >>> gracefully if you run out. Waiting to hand it out to badly-behaved
> >>> libraries that are using initial-exec model only serves to reinforce
> >>> the bad behavior and discourages adoption of tlsdesc since the bad
> >>> behavior gets preferential treatment...
> >>> I think this analysis further supports my previous remarks that
> >>> initial-exec in dlopened libraries should be deprecated and EOL'd.
> >> We should dig deeper into the analysis here, and I just ran readelf for
> >> all the implicated libraries in the bug.
> >> The only *real* problem here is the implementation; only libc and libgomp
> >> use TLS IE in this case (and libgl in the wild).
> >> I think the best steps towards resolution are:
> >> * Stop ppc64le and aarch64 from using ALL of the static TLS for tlsdesc / tls opt hack.
> >> - Reserve at least 128 bytes for libgomp + libgl.
> > i think this is easy and reasonable:
> > _dl_try_allocate_static_tls can reserve 128 bytes
> > which is only used by _dl_allocate_static_tls.
> > (so the optimization is still applied to dynamically
> > loaded dsos, but there is guaranteed tls for a small
> > number of IE libs)
> >> * Fix lazy tls loading to stop being lazy about allocation and allocate all memory
> >> required up front.
> >> - This allows libc to use GD instead of IE and not worry about touching tls vars
> >> early before init or the ordering of IE vs. GD.
> >> - Requires a non-default dlopen flag to get back old behaviour.
> > this sounds hard.
> > (and observable to users with many threads and many dsos with tls:
> > preallocation will use more memory and cause slower thread starts
> > than lazy allocation)
> >> * Switch glibc back to GD internally.
> >> * Switch x86_64 to tlsdesc (can be done at any time) to get perf back.
> >> Disallowing IE in DSOs is only going to get us angry users in a transitional period.
> >> The above plan will benefit ppc64le and aarch64 since they continue to
> >> have maximum performance for their usage of tlsdesc.
> >> Thoughts?
> > i'm ok with the plan, but i think only the first part
> > is realistic in short term.
> I agree the middle part is hard.
> If we fix this to reserve X bytes, then we can immediately start
> transitioning x86_64 to tlsdesc also without hitting these problems? :-)
I don't see how the middle part is terribly hard, but I agree the path
to transition x86_64 to tlsdesc first is a good one. Hopefully it will
make other good choices uncontroversial later. :-)
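
To make the first step concrete, here is a rough sketch of the reservation
idea discussed above: the opportunistic (tlsdesc optimization) path must leave
a fixed reserve of surplus static TLS untouched, while the mandatory
initial-exec path may consume all of it. This is a toy model and not glibc
code. The struct, the function names, and the IE_RESERVE macro are
hypothetical; only the 128-byte figure comes from the thread. Alignment and
per-thread setup are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the surplus static TLS area; not glibc's data
   structures.  128 is the reserve size proposed in the thread.  */
#define IE_RESERVE 128

struct static_tls_pool {
    size_t size;   /* total surplus static TLS, in bytes */
    size_t used;   /* bytes already handed out */
};

/* Hand out `need` bytes only if at least `keep_free` bytes would still be
   available afterwards.  Alignment is deliberately ignored here.  */
static bool pool_alloc(struct static_tls_pool *p, size_t need, size_t keep_free)
{
    assert(p->used <= p->size);
    size_t avail = p->size - p->used;
    if (need > avail || avail - need < keep_free)
        return false;
    p->used += need;
    return true;
}

/* Opportunistic path (the tlsdesc optimization): may not dip into the
   reserve.  On failure the caller falls back to dynamic TLS.  */
bool try_alloc_opportunistic(struct static_tls_pool *p, size_t need)
{
    return pool_alloc(p, need, IE_RESERVE);
}

/* Mandatory path (a dlopened library with initial-exec TLS): may use
   everything that is left, including the reserve.  On failure the
   dlopen has to fail.  */
bool alloc_for_ie(struct static_tls_pool *p, size_t need)
{
    return pool_alloc(p, need, 0);
}
```

With a 512-byte pool, an opportunistic request for 300 bytes succeeds, a
second opportunistic request for 100 bytes is refused because it would leave
only 112 bytes free, and a later initial-exec request for 100 bytes still
succeeds. That is the property wanted here: tlsdesc users keep the fast path
while there is room, and a small number of IE libraries can always load.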