This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: static TLS exhausted on ppc64le


On Mon, Sep 30, 2019 at 04:13:07PM +0000, Szabolcs Nagy wrote:
> On 30/09/2019 17:06, Rich Felker wrote:
> > On Mon, Sep 30, 2019 at 05:47:29PM +0200, Florian Weimer wrote:
> >> * Szabolcs Nagy:
> >>
> >>> On 30/09/2019 15:02, Dan Horák wrote:
> >>>> Hi,
> >>>>
> >>>> I would like to raise a problem we have already hit twice in Fedora. The
> >>>> symptom is
> >>>>
> >>>> "/lib64/libgomp.so.1: cannot allocate memory in static TLS block"
> >>>>
> >>>> usually when loading a lot of libraries/modules into a Python
> >>>> application. It happened on ppc64le and also on aarch64 systems.
> >>>>
> >>>> We have 2 reports in Fedora Bugzilla with more details.
> >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1722181
> >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1738752
> >>>>
> >>>> We have already discussed that briefly with Florian and other members of
> >>>> the Red Hat toolchain team, but the outcome was a recommendation
> >>>> to reduce the usage of "static TLS" objects in the individual libraries.
> >>>> But the open question still is - is there a fix for the TLS space
> >>>> exhaustion? I believe it can easily become a more serious problem soon.
> >>>
> >>> (a workaround is preloading the problematic libs at startup time)
> >>>
> >>> i think it's a bug in libgomp.so.1, gcc should not build
> >>> broken dsos by default (unless it can ensure they are never
> >>> loaded dynamically):
> >>>
> >>> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libgomp/configure.tgt;h=b88bf72fe3de3735929635c874b8da375c841b1d;hb=HEAD#l13
> >>
> >> I like the simplicity of initial-exec TLS.
> > 
> > I wouldn't really characterize it as simplicity. It's a trade of
> > complex (at least to the user) constraints on whether or not it works
> > for some simplicity of implementation.
> > 
> > I guess for glibc at present, there's a lot more complexity to dynamic
> > models because of lazy allocation and installation and generation
> > counters, and these interact with AS-safety and failsafety in
> > undesirable ways. I'd like to see that fixed but I know it's a big
> > change.
> > 
> >> I think there was a change on POWER to use the static TLS reservation
> >> for dynamic TLS, as an optimization.  Obviously, that's going to hurt
> >> those cases where a library with initial-exec TLS is loaded late, even
> >> if the static TLS reservation would ordinarily be large enough.
> > 
> > Was that because of the PLT-stub hack on powerpc done in lieu of
> > tlsdesc? That should really be abandoned entirely IMO, since it
> > *doesn't* give you any of the benefit of tlsdesc -- the whole point is
> > not the short code path but avoiding register spills for the standard
> > ABI call to __tls_get_addr, and the powerpc hack doesn't let you avoid
> > them. Real tlsdesc should be added to powerpc.
> 
> the problem is TRY_STATIC_TLS (defined in dynamic-link.h)
> 
> when it is used for dynamic tls (on targets where that's
> possible: tlsdesc or ppc tls opt hack) it will eat the
> preallocated static tls. (that's why this affects aarch64
> and powerpc64)
> 
> i think that logic can be easily changed so the preallocated
> tls area is not used for normal dynamically loaded dsos
> (assuming the intention of the prealloc tls is purely to
> support dsos with initial-exec tls), that's less optimal for
> the common dynamic tls use-case, but makes libgomp etc work.

I see. That's a shame, because if you have excess static TLS reserved,
using it for tlsdesc is actually really nice -- it makes the accesses
just as fast as initial-exec, but opportunistically, and falls back
gracefully if you run out. Waiting to hand it out to badly-behaved
libraries that are using initial-exec model only serves to reinforce
the bad behavior and discourages adoption of tlsdesc since the bad
behavior gets preferential treatment...
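
To make the trade-off concrete, here is a toy model of the two policies
discussed above. It is only a sketch, not glibc's loader logic: the Loader
class, the library names and the byte counts are all made up for
illustration, and alignment, per-thread setup and unloading are ignored.

```python
from dataclasses import dataclass, field


class StaticTLSExhausted(Exception):
    """Stands in for "cannot allocate memory in static TLS block"."""


@dataclass
class Loader:
    # Hypothetical model: bytes of static TLS still free after startup.
    surplus: int
    # True: dynamic-TLS DSOs may take surplus when it fits (fast access).
    # False: surplus is kept only for DSOs that use initial-exec TLS.
    opportunistic: bool = True
    placement: dict = field(default_factory=dict)

    def dlopen(self, name, tls_size, initial_exec):
        fits = tls_size <= self.surplus
        if initial_exec:
            # initial-exec TLS has no fallback: it must be in the static block.
            if not fits:
                raise StaticTLSExhausted(name)
            self.surplus -= tls_size
            self.placement[name] = "static"
        elif self.opportunistic and fits:
            # Dynamic TLS placed in the surplus as an optimization.
            self.surplus -= tls_size
            self.placement[name] = "static"
        else:
            # Graceful fallback: dynamically allocated TLS, slower access.
            self.placement[name] = "dynamic"
        return self.placement[name]


# Toy workload: two modules with dynamic TLS, then a late initial-exec DSO.
LIBS = [("mod_a", 600, False), ("mod_b", 300, False), ("ie_lib", 200, True)]


def load_all(loader):
    """Load LIBS in order and return the name of the DSO that failed, if any."""
    for name, size, initial_exec in LIBS:
        try:
            loader.dlopen(name, size, initial_exec)
        except StaticTLSExhausted:
            return name
    return None
```

With a surplus of 1000 bytes, load_all(Loader(1000, opportunistic=True))
reports ie_lib as failing, because mod_a and mod_b have already taken 900
bytes. With opportunistic=False the same sequence succeeds, at the cost of
mod_a and mod_b getting the slower dynamic placement.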

I think this analysis further supports my previous remarks that
initial-exec in dlopened libraries should be deprecated and EOL'd.

Rich

