This is the mail archive of the
mailing list for the glibc project.
Re: Failure to dlopen libgomp due to static TLS data
- From: Rich Felker <dalias at libc dot org>
- To: Andrew Haley <aph at redhat dot com>
- Cc: Jakub Jelinek <jakub at redhat dot com>, Ulrich Weigand <uweigand at de dot ibm dot com>, libc-alpha at sourceware dot org, gcc at gcc dot gnu dot org, rth at redhat dot com
- Date: Fri, 13 Feb 2015 17:23:57 -0500
- Subject: Re: Failure to dlopen libgomp due to static TLS data
- References: <201502121519 dot t1CFJMAe018776 at d03av02 dot boulder dot ibm dot com> <20150212160959 dot GS23507 at brightrain dot aerifal dot cx> <20150212161145 dot GD1746 at tucnak dot redhat dot com> <20150212161617 dot GU23507 at brightrain dot aerifal dot cx> <54DCEF90 dot 6090700 at redhat dot com> <20150212232756 dot GZ23507 at brightrain dot aerifal dot cx> <54DDC009 dot 3030505 at redhat dot com>
On Fri, Feb 13, 2015 at 09:12:41AM +0000, Andrew Haley wrote:
> On 12/02/15 23:27, Rich Felker wrote:
> > On Thu, Feb 12, 2015 at 06:23:12PM +0000, Andrew Haley wrote:
> >> On 02/12/2015 04:16 PM, Rich Felker wrote:
> >>> On Thu, Feb 12, 2015 at 05:11:45PM +0100, Jakub Jelinek wrote:
> >>>> On Thu, Feb 12, 2015 at 11:09:59AM -0500, Rich Felker wrote:
> >>>>> This usage is supposed to be deprecated. Why isn't libgomp using
> >>>>> TLSDESC/gnu2 model?
> >>>> Because it is significantly slower.
> >>> Seems very unlikely. If storage is allocated in static TLS, TLSDESC is
> >>> almost indistinguishable from IE in performance, even when you run
> >>> artificial benchmarks that do nothing but hammer TLS access. When it
> >>> gets allocated in dynamic TLS, it's somewhat slower, but still
> >>> unlikely to matter for most usage IMO.
> >> The problem I'm seeing is that dynamic TLS is always used even when not
> >> necessary, and that hurts Java (which accesses TLS 128k times in the first
> >> 500ms or so of execution). According to lxo his patch fixes that.
> > Given those numbers, each access would need to be taking 38ns to
> > consume even 1% of the CPU time being spent. I would guess accesses
> > are closer to 5ns for TLSDESC in static area and 10-15ns for dynamic.
> > So I don't think this is a bottleneck.
> I'm totally unconvinced by this style of argument. An efficient system
> is composed of many small optimizations, each apparently insignificant
> in itself. Your figures indicate that this slowdown may be about 0.5%.
> 0.5% is not small. I put in a lot of work to gain 0.5%.
It seems misguided to try to save 0.5% of a 500ms startup time by
choosing a hackish TLS model that's going to break for some people,
when the elephant in the room is Java. You could make all 500ms
(except maybe 500us) go away by using a proper language/runtime and
get a 99.9% savings.