This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
New rwlock committed -- enabling elision for it needs better default tuning first
- From: Torvald Riegel <triegel at redhat dot com>
- To: GLIBC Devel <libc-alpha at sourceware dot org>
- Cc: Siddhesh Poyarekar <siddhesh at sourceware dot org>, "Kleen, Andi" <andi dot kleen at intel dot com>, Stefan Liebler <stli at linux dot vnet dot ibm dot com>, Tulio Magno Quites Machado Filho <tuliom at linux dot vnet dot ibm dot com>, "Senkevich, Andrew" <andrew dot senkevich at intel dot com>
- Date: Tue, 10 Jan 2017 12:33:40 +0100
- Subject: New rwlock committed -- enabling elision for it needs better default tuning first
I have committed the new rwlock. Thanks a lot to Carlos for reviewing
this.
A note to maintainers of architectures that come with HTMs: Lock elision
is currently not enabled for the new rwlock. It should be fairly easy
to do that, but I think we need to be more careful in how we select the
default tuning parameters; if we make any conscious trade-offs or
assumptions there, these have to be documented internally and for our
users, and we should have consensus in glibc for them. (One could argue
that arch maintainers have a stronger vote regarding the choice of
tuning parameters, but this does not mean that the reasons for the
choice do not need to be documented.)
In my opinion, the general goal for lock elision should be to make some
workloads faster while not significantly decreasing performance for any
other workload. Every deviation from that needs to be a consciously
made trade-off that is clearly documented.
When testing on x86_64 TSX, my first guess for a workload that may test
the quality of the adaptation code and the default tuning parameters
turned out to be a case where elision failed and produced significantly
worse performance than the base algorithm (in particular, worse
scalability). All this workload did was add transaction conflicts in
readers; this isn't an artificial special case, but can easily happen
through things like false sharing.
Quickly observing problems like this does not exactly make one confident
in the quality of the adaptation code and the default tuning parameters,
although it is just one sample so far. Therefore, I'd like us to be
more thorough in how we deal with lock elision and its performance
characteristics.
I'd like to start by gathering the reasons why the arch maintainers who
added, enabled, or ack'ed elision chose the tuning parameters that we
currently have for exclusive locks or had for the old rwlock.
* Which assumptions did you make about workloads?
* Which benchmarks did you run?
* Which properties of the current implementations of your HTM features
are critical (and about which of these do you make assumptions that
affect the choice of tuning parameters)?
Next, I'd like us to model these assumptions and trade-offs with
microbenchmarks, so that we can check these for regressions and changes
across glibc releases. I'll post the microbenchmark that I've been
using for the rwlock soon, and will CC you and request your input.