[PATCH v2] rwlock: Fix explicit hand-over.
Torvald Riegel
triegel@redhat.com
Thu Apr 6 10:51:00 GMT 2017
On Mon, 2017-03-27 at 15:10 -0400, Waiman Long wrote:
> On 03/27/2017 02:59 PM, Waiman Long wrote:
> > On 03/27/2017 02:16 PM, Waiman Long wrote:
> >> On 03/27/2017 01:53 PM, Torvald Riegel wrote:
> >>> On Mon, 2017-03-27 at 12:09 -0400, Waiman Long wrote:
> >>>> On 03/25/2017 07:01 PM, Torvald Riegel wrote:
> >>>>> On Sat, 2017-03-25 at 21:17 +0100, Florian Weimer wrote:
> >>>>>> * Torvald Riegel:
> >>>>>>
> >>>>>>> + bool registered_while_in_write_phase = false;
> >>>>>>> if (__glibc_likely ((r & PTHREAD_RWLOCK_WRPHASE) == 0))
> >>>>>>> return 0;
> >>>>>>> + else
> >>>>>>> + registered_while_in_write_phase = true;
> >>>>>> Sorry, this doesn't look quite right. Isn't
> >>>>>> registered_while_in_write_phase always true?
> >>>>> Attached is a v2 patch. It's the same logic, but bigger. Most of this
> >>>>> increase is due to reformatting, but I also adapted some of the
> >>>>> comments.
> >>>>> I get two failures, but I guess these are either due to the bad internet
> >>>>> connectivity I currently have, or something at the resolver.
> >>>>> FAIL: resolv/mtrace-tst-leaks
> >>>>> FAIL: resolv/tst-leaks
> >>>>>
> >>>>>
> >>>> I have verified that the v2 patch did fix the hang that I saw with my
> >>>> microbenchmark. I also observed an increase in performance in the new
> >>>> rwlock code compared with the old one before the major rewrite.
> >>> Thanks!
> >>>
> >>>> On a
> >>>> 4-socket 40-core 80-thread system, 80 parallel locking threads had an
> >>>> average per-thread throughput of 32,584 ops/s. The old rwlock code had a
> >>>> throughput of 13,411 only. So there is a more than 1.4X increase in
> >>>> performance.
> >>> Is that with the 50% reads / 50% writes workload (per thread), empty
> >>> critical sections, and no delay between critical sections?
> >>>
> >> Yes, I used the default configuration of 1:1 read/write ratio. The
> >> critical section isn't exactly empty, as I used one pause instruction
> >> both inside the critical section and between critical sections.
> >>
> >> Regards,
> >> Longman
> >>
> > Just found out that there is a regression in performance when in writer
> > preferring mode. The average per-thread throughput was 4,733 ops/s with
> > the old glibc, but 2,300 ops/s with the new implementation vs 32,584
> > ops/s for the reader-preferring mode. It was said in the code that
> > writer-preferring mode isn't the focus in the rewrite. So I am not
> > saying that it is bad, but it is something to keep in mind.
> >
> > Regards,
> > Longman
>
> Another issue that I saw was lock starvation. IOW, a continuous stream
> of readers will block writers from acquiring the lock in
> reader-preferring mode. In writer-preferring mode, a continuous stream
> of writers will block readers from acquiring the lock. You can see
> that by using the -x option in my rwlock benchmark which will force the
> separation of reader and writer threads instead of every thread doing
> both reader and writer locks.
Yes, this is expected. Whether that's a good thing or a bad thing
depends on what the user wants (e.g., if a rwlock is used to separate a
garbage collection process (writer) and normal operations (readers),
then maybe the user really wants to run GC just when there's nothing
else going on). The modes we have currently strictly prefer readers or
writers; we have no modes for mostly preferring readers/writers while
also preventing starvation (for which we'd need to choose some latency
number). OTOH, one could also argue that preferring readers/writers
should rather mean the latter, i.e., mostly preferring them while still
preventing starvation.
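
For reference, the two strict modes discussed above are chosen per lock
through the rwlock attribute. The following is only a minimal sketch:
init_rwlock_with_kind is a hypothetical helper name, not something from
the patch or the benchmark. It assumes glibc, where
PTHREAD_RWLOCK_PREFER_READER_NP is the default and writer preference
has to be requested with PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP.

```c
#define _GNU_SOURCE
#include <pthread.h>

/* Hypothetical helper (not part of glibc): initialize *lock with the
   given preference kind.  Returns 0 on success or an error number.  */
static int
init_rwlock_with_kind (pthread_rwlock_t *lock, int kind)
{
  pthread_rwlockattr_t attr;
  int err = pthread_rwlockattr_init (&attr);
  if (err != 0)
    return err;

  /* kind is PTHREAD_RWLOCK_PREFER_READER_NP (strictly prefer readers)
     or PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP (strictly prefer
     writers; readers must not acquire the lock recursively).  */
  err = pthread_rwlockattr_setkind_np (&attr, kind);
  if (err == 0)
    err = pthread_rwlock_init (lock, &attr);

  /* The attribute object is no longer needed once the lock exists.  */
  pthread_rwlockattr_destroy (&attr);
  return err;
}
```

Neither kind bounds the waiting time of the non-preferred side, so the
starvation described above applies to whichever mode is selected.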