This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH v2] Single threaded stdio optimization
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: "triegel at redhat dot com" <triegel at redhat dot com>, Szabolcs Nagy <Szabolcs dot Nagy at arm dot com>
- Cc: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, nd <nd at arm dot com>
- Date: Fri, 30 Jun 2017 15:34:05 +0000
- Subject: Re: [PATCH v2] Single threaded stdio optimization
> What's interesting here is that your high-level optimization is faster
> than doing the single-thread check in the low-level lock (x86 has it
> already in the low-level lock).
Have you ever looked at the generated code for, e.g., getc?
Each lock still does a lot of work even with the low-level lock bypass
optimization: several branches, reads and writes, and all of that is
done twice, once for the lock and once for the unlock. A single branch
that bypasses all of it is obviously going to be much faster...
Interestingly, when you remove the low-level lock optimization,
multithreaded code runs faster too, as it no longer needs to do the
extra checks for the single-threaded case.
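
To illustrate the single-branch bypass argued for above, here is a
minimal sketch. It is not glibc's actual code: struct toy_stream, its
need_lock flag, and the toy_getc functions are hypothetical names made
up for this example, and a plain pthread mutex stands in for the stdio
lock. The point is only that one flag test at the top of the function
skips both the lock and the unlock path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy stream; this is NOT glibc's FILE layout. */
struct toy_stream {
    const char *buf;       /* data to read from */
    size_t pos;            /* next read position */
    size_t len;            /* number of valid bytes in buf */
    bool need_lock;        /* assumed to be set once a second thread may use the stream */
    pthread_mutex_t lock;  /* stand-in for the stdio lock */
};

/* Unlocked fast path: return the next byte, or -1 at end of data. */
static int toy_getc_unlocked(struct toy_stream *s)
{
    return s->pos < s->len ? (unsigned char)s->buf[s->pos++] : -1;
}

/* getc-like wrapper: a single branch bypasses lock and unlock entirely. */
int toy_getc(struct toy_stream *s)
{
    assert(s != NULL);
    if (!s->need_lock)
        return toy_getc_unlocked(s);

    /* Slow path: taken only when the stream may be shared. */
    pthread_mutex_lock(&s->lock);
    int c = toy_getc_unlocked(s);
    pthread_mutex_unlock(&s->lock);
    return c;
}
```

In this sketch the flag check happens once per call, whereas a bypass
inside the lock primitive would be evaluated once in the lock and again
in the unlock. How and when a real implementation would set such a flag
is outside what this example shows.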