This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH roland/vdso_clock_gettime] x86: Clean up __vdso_clock_gettime variable.
- From: Torvald Riegel <triegel at redhat dot com>
- To: Roland McGrath <roland at hack dot frob dot com>
- Cc: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>, libc-alpha at sourceware dot org
- Date: Fri, 30 Jan 2015 10:57:36 +0100
- References: <20150129005747 dot 3DAE82C3A92 at topped-with-meat dot com> <54CA1705 dot 80903 at linux dot vnet dot ibm dot com> <20150130002625 dot 93E832C3A08 at topped-with-meat dot com>
On Thu, 2015-01-29 at 16:26 -0800, Roland McGrath wrote:
> > Since you are modifying the code, what are the advantages and constraints of
> > x86_64/i386 carrying specific assembly implementations of the pthread_cond_*
> > functions? What prevents them from using the default C code?
>
> We've discussed before that we'd like to eliminate the assembly. We have
> just been too conservative to do it without careful performance testing to
> give us confidence that it won't be a performance regression, and nobody
> has done that yet. I think it might be on Torvald's agenda. (If not, we
> need a volunteer for that!)

IMO, the condvar assembly has to go because we need to change the
synchronization algorithm to correctly implement the stronger ordering
requirements that POSIX and C++ have clarified they want. I don't think
anybody will volunteer to write an assembly implementation with the new
algorithm.

Wrt performance, I would guess that not having assembly is not critical.
Unlike, say, a semaphore, there's no real fast-path in a condvar; you
really want to wait until you get a notification after you started
waiting. So once you do that, there will be cache misses to some extent
because somebody else is waking you up; thus, the communication cost
will matter most, not any per-thread instruction count differences.
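
To illustrate the fast-path point, here is a minimal sketch in C using plain
POSIX semaphores and condition variables. All function and variable names
(sem_fast_path_demo, condvar_wait_demo, notifier, ready) are hypothetical and
are not taken from glibc; this only shows the usage pattern, not how either
primitive is implemented.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool ready = false;      /* predicate, protected by lock */

/* Semaphore: when a token is already available, acquiring it does not
   require any other thread to act.  That is the fast path.  */
int sem_fast_path_demo(void)
{
  sem_t sem;
  int rc;

  if (sem_init(&sem, 0, 1) != 0)  /* start with one token */
    return -1;
  rc = sem_trywait(&sem);         /* 0 on success: token taken, no blocking */
  sem_destroy(&sem);
  return rc;
}

/* The notifying thread sets the predicate and signals the condvar.  */
static void *notifier(void *arg)
{
  (void) arg;
  pthread_mutex_lock(&lock);
  ready = true;
  pthread_cond_signal(&cond);
  pthread_mutex_unlock(&lock);
  return NULL;
}

/* Condvar: a thread that actually calls pthread_cond_wait only returns
   after some other thread notifies it (or spuriously), so the wake-up
   involves cross-thread communication.  If the notifier ran first, the
   predicate check skips the wait entirely.  */
int condvar_wait_demo(void)
{
  pthread_t t;

  ready = false;
  if (pthread_create(&t, NULL, notifier, NULL) != 0)
    return -1;

  pthread_mutex_lock(&lock);
  while (!ready)                  /* loop guards against spurious wake-ups */
    pthread_cond_wait(&cond, &lock);
  pthread_mutex_unlock(&lock);

  pthread_join(t, NULL);
  return ready ? 0 : -1;
}
```

In the semaphore case the caller can finish without involving another thread;
in the condvar case a waiter that has started waiting depends on the
notifier's store and signal, which is where the communication cost comes from.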

I have the new condvar implementation working, except for failures on
tst-cond25 and except for the PI bits (but there's not a lot we can do
there anyway compared to the current implementation). I first thought
the tst-cond25 failure was a PI-related issue, but now I'd guess it's
rather cancellation-related. I don't yet know the root cause, but I see
the same test fail on i686 with the old implementation too.