This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: glibc 2.23 --- Starting soft/slush freeze
- From: Torvald Riegel <triegel at redhat dot com>
- To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- Cc: GNU C Library <libc-alpha at sourceware dot org>, "Carlos O'Donell" <codonell at redhat dot com>, David Miller <davem at davemloft dot net>
- Date: Tue, 12 Jan 2016 21:47:07 +0100
- Subject: Re: glibc 2.23 --- Starting soft/slush freeze
- Authentication-results: sourceware.org; auth=none
- References: <5694161C dot 1010200 at linaro dot org> <1452601558 dot 26597 dot 282 dot camel at localhost dot localdomain> <5694F34D dot 8070301 at linaro dot org> <1452610148 dot 26597 dot 309 dot camel at localhost dot localdomain> <56954792 dot 2030601 at linaro dot org>
On Tue, 2016-01-12 at 16:36 -0200, Adhemerval Zanella wrote:
>
> On 12-01-2016 12:49, Torvald Riegel wrote:
> > On Tue, 2016-01-12 at 10:36 -0200, Adhemerval Zanella wrote:
> >>
> >> On 12-01-2016 10:25, Torvald Riegel wrote:
> >> The only bit I am not sure is the sparc stated by you it might
> >> broke on newer implementation. Do you think we can get sparc parts done
> >> and tested by the end of the week?
> >
> > The only thing that makes sparc different is that pre-v9 sparc32 has no
> > real support for atomics. The atomic operations we have are supported,
> > but CAS or other read-modify-write operations are only atomic within a
> > process (the hardware only has a test-and-set, so things need to be
> > lock-based essentially).
> >
> > For all but pre-v9 sparc, I'd just use the new barrier implementation.
> > For pre-v9 sparc, I could try to hack up a version that uses an internal
> > lock for all the read-modify-write atomic ops used by the new barrier
> > implementation. But I can't test this.
> >
> > Dave, could you take care of adapting to pre-v9 sparc?
>
> Right, I also do not feel compelled on delay this feature due a very
> specific architecture lacking of support. I recall Dave saying he only
> build and test sparc near release date, so it will give him roughly a
> month to either help adjust the code to pre-v9 sparc or help you with
> testing. Any objections?
No objections from me. What would you prefer for the patch?:
(1) Remove all existing sparc code. This effectively breaks pre-v9
process-shared barriers in the sense of not working at runtime.
(2) Don't touch sparc at all. I believe this breaks builds on sparc.
(3) Try the approach outlined below but leave any testing to Dave.
> >
> > It also feels like we might need a better process to handle such pre-v9
> > sparc issues. It comes up whenever we change concurrent code that is
> > supposed to work in a process-shared setting (it will come up for
> > condvars again). Any suggestions?
> >
> > I think one relatively clean way to support pre-v9 sparc on
> > process-shared was if we could set/unset a lock that is to be used for
> > subsequently executed atomic operations. The set/unset would be a noop
> > for all but pre-v9 sparc. On pre-v9 sparc, we could put this address
> > into a thread-local variable, and have the atomic RMW ops check whether
> > such a lock is set or not; if it is, it will be used instead of one of
> > the 64 per-process locks used for non-process-shared atomics.
> > This way, we probably wouldn't need custom pre-v9 sparc code anymore.
> > Performance may be somewhat slower, but it will be slower anyway due to
> > having to use a lock in the first place instead of using a comparatively
> > light-weight native atomic operation.
> > Thoughts?
>
> This seems a reasonable approach, specially the part to get rid of custom
> pre-v9 sparc code.
OK.
> I would also aim first on conformance and code simplicity
> and thus later try on optimize this locking mechanism, if possible.
Agreed.
> >
> >>>
> >>> Also, what about the list of desirable features? I had assumed they are
> >>> candidates for late commits too but will, unlike blockers, not block the
> >>> release.
> >>>
> >>
> >> I see only the new barrier and condvar implementation as the desirable
> >> features on this release. From your previous messages I see that condvar
> >> is still showing some issues with your internal tests, so I take we might
> >> delay to 2.24 (correct me if I am wrong).
> >
> > I'm behind schedule on the condvar fix, but I could try to finish it
> > this week too. Would this make the new condvar acceptable for 2.23?
> > I don't want to rush this, but I worry about delaying to 2.24, still not
> > getting any reviews, and being in the same position when the 2.24
> > release comes around.
> > Are there volunteers who would be willing and able to review the new
> > condvar provided the fix gets posted upstream end of this week?
> >
>
> How confident are you about the condvar implementation? I recall you saying
> you tested on Fedore Rawhide, but it leads to some issues. I would prefer
> to not rush on 2.23 and work to get it done subsequent months after master
> is open again (so we have plenty time of fix until next releases).
I identified the issue a while ago, and I think I understand it well.
It was an oversight on my part: I didn't consider that spurious futex
wake-ups can happen, which can, under certain specific uses of a
condvar, lead to signals getting lost. In the original implementation,
I had already implemented a countermeasure for essentially the same
situation that can happen on cancellation, but the performance cost are
too high to make this practical for the common case.
I have a conceptual fix for it, which adds some complexity. I'll try to
implement it this week, and then assess again how confident I am about
it.