This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH 1/2] Add futex wrappers with error checking
- From: Roland McGrath <roland at hack dot frob dot com>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: GLIBC Devel <libc-alpha at sourceware dot org>
- Date: Wed, 17 Dec 2014 15:34:41 -0800 (PST)
- Subject: Re: [PATCH 1/2] Add futex wrappers with error checking
- Authentication-results: sourceware.org; auth=none
- References: <1417726487 dot 22797 dot 48 dot camel at triegel dot csb> <20141205003310 dot 3A8B52C3A73 at topped-with-meat dot com> <1417804656 dot 22797 dot 107 dot camel at triegel dot csb> <20141212011840 dot 2B99A2C3ADB at topped-with-meat dot com> <1418774081 dot 7165 dot 71 dot camel at triegel dot csb> <20141217010728 dot 13CEC2C2446 at topped-with-meat dot com> <1418835610 dot 25358 dot 24 dot camel at triegel dot csb>
> It is. I do want to have the low-level locks check the return values of
> their futex calls. To me, that doesn't conflict with having another
> interface that is just what the underlying kernels/... provide. We can
> also get rid of this other interface and do error checking in each of
> the per kernel/... implementations. Does that clarify what I meant?
Yes, that makes sense. On further reflection, I think it does make sense
to have the error-checking layer be outside the OS-encapsulation layer so
that we have foolproof consistency of the expected error protocol of the OS
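A minimal sketch of that layering, assuming hypothetical names (os_futex_wait, checked_futex_wait) and an assumed set of acceptable results; none of this is taken from the patch. The OS layer is a stub so the example stays self-contained.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical OS-encapsulation layer.  Returns 0 or a negated errno
   value.  A real port would issue the kernel's futex operation here;
   this stub only compares the value and never blocks.  */
static int
os_futex_wait (const int *addr, int expected)
{
  if (*addr != expected)
    return -EAGAIN;
  return 0;                     /* Pretend we slept and were woken.  */
}

/* Hypothetical OS-independent error-checking layer, wrapped around the
   OS layer.  The set of results passed through is an assumption made
   for illustration; anything else is treated as a protocol violation.  */
static int
checked_futex_wait (const int *addr, int expected)
{
  int err = os_futex_wait (addr, expected);
  switch (err)
    {
    case 0:
    case -EAGAIN:
    case -EINTR:
    case -ETIMEDOUT:
      return err;
    default:
      abort ();                 /* Unexpected error from the OS layer.  */
    }
}
```

Because the check lives in one OS-independent wrapper, every port is held to the same error protocol.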
> The message Carlos sent out states otherwise. I believed we do want to
> have fixed release dates, and I assumed that the release date would be
> Dec 30. The Jan 9 date Carlos mentioned gives a bit more time, but not
Indeed. I don't particularly advocate the fixed date approach, but I
haven't objected to it, and won't.
> What would this mean for the new semaphore I posted? Are you okay with
> it using the current lll_futex_* operations and custom error checking?
If that is fixing some real, observable bug independent of the futex error
handling question, then I think it's fine as an intermediate step to do
now. If it is nothing but internal reorganization, or the only kind of bug
it's fixing is the lack of futex error handling, which is a more pervasive
problem (and there's nothing really notable about the semaphore instance of
that problem), or it is a purely speculative theoretical bug, then I don't
think it's worth perturbing things before the release.
> I would also hope that we can remove all of them. But actually doing
> that may be a bigger project. The futex-using assembly files I see are
> on x86, x86_64, and sh. It's the low-level lock stuff (including
> robust), pthread barrier, pthread condvar, pthread rwlock, pthread
> semaphore, cancellation.
The 6-argument syscall issue on i386 is the only blocker we know of, right?
But indeed we wanted to be conservative about perturbing performance issues
for i386/x86_64, and just establishing confidence about that will take some
time. I had hoped someone like you would have done that during this cycle,
but since actual focus on it is only beginning about now, it might well
need to wait at least until right after the release.
> Cancellation may be something that ends up doing futex calls from
Can you point to particular code you're thinking of?
> One thing we need to agree on is whether we need this interface in the
> end at all, or whether we make the interface from 2) (see below) the one
> that each OS implements. What do you think?
On the one hand I want to avoid extra layers when possible. On the other
hand, I am adamant about having any sanity-check enforcement of error
protocols be done in code that is OS-independent so that we do not risk
skew between OS-specific implementations of the sanity checks.
Perhaps the right balance is to have some common code that does the error
protocol assertions for each piece of the internal interface, but as
utility code that the OS-specific layer calls rather than as an extra layer
around it. Either of those two approaches should be fairly easy to
refactor into the other later.
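A rough sketch of the utility-code variant, where the OS-specific function calls a shared checker itself instead of being wrapped by an extra layer. The names (futex_wait_error_is_expected, example_os_futex_wait) and the accepted error set are hypothetical, and the OS call is replaced by a stand-in value comparison.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical shared, OS-independent utility: says whether a negated
   errno result is within the (assumed) error protocol of a futex wait.  */
static bool
futex_wait_error_is_expected (int err)
{
  return err == 0 || err == -EAGAIN || err == -EINTR || err == -ETIMEDOUT;
}

/* Hypothetical OS-specific implementation.  It calls the shared
   utility itself, so no extra layer sits between it and its callers.  */
static int
example_os_futex_wait (const int *addr, int expected)
{
  /* Stand-in for the real OS call: report a value mismatch only.  */
  int err = (*addr != expected) ? -EAGAIN : 0;
  if (!futex_wait_error_is_expected (err))
    abort ();
  return err;
}
```

Either way the sanity check is defined once in common code, which is why moving between the two shapes later should be mostly mechanical.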
> Keeping 1) would make it easier for us to expose it to users in the
> future (I do remember you not wanting to care about this now, though).
> Keeping 1) isn't helping for internal use if the errors that it can
> return are incompatible between the different OS implementations.
Right. As I said before, I want to ignore speculation about a new user API
if considering it would complicate the ideal internal cleanup we can do in
its absence. Off hand,
> What is NaCl's error specification for the futex ops? Given that much
> of this refactoring is to make non-Linux futex possible, it would be
> helpful if you can summarize what you need and want.
It's not formally specified. Currently it's quite simple: only basic wake
and wait/timed-wait are supported, and the only errors possible are EFAULT
and EAGAIN (just for value mismatch). It's likely that it will grow over
time to cover a larger subset of the Linux features and their nontrivial
failure modes. It will always be the intent that it align pretty well with
the Linux semantics.
> > > 5) Remove custom low-level lock implementations after reviewing the
> > > performance implications of such removals.
> > This need not wait for any other step. Aside from the i386 issue, this can
> > be done today and IMHO the sooner it's done the better.
> That's true, but in contrast to the other steps, this may require more
> time to assess the impact on performance.