This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH 1/2] Add futex wrappers with error checking
- From: Roland McGrath <roland at hack dot frob dot com>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: GLIBC Devel <libc-alpha at sourceware dot org>
- Date: Thu, 11 Dec 2014 17:18:40 -0800 (PST)
- Subject: Re: [PATCH 1/2] Add futex wrappers with error checking
- Authentication-results: sourceware.org; auth=none
- References: <1417726487 dot 22797 dot 48 dot camel at triegel dot csb> <20141205003310 dot 3A8B52C3A73 at topped-with-meat dot com> <1417804656 dot 22797 dot 107 dot camel at triegel dot csb>
> The second is that I haven't looked through all the lowlevellock cases
> yet, so didn't want to touch that; it seemed moving lll_futex* callers
> over to futex* callers wouldn't be an issue later on.
Fair enough. I'm happy as long as the end state has no duplication of
code/logic (one internal futex interface to rule them all) and you commit
to driving all the corners of the cleanup so we get to that end state in
> Third, some of the lll_futex* definitions are in headers that are also
> used from asm files; I guess that would mean I'd need to use macros
> instead of C functions.
#ifdef __ASSEMBLY__ if need be. But we can also just clean things up so
that's no longer the case. There's no reason I can see why assembly code
should want lowlevellock-futex.h.
> Fourth, I need some way to get to the arch-specific futex syscalls. I
> didn't know whether sysdeps/unix/sysv/linux/lowlevellock-futex.h would
> work on every arch, so I just used what works for the locks.
What's an arch-specific futex syscall? AFAIK
sysdeps/unix/sysv/linux/lowlevellock-futex.h should be fine for all
machines. Indeed sysdeps/unix/sysv/linux/lowlevellock.h should be fine for
all machines too, and the only reason we still have any machine-specific
files is conservatism about making sure that removing each one doesn't
degrade any performance or semantics (so someone just needs to look at the
generated code and compare for each machine).
> > I don't
> > think we want to have both layers as such in the long run,
> Maybe not. If we want to expose our own futex abstraction to users,
> we'd need a separate version that does less of the error checking we do,
> as there may be cases where certain errors would need to be handled
> differently. You point out something similar below; checking that the
> kernel (or whatever below provides the futex functionality) didn't
> return errors we haven't specified in our futex abstraction.
I think the best approach for now is not to think about any new user API.
Just do all the cleanup of our internal futex use thoroughly so we think
it's very good and very maintainable. When/if we come up with a new user
API later, we can refactor as needed to implement it.
> I didn't think about clean-up as much. What I wanted is something we
> can use today to get the futex error handling correct in pthread_once
> and the the semaphores I'm about to submit, for example.
I'm going to insist on cleanup so we aren't growing redundant internal
> Fine with me. From my perspective, it seemed best to start with an
> abstraction with well-defined semantics (that's what I tried to do in
> futex-internal.h at least), so that all clients of it are taken care of.
> Any clean-up under the hood of it could be then done independently.
> I think I have a pretty good understanding for what the futex semantics
> of the abstraction that we use internally should be. I don't have a
> good feel for how to best clean up all the existing code we have related
> to that.
If you start with a strawman proposal for the complete new internal API,
then we can all work together to figure out how to clean up existing code.
> I'm not sure we actually need something for pthread_barrier_wait. All
> uses of the private field are xor'd with FUTEX_PRIVATE_FLAG. It seemed
> to me that just having a properly set up value for the private field in
> the first place would suffice.
That might well be what makes most sense with all this cleaned up. I can
only barely guess what motivated the arrangement with inverting the bit.
> Interesting. I haven't thought about the case where shared is not
I think that's the main wrinkle that introduced all this complexity.
But it's all so fuzzy to me that this is only a guess.
> I suppose "dynamic" would still mean that this is stable throughout the
> lifetime of the process?
Correct. It's just runtime detection of what the kernel you're running on
supports, like we do for various syscalls and flags. For private futex
support, the existing code does a query at startup (nptl-init.c).
> > I'm sure this can be done in a way that does not change the compiled
> > code at all for Linux.
> Probably, but personally I wouldn't worry about that. When we do the
> transition to the new internal futex API, we'd change code anyway
> because of adding more error checking and such.
I really just meant that dynamically using the private flag or not should
be no more costly than the current code's XOR tricks.
> I also though about int vs. unsigned int for a while. The Linux kernel
> has int, but when I look at the synchronization code I'm using futexes
> for, in most of the cases it's an unsigned you want to work with.
> Therefore, I picked unsigned.
Seems sensible. We just may need some cleanups once everything is going
through type-checking inlines.
> Nothing needs to be volatile there, IMO. If anything, this should be an
> atomic type.
OK. The NaCl futex facility has a prototype using 'volatile int *', but
that might not have been a sensible choice.
> > > * nptl/futex-internal.h: New file.
> > Just as a procedural matter, I'm inclined to say that a new file like
> > this should come in the same commit as the first use of it.
> Okay. Do you want me to merge the two patches?
That seems right.
> > > +#include <lowlevellock.h>
> > Include only what you need: lowlevellock-futex.h here. That changes
> > which code you're getting today, because all the machine-specific
> > lowlevellock.h files still need to be removed. But we should be
> > finishing that cleanup this cycle anyway (though everyone seems to
> > have forgotten).
> I tried that now, but that doesn't work because it redefines lll_futex*,
> and it's hard to avoid including lowlevellock.h through some other
> header. Therefore, I left this unchanged for now.
OK. Perhaps you'd like to take on eliminating at least the x86 versions of
lowlevellock.h? (I think we'll really need to eliminate all of them before
all futex-related cleanup is done.)
> > So, now I'm seeing a potential reason to have this layer exist
> > distinct from the OS-encapsulation layer. Perhaps we should have the
> > checks for expected errno values be in an OS-independent layer rather
> > than just saying in the specification of the OS-encapsulation layer
> > that it must yield only the particular set.
> I'm not sure I can quite follow you. I could see why the
> OS-encapsulation layer would want to check that the set of return values
> is only those we support in higher layers, but that's not what you're
> after, or is it?
If the generic (higher) layers require a certain protocol about errno
values and we want code to enforce/sanity-check the underlying OS calls for
that (which seems to be the consensus for Linux), then it is duplicative
and error-prone for each OS-specific layer to repeat that checking logic.
> Updated patch is attached. Is this one okay, or do you want to see
> further changes to it and/or more of the full problem being addressed?
I guess I'd like to be closer to a full plan for cleaning it all up--that
is, at least a more full sense of what the complete end state will look
like, if not all the details about how to reach it--before we start