This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Principles for syscall wrappers, again


On Thu, 2015-05-28 at 16:51 +0000, Joseph Myers wrote:
> On Tue, 26 May 2015, Rich Felker wrote:
> 
> > > Thus, it seems to me we should explicitly discuss the GNU API we want to
> > > expose for each syscall's functionality that we want to offer, and not
> > > just expose syscalls interfaces as-is by default.
> > 
> > I think this is gratuitous NIH'ing and a disservice to applications.
> > Code which wants to use the futex API is already doing so via
> > syscall() with the existing API, and is most easily updated to use
> > futex(). Your proposed API is not any more general or easier to
> > provide on NaCl or elsewhere; the existing API can easily be provided
> > because it's equivalent.
> 
> Agreed.  As I said in 
> <https://sourceware.org/ml/libc-alpha/2015-05/msg00764.html>, I think the 
> API details to discuss are things such as the userspace types and the 
> header with the declaration, not any more substantial rearrangement of the 
> API - treating Linux as an API source like BSD and SysV, except for API 
> details outside the scope of what the kernel defines.

Joseph, the fact that you comment on futex specifically tells me you're
okay with discussing futex in detail in this thread (which I tried to
avoid so far).

I suppose both you and Rich are thinking about this as the futex() API
you want to expose:

int futex(int *uaddr, int op, int val, const struct timespec *timeout,
                 int *uaddr2, int val3);


syscall() is multiplexing.  And so is futex().  Thus, by your logic,
there's no need for offering futex() because syscall() is doing it
already, just with *one part* of the multiplexing being variable.

There is no API-design reason for the futex syscall to do multiplexing
itself.  Look at the manpages -- the documentation clearly speaks about
logically separate operations that are exposed through this single
syscall.  We do not need this multiplexing at all, or do we need to
minimize the number of functions for some significant reason?  Why not
just have a file() function then that has as first param an enum
signaling whether one actually wants to read or write from a file?
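
To make the analogy concrete, here is a rough sketch of what such a
multiplexed file() could look like.  Everything in it (the name file(),
the enum file_op and its values) is hypothetical and exists only for
illustration.  Note that the buffer has to be a non-const void * even
for writes, because the same parameter also serves reads:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical selector, made up for this illustration only. */
enum file_op { FILE_OP_READ, FILE_OP_WRITE };

/* Hypothetical multiplexed entry point: one function, two logically
   separate operations.  'buf' cannot be const-qualified because the
   read case writes through it, so write callers lose const safety. */
static ssize_t file(enum file_op op, int fd, void *buf, size_t count)
{
    switch (op) {
    case FILE_OP_READ:
        return read(fd, buf, count);
    case FILE_OP_WRITE:
        return write(fd, buf, count);
    }
    /* Unknown selector: fail the way a multiplexing syscall would. */
    errno = EINVAL;
    return -1;
}
```

Nobody would prefer this over separate read() and write() functions, and
the signature alone does not tell the caller which operations exist.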

Thus, the specific futex() signature is an artifact of having to
multiplex.  We don't have to multiplex, so this isn't a reason to keep
it.

Keeping the multiplexing is bad for users.  Can you tell me off-hand
what goes in "uaddr2", "val", or "val3" for all the ops?  Is it easy to
remember based on the function signature?
Can you remember in which cases "timeout" is actually "val2", that is,
not a pointer at all but an integer that the kernel reads as a uint32_t?
So are we going to expect users to cast uint32_t's to a pointer to call
one of the operations and consider that a useful API design?  It's a
nice way to potentially trigger compiler warnings though.
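
As a sketch of what call sites look like with the multiplexed signature
(Linux only, and assuming futex() is nothing more than a thin layer over
syscall(SYS_futex, ...); the helper names wake_one and requeue_one are
made up for this example):

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Assumed thin wrapper with the multiplexed signature quoted above. */
static int futex(int *uaddr, int op, int val,
                 const struct timespec *timeout, int *uaddr2, int val3)
{
    return (int)syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
}

/* FUTEX_WAKE: val is the maximum number of waiters to wake;
   timeout, uaddr2 and val3 are ignored and must still be passed. */
static int wake_one(int *word)
{
    return futex(word, FUTEX_WAKE, 1, NULL, NULL, 0);
}

/* FUTEX_CMP_REQUEUE: val is the maximum number of waiters to wake,
   the "timeout" slot carries val2 (maximum number to requeue) as an
   integer, and val3 is the expected value of *from. */
static int requeue_one(int *from, int *to, int expected)
{
    uint32_t max_requeue = 1;
    return futex(from, FUTEX_CMP_REQUEUE, 1,
                 (const struct timespec *)(uintptr_t)max_requeue,
                 to, expected);
}
```

Nothing in the types stops a caller from swapping val and val3, or from
passing a real timespec pointer where the kernel expects val2.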


In contrast, consider an API such as this (just for exposure, we should
discuss the details such as whether we want to split out the flags, or
split out waiting with timeouts):

int futex_wake(int* futex_word, int flags, int max_wakeups);
int futex_wait(int* futex_word, int flags, int expected_value,
  const struct timespec *timeout);
int futex_cmp_requeue(int* futex_word, int flags, int expected_value,
  int max_wakeups, int max_requeue, int* requeue_futex_word);
...

This makes it easy for users to use the right parameters at the right
positions at call sites.  Things like code completion will just show the
function signatures, or you look it up, and you're fine.  Parameters can
be used in a type-safe way too, so no odd complaints by your compiler.
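
For illustration, these three functions could be layered over the raw
syscall roughly as follows.  This is only a sketch for Linux under a few
assumptions that are not settled above: that "flags" carries op
modifiers such as FUTEX_PRIVATE_FLAG which are OR'ed into the op, and
that the functions follow the syscall() convention of returning -1 and
setting errno on failure.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Wake at most max_wakeups waiters blocked on futex_word. */
int futex_wake(int *futex_word, int flags, int max_wakeups)
{
    return (int)syscall(SYS_futex, futex_word, FUTEX_WAKE | flags,
                        max_wakeups, NULL, NULL, 0);
}

/* Block while *futex_word == expected_value; timeout is relative and
   may be NULL to wait indefinitely. */
int futex_wait(int *futex_word, int flags, int expected_value,
               const struct timespec *timeout)
{
    return (int)syscall(SYS_futex, futex_word, FUTEX_WAIT | flags,
                        expected_value, timeout, NULL, 0);
}

/* If *futex_word == expected_value, wake up to max_wakeups waiters and
   move up to max_requeue of the rest to requeue_futex_word.  The cast
   that smuggles max_requeue through the timeout slot happens exactly
   once, here, instead of at every call site. */
int futex_cmp_requeue(int *futex_word, int flags, int expected_value,
                      int max_wakeups, int max_requeue,
                      int *requeue_futex_word)
{
    return (int)syscall(SYS_futex, futex_word, FUTEX_CMP_REQUEUE | flags,
                        max_wakeups,
                        (void *)(uintptr_t)(uint32_t)max_requeue,
                        requeue_futex_word, expected_value);
}
```

A call site then reads as futex_wait(&word, FUTEX_PRIVATE_FLAG, 0, NULL),
with no unused arguments and no casts.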

And this is *not* an API redesign or anything like that.  If you think
this is NIH, go look at the draft of the futex man-pages:
http://git.kernel.org/cgit/docs/man-pages/man-pages.git/tree/?h=draft_futex
We are just exposing the logical API documented there.

Terms such as "futex word" come from the kernel's documentation.  We do
pick a position for the flags parameter, for example, but this isn't
covered by the kernel interface because it wants to multiplex.  But the
rest of the parameters are all just taken from the kernel interface
unmodified.  What's the point of working on clear futex documentation if
we then unnecessarily obfuscate it through exposing a multiplexing
function?

Thus, not exposing a clean form of the logical API documented by the
kernel and instead exposing a multiplexed form for no good reason would
be the disservice to users.

Rich, whether a transition from syscall() to futex() would be easier is
irrelevant because doing that doesn't buy you anything significant in
terms of ease of use, type-safe code, or similar.  We need to think
about what is best for users that want to use futexes, period -- not
what is best for users that already do so using syscall().

If we expose futex() and thus think that this is the interface people
should be using, I suppose you all are fine with using futex() as the
glibc-internal interface too, right?  If not, then surely it is not the
right interface for external users either.  We should not expose
interfaces that force users to write their own wrappers to make sense of
them, especially not if we ourselves are already aware of the need for
such wrappers.


Thus, I do think exposing futex() with the multiplexing in place would
be a *poor* API to expose to users, and a clear mistake.  I do not know
whether other syscall interfaces show similar issues, but the futex case
shows to me that clearly we need to consider and discuss what APIs we
actually expose.

