This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC] pthread support for FUTEX_WAIT_MULTIPLE
[replying to both inline below, let me know if just branching the thread
is generally preferable in the future]
On 7/31/19 1:14 AM, Florian Weimer wrote:
* Pierre-Loup A. Griffais:
The gist of it is that it adds a new function,
pthread_mutex_timedlock_any(), that can take a list of mutexes. It
returns when one of the mutexes has been locked (and tells you which
one), or if the timeout is met. We would use this to reduce
unnecessary wakeups and spinning in our thread pool
synchronization code, compared to the current eventfd hack we rely
on.
This explains why the mutexes have to be contiguous in memory. For
other applications, this looks like an unnecessary restriction.
Agreed. Will switch the first argument to an array of pointers instead.
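To make the array-of-pointers shape concrete, here is a minimal sketch of a lock-any call. The name lock_any_polling and its signature are hypothetical and are not taken from the patch. It also polls with pthread_mutex_trylock instead of sleeping in the kernel via FUTEX_WAIT_MULTIPLE, so it only illustrates the calling convention and return values, not the proposed implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper, for illustration only (not the patch's API).
   Tries to lock any one of `count` mutexes passed as an array of pointers,
   so the mutexes need not be contiguous in memory.
   Returns 0 and stores the index of the locked mutex in *out_index,
   ETIMEDOUT once the absolute CLOCK_REALTIME deadline has passed
   (a NULL deadline means wait forever), or EINVAL for bad arguments. */
static int lock_any_polling(pthread_mutex_t **mutexes, size_t count,
                            const struct timespec *deadline,
                            size_t *out_index)
{
    if (mutexes == NULL || count == 0 || out_index == NULL)
        return EINVAL;

    for (;;) {
        /* One scan over the list: take the first mutex that is free. */
        for (size_t i = 0; i < count; i++) {
            int rc = pthread_mutex_trylock(mutexes[i]);
            if (rc == 0) {
                *out_index = i;
                return 0;
            }
            if (rc != EBUSY)
                return rc;
        }

        if (deadline != NULL) {
            struct timespec now;
            clock_gettime(CLOCK_REALTIME, &now);
            if (now.tv_sec > deadline->tv_sec ||
                (now.tv_sec == deadline->tv_sec &&
                 now.tv_nsec >= deadline->tv_nsec))
                return ETIMEDOUT;
        }

        /* Back off for 1 ms before rescanning; a futex-based version
           would block in the kernel here instead of polling. */
        struct timespec pause = { 0, 1000000 };
        nanosleep(&pause, NULL);
    }
}
```

A caller would build a pthread_mutex_t *list[] from mutexes living anywhere in memory and use the returned index to find out which one it now owns.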
- I assume whichever name it might end up as should end with _np?
Is there any specific process to follow for this sort of
non-standard inclusion, other than just writing the code and
documentation?
It seems unlikely that this is ever going to be standardized, so I
think we'd need the _np suffix, yes.
Thanks for clarifying.
- What is the standard way for an application to discover whether
it can use an entrypoint dependent on a certain Linux kernel
version? With our proposed use, we'd be fine running the function
once at startup to pick which path to choose, e.g.
pthread_mutex_lock_any( NULL, 0, NULL, NULL ). If it returns 0,
we'd enable the new path, otherwise we'd fall back to eventfd(). I
have a TODO in the code where we could do that, but it's probably
not the right way to do things.
I think you would have to probe on first use inside
pthread_mutex_lock_any, using a dummy call.
OK, that was my original plan, I can finish writing that up.
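As a rough sketch of what probing on first use could look like: the result of one dummy call is cached so later calls skip the probe. Everything named here is an assumption for illustration. dummy_wait_multiple_call is a stub standing in for the real dummy futex call and simply reports ENOSYS, as an older kernel might, and pthread_once is used only for brevity; code inside glibc would likely cache the result differently.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Counts how often the probe actually ran (for demonstration only). */
static int probe_calls;

/* Hypothetical stand-in for the dummy call into the kernel.
   This stub models a kernel without the new futex operation. */
static int dummy_wait_multiple_call(void)
{
    probe_calls++;
    return ENOSYS;
}

static pthread_once_t probe_once = PTHREAD_ONCE_INIT;
static bool have_wait_multiple;

/* Runs exactly once, on the first call to wait_multiple_supported(). */
static void run_probe(void)
{
    have_wait_multiple = (dummy_wait_multiple_call() != ENOSYS);
}

/* Returns true if the new path can be used; probes on first use and
   returns the cached answer afterwards. */
static bool wait_multiple_supported(void)
{
    pthread_once(&probe_once, run_probe);
    return have_wait_multiple;
}
```

With this shape the application does not need its own startup check: the first call pays for the probe and picks the fallback path if the kernel lacks support.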
- I assume the way I'm exposing it as a 2.2.5-versioned symbol for
local testing is wrong; what is the right way to do this?
This patch could be targeted at glibc 2.31, then you would have to
use GLIBC_2.31.
- In my tree I have a placeholder test application that should be
replaced by a new mutex test. However, it would also be a good idea
to leverage other mutex tests to test this new thing, since it's a
superset of many other mutex calls. Could we make the normal mutex
test suite run a second time, with a macro wrapping the normal
pthread_lock with this implementation instead?
Due to the new ENOMEM error, the new function is not a strict
superset.
True, thanks for pointing that out. Any sort of attempted wrapping would
have to account for that somehow.
I'm wondering if the current design is really the right one,
particularly for thread pools. The implementation performs multiple
scans of the mutex lists, which look rather costly for large pools.
That's probably unavoidable if the list is dynamic and potentially
different for every call, but given the array-based interface, I
don't think this is the intended use. Something that uses
pre-registration of the participating futexes could avoid that. I
also find it difficult to believe that this approach beats something
that involves queues, where a worker thread that becomes available
identifies itself directly to a submitter thread, or can directly
consume the submitted work item.
That might be interesting if walking the lists turns out to be a
hotspot. I think the mutex count per operation would typically not be
thousands, and probably not hundreds either.
I would think there's still a queue somewhere to acquire jobs, this
would be used before and after. For instance, job threads want to sleep
until work has been queued, or another system event occurs that might
require them to wake up, like app shutdown or scene transition.
Similarly, after firing off N jobs, the job manager will want to sleep
until one of the jobs is complete to perform some accounting and publish
the results to other systems. For both of these use cases, using eventfd
to wait for multiple events seems to result in more CPU spinning than
the futex-based solution, both in userspace and the kernel.
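For readers unfamiliar with the eventfd approach, here is a minimal sketch of waiting on several events with eventfd and poll. This is an assumed shape, not our actual code: the helper name wait_any_eventfd and the 16-descriptor cap are made up for the example. Note that each wakeup costs a poll() plus a read() to consume the counter.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_WAIT_FDS 16 /* arbitrary cap for this sketch */

/* Hypothetical helper: waits until any of `count` eventfds is signaled
   or `timeout_ms` elapses. Returns the index of the first signaled
   descriptor after consuming its counter, or -1 on timeout or error. */
static int wait_any_eventfd(const int *fds, int count, int timeout_ms)
{
    struct pollfd pfds[MAX_WAIT_FDS];

    if (fds == NULL || count <= 0 || count > MAX_WAIT_FDS)
        return -1;

    for (int i = 0; i < count; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    if (poll(pfds, (nfds_t)count, timeout_ms) <= 0)
        return -1;

    for (int i = 0; i < count; i++) {
        if (pfds[i].revents & POLLIN) {
            uint64_t value;
            /* Reading resets the eventfd counter so it can fire again. */
            if (read(fds[i], &value, sizeof value) != (ssize_t)sizeof value)
                return -1;
            return i;
        }
    }
    return -1;
}
```

A producer signals an event by writing a nonzero 8-byte value to the matching descriptor.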
Thanks, Florian
On 7/31/19 3:01 AM, Szabolcs Nagy wrote:
On 31/07/2019 01:07, Pierre-Loup A. Griffais wrote:
I started putting together a patch to expose the new Linux futex
functionality that recently got proposed for upstream inclusion.
[1]
...
[1] https://lkml.org/lkml/2019/7/30/1399
i don't see that patch on the linux-api list where userspace api
related patches are discussed.
syscalls that have time argument need extra care now that 32bit
targets will get a new 64bit time_t abi.
Thanks, looks like there were compat concerns raised on the kernel side
as well; we can copy linux-api for the next patch iteration.
the futex syscall is multiplexed and intricately related to the
pthread implementation so there are many reasons why such a patch
should not be accepted into linux before agreement with userspace.
What does that process typically look like, other than raising it on
both ends like we did?
Thanks for all the feedback!
- Pierre-Loup