This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC] pthread support for FUTEX_WAIT_MULTIPLE


[replying to both inline below, let me know if just branching the thread
is generally preferable in the future]

On 7/31/19 1:14 AM, Florian Weimer wrote:
> * Pierre-Loup A. Griffais:

The gist of it is that it adds a new function, pthread_mutex_timedlock_any(), that can take a list of mutexes. It returns when one of the mutexes has been locked (and tells you which one), or when the timeout expires. We would use this to reduce unnecessary wakeups and spinning in our thread pool synchronization code, compared to the current eventfd hack we rely on.

This explains why the mutexes have to be contiguous in memory. For other applications, this looks like an unnecessary restriction.

Agreed. Will switch the first argument to an array of pointers instead.


- I assume whichever name it might end up as should end with _np?
Is there any specific process to follow for this sort of
non-standard inclusion, other than just writing the code and
documentation?

It seems unlikely that this is ever going to be standardized, so I
think we'd need the _np suffix, yes.

Thanks for clarifying.


- What is the standard way for an application to discover whether
it can use an entrypoint dependent on a certain Linux kernel
version? With our proposed use, we'd be fine running the function
once at startup to pick which path to take, e.g.
pthread_mutex_lock_any( NULL, 0, NULL, NULL ). If it returns 0,
we'd enable the new path; otherwise we'd fall back to eventfd(). I
have a TODO in the code where we could do that, but it's probably
not the right way to do things.

I think you would have to probe on first use inside pthread_mutex_lock_any, using a dummy call.

OK, that was my original plan, I can finish writing that up.


- I assume the way I'm exposing it as a 2.2.5-versioned symbol for local testing is wrong; what is the right way to do this?

This patch could be targeted at glibc 2.31, then you would have to
use GLIBC_2.31.
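[Concretely, adding a GLIBC_2.31-versioned symbol would typically mean an entry in the relevant Versions file; the file location and symbol name below are illustrative, depending on where the function lands and what it ends up being called.]

```
# nptl/Versions (illustrative)
libpthread {
  GLIBC_2.31 {
    pthread_mutex_timedlock_any_np;
  }
}
```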

- In my tree I have a placeholder test application that should be replaced by a new mutex test. However, it would also be a good idea to leverage other mutex tests to test this new thing, since it's a superset of many other mutex calls. Could we make the normal mutex test suite run a second time, with a macro wrapping the normal pthread_lock with this implementation instead?

Due to the new ENOMEM error, the new function is not a strict
superset.

True, thanks for pointing that out. Any sort of attempted wrapping would have to account for that somehow.


I'm wondering if the current design is really the right one,
particularly for thread pools. The implementation performs multiple
scans of the mutex lists, which look rather costly for large pools.
That's probably unavoidable if the list is dynamic and potentially
different for every call, but given the array-based interface, I
don't think this is the intended use.  Something that uses
pre-registration of the participating futexes could avoid that.  I
also find it difficult to believe that this approach beats something
that involves queues, where a worker thread that becomes available
identifies itself directly to a submitter thread, or can directly
consume the submitted work item.

That might be interesting if walking the lists turns out to be a
hotspot. I think the mutex count per operation would typically not be thousands, and probably not hundreds either.

I would think there's still a queue somewhere to acquire jobs; this would be used before and after. For instance, job threads want to sleep until work has been queued, or until another system event occurs that might require them to wake up, like app shutdown or a scene transition. Similarly, after firing off N jobs, the job manager will want to sleep until one of the jobs is complete, to perform some accounting and publish the results to other systems. For both of these use cases, using eventfd to wait for multiple events seems to result in more CPU spinning than the futex-based solution, both in userspace and in the kernel.


Thanks, Florian


On 7/31/19 3:01 AM, Szabolcs Nagy wrote:
On 31/07/2019 01:07, Pierre-Loup A. Griffais wrote:
I started putting together a patch to expose the new Linux futex
functionality that recently got proposed for upstream inclusion.
[1]
...

[1] https://lkml.org/lkml/2019/7/30/1399

i don't see that patch on the linux-api list where userspace api
related patches are discussed.

syscalls that have time argument need extra care now that 32bit
targets will get a new 64bit time_t abi.

Thanks, looks like there were compat concerns raised on the kernel side as well; we can copy linux-api for the next patch iteration.


the futex syscall is multiplexed and intricately related to the
pthread implementation so there are many reasons why such patch
should not be accepted into linux before agreement with userspace.

What does that process typically look like, other than raising it on both ends like we did?

Thanks for all the feedback!
 - Pierre-Loup



