This is the mail archive of the
mailing list for the glibc project.
Re: [RFC PATCH] getcpu_cache system call: caching current CPU number (x86)
- From: Mathieu Desnoyers <mathieu dot desnoyers at efficios dot com>
- To: Ben Maurer <bmaurer at fb dot com>
- Cc: Paul Turner <pjt at google dot com>, Andrew Hunter <ahh at google dot com>, Peter Zijlstra <peterz at infradead dot org>, Ingo Molnar <mingo at redhat dot com>, rostedt <rostedt at goodmis dot org>, "Paul E. McKenney" <paulmck at linux dot vnet dot ibm dot com>, Josh Triplett <josh at joshtriplett dot org>, Lai Jiangshan <laijs at cn dot fujitsu dot com>, Linus Torvalds <torvalds at linux-foundation dot org>, Andrew Morton <akpm at linux-foundation dot org>, linux-api <linux-api at vger dot kernel dot org>, libc-alpha <libc-alpha at sourceware dot org>
- Date: Thu, 16 Jul 2015 18:08:08 +0000 (UTC)
- Subject: Re: [RFC PATCH] getcpu_cache system call: caching current CPU number (x86)
- Authentication-results: sourceware.org; auth=none
- References: <1436724386-30909-1-git-send-email-mathieu dot desnoyers at efficios dot com> <5CDDBDF2D36D9F43B9F5E99003F6A0D48D5F39C6 at PRN-MBX02-1 dot TheFacebook dot com> <587954201 dot 31 dot 1436808992876 dot JavaMail dot zimbra at efficios dot com> <5CDDBDF2D36D9F43B9F5E99003F6A0D48D5F5DA0 at PRN-MBX02-1 dot TheFacebook dot com>
----- On Jul 14, 2015, at 5:34 AM, Ben Maurer firstname.lastname@example.org wrote:
> Mathieu Desnoyers wrote:
>> If we invoke this per-thread registration directly in the glibc NPTL
>> in start_thread, do you think it would fit your requirements ?
> I guess this would basically be transparent to the user -- we'd just need to
> make sure that the registration happens very early, before any chance of
> calling malloc.
Yes, this is my thinking too.
> That said, having the ability for the kernel to understand that TLS
> implementation are laid out using the same offset on each thread seems like
> something that could be valuable long term. Doing so makes it possible to build
> other TLS-based features without forcing each thread to be registered.
AFAIU, using a fixed hardcoded ABI between kernel and user-space might make
transition from the pre-existing ABI (where this memory area is not
reserved) a bit tricky without registering the area, or getting a "feature"
flag, through a system call.
The related question then becomes: should we issue this system call once
per process, or once per thread at thread creation ? Issuing it once per
thread is marginally more costly for thread creation, but seems to be
easier to deal with internally within the kernel.
We could however ensure that only a single system call is needed per new-coming
thread, rather than one system call per feature. One way to do this would be
to register an area that may contain more than just the CPU id. It could
consist of an expandable structure with fixed offsets. When registered, we
could pass the size of that structure as an argument to the system call, so
the kernel knows which features are expected by user-space.