This is the mail archive of the
mailing list for the glibc project.
Re: [RFC PATCH] getcpu_cache system call: caching current CPU number (x86)
- From: Andy Lutomirski <luto at amacapital dot net>
- To: Linus Torvalds <torvalds at linux-foundation dot org>
- Cc: Mathieu Desnoyers <mathieu dot desnoyers at efficios dot com>, Ben Maurer <bmaurer at fb dot com>, Paul Turner <pjt at google dot com>, Andrew Hunter <ahh at google dot com>, Peter Zijlstra <peterz at infradead dot org>, Ingo Molnar <mingo at redhat dot com>, rostedt <rostedt at goodmis dot org>, "Paul E. McKenney" <paulmck at linux dot vnet dot ibm dot com>, Josh Triplett <josh at joshtriplett dot org>, Lai Jiangshan <laijs at cn dot fujitsu dot com>, Andrew Morton <akpm at linux-foundation dot org>, linux-api <linux-api at vger dot kernel dot org>, libc-alpha <libc-alpha at sourceware dot org>
- Date: Fri, 17 Jul 2015 11:55:03 -0700
- Subject: Re: [RFC PATCH] getcpu_cache system call: caching current CPU number (x86)
- Authentication-results: sourceware.org; auth=none
- References: <1436724386-30909-1-git-send-email-mathieu dot desnoyers at efficios dot com> <5CDDBDF2D36D9F43B9F5E99003F6A0D48D5F39C6 at PRN-MBX02-1 dot TheFacebook dot com> <587954201 dot 31 dot 1436808992876 dot JavaMail dot zimbra at efficios dot com> <5CDDBDF2D36D9F43B9F5E99003F6A0D48D5F5DA0 at PRN-MBX02-1 dot TheFacebook dot com> <549319255 dot 383 dot 1437070088597 dot JavaMail dot zimbra at efficios dot com> <CALCETrWEKE=mow3vVh7C4r8CuGy_d5VOEz7KkpijuR5cpBfFtg at mail dot gmail dot com> <CA+55aFz-VBnEKh0SPKgu8xV5=Zb+=6odybVUDoOYOknshbcFJA at mail dot gmail dot com>
On Fri, Jul 17, 2015 at 11:48 AM, Linus Torvalds wrote:
> On Thu, Jul 16, 2015 at 12:27 PM, Andy Lutomirski <email@example.com> wrote:
>> If we actually bit the bullet and implemented per-cpu mappings
> That's not ever going to happen.
> Per-cpu page tables are a complete disaster. It's a recipe for crazy
> race conditions, when you have CPUs that update things like
> dirty/accessed bits atomically etc, and you have fundamental races
> when multiple CPUs are allocating page tables at the same time (remember:
> we have concurrent page faults, and the locking is not per-vm, it's at
> a finer granularity).
> It's also a big memory management problem when you have lots and lots of CPUs.
> So you can try to prove me wrong, but seriously, I doubt you'll succeed.
I doubt I'll succeed, too. But I don't want anything resembling full
per-cpu page tables -- per-cpu pgds would be plenty. Still kinda
nasty to implement. On the other hand, getting rid of swapgs would be
a nice win.
> On x86, if you want per-cpu memory areas, you should basically plan on
> using segment registers instead (although other odd state has been
> used - there have been people who use segment limits etc. rather than
> the *pointer* itself, preferring to use "lsl" to get percpu data. You
> could also imagine hiding things in the vector state somewhere if you
> control your environment well enough).
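For readers unfamiliar with the "lsl" trick mentioned in the quote, here is a rough sketch of how user space can read a small per-cpu value out of a segment limit. It rests on assumptions that are not stated in this thread: that the kernel publishes a per-cpu GDT segment at selector 0x7b (entry 15, RPL 3) and that the low 12 bits of its limit hold the CPU number, which is how I understand the x86-64 Linux vDSO getcpu path to have worked. The name cpu_from_segment_limit and the constants are hypothetical. On any other platform, or if the selector is not valid, the function returns -1.

```c
#include <stdint.h>

/* Assumption: GDT entry 15 with RPL 3 is the per-cpu segment. */
#define ASSUMED_PERCPU_SELECTOR 0x7bu
/* Assumption: the CPU number lives in the low 12 bits of the limit. */
#define ASSUMED_CPU_MASK 0xfffu

/* Returns the CPU number encoded in the segment limit, or -1 if the
 * selector is not usable (or this is not x86-64 Linux). */
int cpu_from_segment_limit(void)
{
#if defined(__x86_64__) && defined(__linux__)
    uint32_t limit = 0;
    uint8_t ok = 0;
    uint32_t sel = ASSUMED_PERCPU_SELECTOR;

    /* lsl loads the segment limit and sets ZF on success; on failure
     * it clears ZF and leaves the destination untouched. */
    __asm__ volatile("lsl %2, %0\n\t"
                     "setz %1"
                     : "+r"(limit), "=q"(ok)
                     : "r"(sel)
                     : "cc");
    if (!ok)
        return -1;
    return (int)(limit & ASSUMED_CPU_MASK);
#else
    return -1;
#endif
}
```

The value is only a snapshot: the thread can migrate to another CPU right after the instruction retires, so callers would have to treat it as a hint.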
I do think we should implement per-cpu descriptor bases or gs bases,
and we should also implement rd/wrfsgsbase. We should do them
together, give them well-defined semantics, and write tests. The
current "segment register state kinda sorta context switches correctly
as long as no one looks carefully" approach is no good. And once we
do that, we don't need a cached cpu number.
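As a point of comparison, this is roughly what user space has to do today to read a segment base without rd/wrfsgsbase: go through the arch_prctl system call. The sketch assumes x86-64 Linux with the ARCH_GET_FS and ARCH_GET_GS codes from <asm/prctl.h>. The wrapper names read_fs_base and read_gs_base are made up for illustration, and on other platforms they simply return -1.

```c
#define _GNU_SOURCE
#if defined(__x86_64__) && defined(__linux__)
#include <unistd.h>
#include <sys/syscall.h>
#include <asm/prctl.h>
#endif

/* Hypothetical helper: fetch the FS base via a system call.
 * Returns 0 on success, -1 on failure or unsupported platform. */
int read_fs_base(unsigned long *base)
{
#if defined(__x86_64__) && defined(__linux__)
    return (int)syscall(SYS_arch_prctl, ARCH_GET_FS, base);
#else
    (void)base;
    return -1;
#endif
}

/* Hypothetical helper: same as above, for the GS base. */
int read_gs_base(unsigned long *base)
{
#if defined(__x86_64__) && defined(__linux__)
    return (int)syscall(SYS_arch_prctl, ARCH_GET_GS, base);
#else
    (void)base;
    return -1;
#endif
}
```

Each call costs a full kernel entry, which is part of why an unprivileged rdfsbase/rdgsbase path, or a per-cpu GS base, looks attractive for hot paths.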
Sigh, if we had clean per-cpu memory mappings and got rid of swapgs,
then implementing the fsgsbase stuff would be so much easier.
(Although -- we could plausibly use r15 or something as our percpu
pointer in the kernel without too much loss, which would also get rid
of the fsgsbase mess. Hmm. It would make paranoid entries much
Anyway, I do intend to ask Intel for real per-cpu mappings of some
sort if they ever ask my opinion again. Maybe they'll give us
something in a few decades.