This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC PATCH] getcpu_cache system call: caching current CPU number (x86)


----- On Jul 21, 2015, at 3:30 AM, Ondřej Bílka neleai@seznam.cz wrote:

> On Tue, Jul 21, 2015 at 12:25:00AM +0000, Mathieu Desnoyers wrote:
>> >> Does it solve the Wine problem?  If Wine uses gs for something and
>> >> calls a function that does this, Wine still goes boom, right?
>> > 
>> > So the advantage of just making a global segment descriptor available
>> > is that it's not *that* expensive to just save/restore segments. So
>> > either wine could do it, or any library users would do it.
>> > 
>> > But anyway, I'm not sure this is a good idea. The advantage of it is
>> > that the kernel support really is _very_ minimal.
>> 
>> Considering that we'd at least also want this feature on ARM and
>> PowerPC 32/64, and that the gs segment selector approach clashes with
>> existing apps (wine), I'm not sure that implementing a gs segment
>> selector based approach to cpu number caching would lead to an overall
>> decrease in complexity if it leads to performance similar to those of
>> portable approaches.
>> 
>> I'm perfectly fine with architecture-specific tweaks that lead to
>> fast-path speedups, but if we have to bite the bullet and implement
>> an approach based on TLS and registering a memory area at thread start
>> through a system call on other architectures anyway, it might end up
>> being less complex to add a new system call on x86 too, especially if
>> fast path overhead is similar.
>> 
>> But I'm inclined to think that some aspect of the question eludes me,
>> especially given the amount of interest generated by the gs-segment
>> selector approach. What am I missing ?
>> 
> As I wrote before, you don't have to bite the bullet. It
> suffices to create a 128k-element array with the cpu for each tid, make
> that an mmapable file, and userspace could get the cpu with nearly the
> same performance without hacks.

I don't see how this would be acceptable on memory-constrained embedded
systems. They have multiple cores, and performance requirements, so
having a fast getcpu would be useful there (e.g. telecom industry),
but they clearly cannot afford a 512kB table per process just for that.
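
For reference, the 512kB figure follows from the 128k-element array if
each entry is assumed to be a 4-byte CPU number. A minimal sketch of
that arithmetic and of the lookup userspace would perform; the names
(cpu_entry_t, cpu_table_bytes, cpu_table_lookup) are hypothetical and
not part of any proposed interface:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: one 32-bit CPU number per thread ID. With 128k entries
 * this is what yields the 512kB table size. */
enum { CPU_TABLE_ENTRIES = 128 * 1024 };

typedef int32_t cpu_entry_t;

/* The table size must come out at exactly 512 * 1024 bytes. */
_Static_assert(CPU_TABLE_ENTRIES * sizeof(cpu_entry_t) == 512 * 1024,
               "128k 4-byte entries occupy 512kB");

/* Size in bytes of the hypothetical tid-indexed CPU table. */
static size_t cpu_table_bytes(void)
{
    return (size_t)CPU_TABLE_ENTRIES * sizeof(cpu_entry_t);
}

/* Look up the CPU recorded for a tid in a table that would, in the
 * quoted scheme, be mapped read-only from a kernel-provided file.
 * Returns -1 if the tid falls outside the table. */
static int32_t cpu_table_lookup(const cpu_entry_t *table, int32_t tid)
{
    if (tid < 0 || tid >= CPU_TABLE_ENTRIES)
        return -1;
    return table[tid];
}
```

The fast path is a single indexed load, which is why its performance
would be close to the other approaches; the cost is the mapping size.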

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

