This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
- From: Roland McGrath <roland at hack dot frob dot com>
- To: "Carlos O'Donell" <carlos at redhat dot com>
- Cc: KOSAKI Motohiro <kosaki dot motohiro at gmail dot com>, libc-alpha <libc-alpha at sourceware dot org>
- Date: Tue, 23 Jul 2013 14:48:22 -0700 (PDT)
- Subject: Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
- References: <51E42BFE dot 7000301 at redhat dot com> <51E4A0BB dot 2070802 at gmail dot com> <51E4A123 dot 9070001 at gmail dot com> <51E6F3ED dot 8000502 at redhat dot com> <51E6F956 dot 5050902 at gmail dot com> <51E714DE dot 6060802 at redhat dot com> <CAHGf_=oZW3kNA3V-9u+BZNs3tL3JKCsO2a0Q6f0iJzo=N4Wb8w at mail dot gmail dot com> <51E7B205 dot 3060905 at redhat dot com> <20130722214335 dot D9AFF2C06F at topped-with-meat dot com> <51EDB378 dot 8070301 at redhat dot com> <20130722224553 dot 933BA2C070 at topped-with-meat dot com> <51EDB993 dot 9000204 at redhat dot com>
> On 07/22/2013 06:45 PM, Roland McGrath wrote:
> > I have a hard time seeing why (b) would ever be useful.
> > I think (c) was always the intended semantic of _SC_NPROCESSORS_CONF.
>
> That's different than what we have implemented in glibc.
Bug.
> Why do you have a hard time seeing that (b) would be useful?
It doesn't give you information that you can actually use in any
reliable way. If it's not an upper bound for what _SC_NPROCESSORS_ONLN
might report, then you don't know any such upper bound. The only way
you can then cope with _SC_NPROCESSORS_ONLN values increasing in the
future is on-demand dynamic allocation whenever _SC_NPROCESSORS_ONLN
does change. At best, that's overly complicated to implement.
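
To make the cost concrete, here is a rough sketch of what on-demand
growth might look like when no upper bound is known. The struct and
function names (grow_table, grow_table_refresh) are hypothetical and
exist only for illustration; only sysconf and _SC_NPROCESSORS_ONLN come
from the discussion above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical per-CPU counter table that has to grow on demand. */
struct grow_table {
    unsigned long *counters;  /* one counter per CPU seen so far */
    long capacity;            /* number of entries in counters */
};

/* Re-read _SC_NPROCESSORS_ONLN and grow the table if the value has
   increased. Returns 0 on success, -1 on failure (table unchanged).
   Start from a zeroed struct: { NULL, 0 }. */
int grow_table_refresh(struct grow_table *t)
{
    assert(t != NULL);

    long n = sysconf(_SC_NPROCESSORS_ONLN);
    if (n < 1)
        return -1;
    if (n <= t->capacity)
        return 0;  /* nothing to do, the table is already big enough */

    /* realloc may move the array, so any pointer into the old array
       held elsewhere (for example by another thread) becomes invalid.
       Handling that safely is part of what makes this approach
       complicated. */
    unsigned long *p = realloc(t->counters, (size_t)n * sizeof *p);
    if (p == NULL)
        return -1;

    /* Zero only the newly added entries. */
    memset(p + t->capacity, 0, (size_t)(n - t->capacity) * sizeof *p);
    t->counters = p;
    t->capacity = n;
    return 0;
}
```

The caller also has to decide when to call grow_table_refresh, since
nothing in this sketch tells the process that the online count changed.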
> I see (b) being useful for:
>
> * Detection of number of logical cpus that are in the
>   system vs. number that are online.
>   - Ask your admin to bring the rest of them online?
Do applications really need a canonical interface for this? That's a
purely administrative issue well outside the scope of things that an
application can ordinarily do anything useful with. And how is (b) any
more what you want than (c) is for this purpose? In one case you ask
your admin to bring more online; in the other you ask your admin to
plug more in and then bring them online.
> * Used to create a minimally sized structure to track
>   per-logical-CPU data.
>   - As it is implemented _SC_NPROCESSORS_CONF is a minimal
>     value. Fixing it to match your expected semantics, e.g.
>     making it the number of possible CPUs, is going
>     to make this value potentially much larger.
How is this a useful size for anything? The only per-CPU data an
application might maintain is about CPUs it can actually use. If
that's what it's doing, then _SC_NPROCESSORS_ONLN is what it wants.
If it wants to prepare a data structure that will be able to hold all
the per-CPU data for all CPUs that it might encounter during the life
of the process, then (b) is insufficient and only (c) is useful.
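
A minimal sketch of the allocate-once case, assuming
_SC_NPROCESSORS_CONF really does have semantic (c), i.e. it is an upper
bound on _SC_NPROCESSORS_ONLN for the life of the process. The type and
function names (percpu_slot, percpu_table, percpu_table_init, and so
on) are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-CPU record; the single field is a placeholder. */
struct percpu_slot {
    unsigned long events;
};

struct percpu_table {
    struct percpu_slot *slots;
    long nslots;
};

/* Allocate the table once, sized by _SC_NPROCESSORS_CONF.
   ASSUMPTION: that value never falls below a later
   _SC_NPROCESSORS_ONLN result. If it can, this table is too small
   and the scheme does not work.
   Returns 0 on success, -1 on failure. */
int percpu_table_init(struct percpu_table *t)
{
    assert(t != NULL);

    long n = sysconf(_SC_NPROCESSORS_CONF);
    if (n < 1)
        return -1;

    t->slots = calloc((size_t)n, sizeof *t->slots);
    if (t->slots == NULL)
        return -1;
    t->nslots = n;
    return 0;
}

/* Return the slot for a CPU index, or NULL if the index is outside
   the range the table was sized for. */
struct percpu_slot *percpu_table_get(struct percpu_table *t, long cpu)
{
    assert(t != NULL);

    if (cpu < 0 || cpu >= t->nslots)
        return NULL;
    return &t->slots[cpu];
}

/* Release the table and reset it to the empty state. */
void percpu_table_free(struct percpu_table *t)
{
    assert(t != NULL);

    free(t->slots);
    t->slots = NULL;
    t->nslots = 0;
}
```

With semantic (b) instead, percpu_table_get could return NULL for a CPU
that comes online later, and the caller would be back to growing the
table on demand.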
> What use is there to knowing (c) except to choose to optimize
> space vs. time and allocate sufficient resources to track all
> possible cpus that the system could have (only a reboot can change
> this in Linux right now)?
Your previous description said that CPU hotplug would change the value.
If that's not true, then I have no idea what distinguishes your (b) from
your (c) in any way that is remotely meaningful to application software.
I have not seen anything to dissuade me from the position that the
definitions I gave earlier are the only ones that make any kind of
worthwhile sense to an application:
1. Amount of hardware parallelism currently available (_SC_NPROCESSORS_ONLN)
2. Upper bound on values of #1 in the life of a process (_SC_NPROCESSORS_CONF)
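
For reference, both values are read through sysconf. The following is
only a sketch of how an application might fetch the pair; the struct
cpu_counts and the helper get_cpu_counts are hypothetical names, and
the code deliberately does not assume online <= configured, since
whether that holds is exactly what is in question in this thread.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical holder for the two values listed above. */
struct cpu_counts {
    long online;      /* #1: sysconf(_SC_NPROCESSORS_ONLN) */
    long configured;  /* #2: sysconf(_SC_NPROCESSORS_CONF) */
};

/* Fill *out with both values. Returns 0 on success, or -1 if either
   query fails or reports fewer than one processor. */
int get_cpu_counts(struct cpu_counts *out)
{
    assert(out != NULL);

    long onln = sysconf(_SC_NPROCESSORS_ONLN);
    long conf = sysconf(_SC_NPROCESSORS_CONF);
    if (onln < 1 || conf < 1)
        return -1;

    out->online = onln;
    out->configured = conf;
    return 0;
}
```

Under definition #2, an application could size any per-CPU storage
from the configured field once at startup and use the online field only
to decide how much work to run in parallel.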
Consider what it would look like if you proposed _SC_NPROCESSORS_*
values with specified meanings for a future POSIX.1 revision (which is
indeed something we should arrange to get done). If it's not
meaningful in terms of characteristics of the system that a conforming
application can observe, then it doesn't belong in the standard. The
standard doesn't describe how its calls relate to abstract notions or
to concrete hardware that happens to underlie the implementation. It
describes how its calls relate to the behavior of the system that can
be observed by conforming applications. How, other than the two
definitions I gave above, would you define any parameter in this family?
Thanks,
Roland