This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
- From: "Michael Kerrisk (man-pages)" <mtk dot manpages at gmail dot com>
- To: Florian Weimer <fweimer at redhat dot com>
- Cc: "Carlos O'Donell" <carlos at redhat dot com>, Roland McGrath <roland at hack dot frob dot com>, KOSAKI Motohiro <kosaki dot motohiro at gmail dot com>, libc-alpha <libc-alpha at sourceware dot org>, linux-man <linux-man at vger dot kernel dot org>
- Date: Wed, 22 Jul 2015 18:43:16 +0200
- Subject: Re: What *is* the API for sched_getaffinity? Should sched_getaffinity always succeed when using cpu_set_t?
- Authentication-results: sourceware.org; auth=none
- References: <51E42BFE dot 7000301 at redhat dot com> <51E4A0BB dot 2070802 at gmail dot com> <51E4A123 dot 9070001 at gmail dot com> <51E6F3ED dot 8000502 at redhat dot com> <51E6F956 dot 5050902 at gmail dot com> <51E714DE dot 6060802 at redhat dot com> <CAHGf_=oZW3kNA3V-9u+BZNs3tL3JKCsO2a0Q6f0iJzo=N4Wb8w at mail dot gmail dot com> <51E7B205 dot 3060905 at redhat dot com> <20130722214335 dot D9AFF2C06F at topped-with-meat dot com> <51EDB378 dot 8070301 at redhat dot com> <558D6171 dot 1060901 at gmail dot com> <558DB0A0 dot 2040707 at gmail dot com> <5593DF14 dot 2060804 at redhat dot com> <55AE5F33 dot 3080105 at gmail dot com> <55AFBE87 dot 1040006 at redhat dot com>
- Reply-to: mtk dot manpages at gmail dot com
Hello Florian,
On 22 July 2015 at 18:02, Florian Weimer <fweimer@redhat.com> wrote:
> On 07/21/2015 05:03 PM, Michael Kerrisk (man-pages) wrote:
>> Hello Florian,
>>
>> Thanks for your comments, and sorry for the delayed follow-up.
>>
>> On 07/01/2015 02:37 PM, Florian Weimer wrote:
>>> On 06/26/2015 10:05 PM, Michael Kerrisk (man-pages) wrote:
>>>
>>>> +.SS Handling systems with more than 1024 CPUs
>>>> +The
>>>> +.I cpu_set_t
>>>> +data type used by glibc has a fixed size of 128 bytes,
>>>> +meaning that the maximum CPU number that can be represented is 1023.
>>>> +.\" FIXME . See https://sourceware.org/bugzilla/show_bug.cgi?id=15630
>>>> +.\" and https://sourceware.org/ml/libc-alpha/2013-07/msg00288.html
>>>> +If the system has more than 1024 CPUs, then calls of the form:
>>>> +
>>>> + sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
>>>> +
>>>> +will fail with the error
>>>> +.BR EINVAL ,
>>>> +the error produced by the underlying system call for the case where the
>>>> +.I mask
>>>> +size specified in
>>>> +.I cpusetsize
>>>> +is smaller than the size of the affinity mask used by the kernel.
>>>
>>> I think it is best to leave this as unspecified as possible. Kernel
>>> behavior already changed once, and I can imagine it changing again.
>>
>> Hmmm. Something needs to be said about what the kernel is doing though.
>> Otherwise, it's hard to make sense of this subsection. Did you have a
>> suggested rewording that removes the piece you find problematic?
>
> What about this?
>
> "If the kernel affinity mask is larger than 1024 then
> ...
> is smaller than the size of the affinity mask used by the kernel.
> Depending on the system CPU topology, the kernel affinity mask can
> be substantially larger than the number of active CPUs in the system.
> "
Looks good. I've taken that.
> I.e., make clear that the size of the mask can be quite different from
> the CPU count.
>
>> Handling systems with more than 1024 CPUs
>> The underlying system calls (which represent CPU masks as bit
>> masks of type unsigned long *) impose no restriction on the
>> size of the CPU mask. However, the cpu_set_t data type used by
>> glibc has a fixed size of 128 bytes, meaning that the maximum
>> CPU number that can be represented is 1023. If the system has
>> more than 1024 CPUs, then calls of the form:
>>
>> sched_getaffinity(pid, sizeof(cpu_set_t), &mask);
>>
>> will fail with the error EINVAL, the error produced by the
>> underlying system call for the case where the mask size specified
>> in cpusetsize is smaller than the size of the affinity mask used
>> by the kernel.
>>
>> When working on systems with more than 1024 CPUs, one must
>> dynamically allocate the mask argument. Currently, the only
>> way to do this is by probing for the size of the required mask
>> using sched_getaffinity() calls with increasing mask sizes
>> (until the call does not fail with the error EINVAL).
>>
>> Better?
>
> "more than 1024 CPUs" should be "large [kernel CPU] affinity masks"
> throughout.
Done.
Thanks for your further input. So now we have:
C library/kernel differences
This manual page describes the glibc interface for the CPU affinity
calls. The actual system call interface is slightly different,
with the mask being typed as unsigned long *, reflecting the
fact that the underlying implementation of CPU sets is a simple
bit mask. On success, the raw sched_getaffinity() system call
returns the size (in bytes) of the cpumask_t data type that is
used internally by the kernel to represent the CPU set bit mask.
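
As an illustration of the difference described above (not part of the proposed man-page text), here is a minimal sketch that invokes the raw system call through syscall(2) instead of the glibc wrapper. The helper name raw_sched_getaffinity is hypothetical, and the sketch assumes a Linux system where <sys/syscall.h> defines SYS_sched_getaffinity.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a glibc or kernel API): invoke the raw
   sched_getaffinity system call, bypassing the glibc wrapper.
   Returns the byte count reported by the kernel on success,
   or -1 with errno set on failure. */
static long raw_sched_getaffinity(pid_t pid, unsigned long *buf, size_t len)
{
    assert(buf != NULL);
    /* The kernel rejects lengths that are not a multiple of
       sizeof(unsigned long), so catch that mistake early. */
    assert(len % sizeof(unsigned long) == 0);
    return syscall(SYS_sched_getaffinity, pid, len, buf);
}
```

Called with a generously sized buffer (for example, an array of 1024 unsigned long values) and a pid of 0, the return value should be a positive byte count rather than the 0 that the glibc wrapper returns on success.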
Handling systems with large CPU affinity masks
The underlying system calls (which represent CPU masks as bit
masks of type unsigned long *) impose no restriction on the size
of the CPU mask. However, the cpu_set_t data type used by glibc
has a fixed size of 128 bytes, meaning that the maximum CPU number
that can be represented is 1023. If the kernel CPU affinity
mask is larger than 1024, then calls of the form:

    sched_getaffinity(pid, sizeof(cpu_set_t), &mask);

will fail with the error EINVAL, the error produced by the underlying
system call for the case where the mask size specified in
cpusetsize is smaller than the size of the affinity mask used by
the kernel. (Depending on the system CPU topology, the kernel
affinity mask can be substantially larger than the number of
active CPUs in the system.)
When working on systems with large kernel CPU affinity masks, one
must dynamically allocate the mask argument. Currently, the only
way to do this is by probing for the size of the required mask
using sched_getaffinity() calls with increasing mask sizes (until
the call does not fail with the error EINVAL).
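
The probing approach described in the last paragraph could be sketched roughly as follows. This is only an illustration under stated assumptions, not text proposed for the man page: the helper name get_affinity_dynamic, the doubling strategy, and the upper bound of 2^20 CPUs are all arbitrary choices made for the example. It relies on the glibc CPU_ALLOC(), CPU_ALLOC_SIZE(), and CPU_FREE() macros, which require _GNU_SOURCE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper: fetch the affinity mask of 'pid' into a
   dynamically allocated CPU set, growing the set until the call
   stops failing with EINVAL. On success, returns the set (to be
   released with CPU_FREE()) and stores its size in bytes in *sizep.
   On failure, returns NULL with errno set. */
static cpu_set_t *get_affinity_dynamic(pid_t pid, size_t *sizep)
{
    int ncpus = CPU_SETSIZE;            /* start at the cpu_set_t capacity */

    assert(sizep != NULL);

    for (;;) {
        cpu_set_t *mask = CPU_ALLOC(ncpus);
        if (mask == NULL)
            return NULL;                /* allocation failed */

        size_t size = CPU_ALLOC_SIZE(ncpus);
        if (sched_getaffinity(pid, size, mask) == 0) {
            *sizep = size;
            return mask;
        }

        int saved_errno = errno;        /* preserve errno across CPU_FREE() */
        CPU_FREE(mask);

        /* Only EINVAL suggests that the mask was too small; the cap
           (an arbitrary assumption) keeps the loop from running away. */
        if (saved_errno != EINVAL || ncpus >= (1 << 20)) {
            errno = saved_errno;
            return NULL;
        }
        ncpus *= 2;                     /* retry with a larger mask */
    }
}
```

A caller would pass 0 as pid to query the calling thread, inspect the result with the size-aware macros such as CPU_COUNT_S(size, mask) and CPU_ISSET_S(cpu, size, mask), and release the set with CPU_FREE(mask) when done.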
Cheers,
Michael
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/