This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug libc/15630] Fix use of cpu_set_t with sched_getaffinity when booted on a system with more than 1024 possible cpus.


https://sourceware.org/bugzilla/show_bug.cgi?id=15630

--- Comment #6 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Florian Weimer from comment #2)
> Maybe addressing this should also fix this design issue with the cpu_set_t
> functions:
> 
> CPU_ALLOC returns a set which is capable of storing more CPUs than the user
> requested because the request is rounded up to the next multiple of sizeof
> (long) * CHAR_BIT CPUs.  This is reflected in the return value of
> CPU_ALLOC_SIZE.  As a result, the sched_getaffinity call in
> 
>   set = CPU_ALLOC (count);
>   sched_getaffinity (0, CPU_ALLOC_SIZE (count), set);
> 
> can succeed even if count is smaller than the maximum number of relevant
> CPUs.  This defeats the purpose of the EINVAL error return code in case the
> specified CPU set is not large enough.

Agreed. That is a flaw. The allocation should certainly be large enough to hold
all possible CPUs ever (nr_cpumask_bits in kernel speak).
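
To make the rounding concrete, here is a minimal sketch of how many CPUs a
set from CPU_ALLOC (count) can really hold. The helper name
cpu_set_capacity is hypothetical and not part of glibc; it only relies on
the documented CPU_ALLOC_SIZE macro.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper (not a glibc API): number of CPU bits that a set
   allocated with CPU_ALLOC (count) can actually store.  CPU_ALLOC_SIZE
   returns bytes, already rounded up to whole mask words.  */
static size_t
cpu_set_capacity (size_t count)
{
  return CPU_ALLOC_SIZE (count) * CHAR_BIT;
}
```

On a typical 64-bit glibc target I would expect cpu_set_capacity (1) and
cpu_set_capacity (40) to give the same value, which is why a caller asking
for fewer CPUs than the kernel needs may still not see EINVAL.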

(In reply to Florian Weimer from comment #3)
> Additional details:
> 
> Sadly, sysconf(_SC_NPROCESSORS_ONLN) is not the number of CPUs which is relevant
> to the sched_getaffinity system call.  I have a system which reports
> _NPROCESSORS_ONLN and _NPROCESSORS_CONF as 40 (and /proc/cpuinfo and
> /proc/stat match that), yet calling sched_getaffinity with small arguments
> fails:
> 
> [pid  3420] sched_getaffinity(0, 8, 0x146b010) = -1 EINVAL (Invalid argument)
> [pid  3420] sched_getaffinity(0, 16, 0x146b010) = -1 EINVAL (Invalid argument)
> [pid  3420] sched_getaffinity(0, 32, {ffffffffff, 0, 0, 0}) = 32
> 
> The kernel seems to operate with a nr_cpu_ids value of 240:
> 
> kernel: setup_percpu: NR_CPUS:5120 nr_cpumask_bits:240 nr_cpu_ids:240 nr_node_ids:2
> kernel:         RCU restricting CPUs from NR_CPUS=5120 to nr_cpu_ids=240.
> 
> nr_cpu_ids is not directly exposed to user space, I think, so the EINVAL
> behavior is pretty much required.

It is not. We need to get access to nr_cpumask_bits, and likely the only way to
do that is to read it from /proc, as some of the code does today.

See this thread:
https://sourceware.org/ml/libc-alpha/2013-07/msg00288.html
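
Until nr_cpumask_bits is available to user space, a caller can work around
the EINVAL by growing the set and retrying. The following is only a sketch
of that workaround, not the proposed glibc fix: the helper name
get_affinity_growing, the starting guess of 64 CPUs, and the 1 << 20 cap
are all assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper (not a glibc API): fetch the calling thread's
   affinity mask, doubling the set size while the kernel reports EINVAL.
   On success returns a set to be released with CPU_FREE and stores its
   size in bytes in *out_size; on failure returns NULL with errno set.  */
static cpu_set_t *
get_affinity_growing (size_t *out_size)
{
  size_t ncpus = 64;            /* Arbitrary starting guess.  */

  for (;;)
    {
      cpu_set_t *set = CPU_ALLOC (ncpus);
      if (set == NULL)
        return NULL;

      size_t size = CPU_ALLOC_SIZE (ncpus);
      if (sched_getaffinity (0, size, set) == 0)
        {
          *out_size = size;
          return set;
        }

      int saved_errno = errno;  /* CPU_FREE may clobber errno.  */
      CPU_FREE (set);

      /* Only EINVAL means "set too small"; also stop at an arbitrary
         upper bound so the loop cannot run away.  */
      if (saved_errno != EINVAL || ncpus >= ((size_t) 1 << 20))
        {
          errno = saved_errno;
          return NULL;
        }
      ncpus *= 2;
    }
}
```

The returned size must be passed to the _S variants of the macros, for
example CPU_COUNT_S (size, set), because the set may be larger than a
fixed cpu_set_t.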


