Bug 28991

Summary: sysconf(_SC_NPROCESSORS_CONF) should read /sys/devices/system/cpu/possible
Product: glibc
Reporter: Robert O'Callahan <robert>
Component: libc
Assignee: Adhemerval Zanella <adhemerval.zanella>
Severity: normal
CC: adhemerval.zanella, drepper.fsp
Priority: P2
Version: unspecified
Target Milestone: 2.36
Host:
Target:
Build:
Last reconfirmed: 2022-05-16 00:00:00

Description Robert O'Callahan 2022-03-23 00:29:12 UTC
Currently on Linux `__get_nprocs_conf` first tries to enumerate `/sys/devices/system/cpu`. This enumerates the CPUs that are present in the system (but possibly offline). However, new CPUs can be added via hotplugging that were not previously enumerated. This is a problem for applications that want to allocate an array and index it by CPU ID (e.g. the `cpu_id` field of `struct rseq`). Currently on Linux the only safe way to allocate such an array is to read `/sys/devices/system/cpu/possible` to determine the array size. Not many applications are doing that.
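
For illustration only, the enumeration approach described above can be sketched as follows. This is not glibc's actual code: the function names `is_cpu_entry` and `count_present_cpus` are hypothetical, and the sketch assumes that per-CPU entries are named `cpu` followed only by digits, so that siblings such as `cpufreq` or `possible` are skipped.

```c
#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* Return 1 if NAME looks like a per-CPU sysfs entry, i.e. "cpu"
   followed by one or more digits, and 0 otherwise.  */
int is_cpu_entry(const char *name)
{
    if (strncmp(name, "cpu", 3) != 0 || name[3] == '\0')
        return 0;
    for (const char *p = name + 3; *p != '\0'; p++)
        if (!isdigit((unsigned char) *p))
            return 0;
    return 1;
}

/* Count the cpuN entries in DIR_PATH (hypothetical helper).
   Returns -1 if the directory cannot be opened.  */
long count_present_cpus(const char *dir_path)
{
    DIR *d = opendir(dir_path);
    if (d == NULL)
        return -1;

    long n = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL)
        if (is_cpu_entry(e->d_name))
            n++;

    closedir(d);
    return n;
}
```

Calling `count_present_cpus("/sys/devices/system/cpu")` yields a count of the CPUs listed at that moment, which is why it cannot serve as an upper bound for CPU IDs that appear later through hotplug.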

Rather than require applications to do that (which is non-portable as well as being painful), I think it would make more sense to modify `__get_nprocs_conf` to try reading `/sys/devices/system/cpu/possible` instead of enumerating the CPUs.
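
A minimal sketch of what an application has to do today to get such an upper bound, assuming the file holds a kernel CPU list such as `0-31` or `0,2-5,8`. The names `cpu_list_bound` and `possible_cpu_bound` are hypothetical helpers, not glibc or kernel interfaces.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse a CPU-list string such as "0-31" or "0,2-5,8" and return the
   highest CPU ID plus one.  Returns -1 if the list is empty or
   malformed.  */
long cpu_list_bound(const char *s)
{
    long max_id = -1;

    while (*s != '\0' && *s != '\n') {
        char *end;
        long lo = strtol(s, &end, 10);
        if (end == s || lo < 0)
            return -1;

        long hi = lo;
        if (*end == '-') {
            /* A range "lo-hi": parse the upper end.  */
            const char *p = end + 1;
            hi = strtol(p, &end, 10);
            if (end == p || hi < lo)
                return -1;
        }
        if (hi > max_id)
            max_id = hi;

        if (*end == ',')
            end++;                      /* more entries follow */
        else if (*end != '\0' && *end != '\n')
            return -1;                  /* unexpected character */
        s = end;
    }
    return max_id < 0 ? -1 : max_id + 1;
}

/* Read the first line of PATH and return the bound it describes,
   or -1 if the file cannot be read.  */
long possible_cpu_bound(const char *path)
{
    char buf[4096];
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *line = fgets(buf, sizeof buf, f);
    fclose(f);
    return line != NULL ? cpu_list_bound(line) : -1;
}
```

An application would call `possible_cpu_bound("/sys/devices/system/cpu/possible")` and fall back to another source if it returns -1. The return value is the highest ID plus one rather than a count of entries, because the array must cover every ID even if the list has gaps.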
Comment 1 Adhemerval Zanella 2022-03-23 11:57:22 UTC
I am not sure if we can actually use /sys/devices/system/cpu/possible, since on some architectures it reports possible hotplug cpus that might not actually be present in the system.

For instance, on my system with a Ryzen 5900x (12c, 24t) I see:

$ sudo dmesg
[    0.065499] smpboot: Allowing 32 CPUs, 8 hotplug CPUs
$ cat /sys/devices/system/cpu/possible 
$ cat /sys/devices/system/cpu/online 
$ cat /sys/devices/system/cpu/offline 

And thus using 'possible' will wrongly report 32 cpus on a system with a maximum of 24.

The Linux source gives us some hints as to why it does this:

 * Three ways to find out the number of additional hotplug CPUs:
 * - If the BIOS specified disabled CPUs in ACPI/mptables use that.
 * - The user can overwrite it with possible_cpus=NUM
 * - Otherwise don't reserve additional CPUs.
 * We do this because additional CPUs waste a lot of memory.

So I am guessing that the BIOS in my case is reporting 32 because it supports the Ryzen 5950x, even though the system does not actually support that large a number of cpus.
Comment 2 Adhemerval Zanella 2022-03-23 12:00:09 UTC
I see a similar behaviour on a sparc64 machine:

$ cat /sys/devices/system/cpu/possible
$ cat /sys/devices/system/cpu/online 
$ cat /sys/devices/system/cpu/offline 

So I think reporting the CPUs actually present for __get_nprocs_conf is the correct semantics here.
Comment 3 Andreas Schwab 2022-03-23 12:52:48 UTC
But the systems are configured for 32 and 256 cpus, resp.
Comment 4 Adhemerval Zanella 2022-03-23 13:34:21 UTC
Does it really make sense to report a large number of cpus even if the caller won't be able to actually activate them? And can you ensure that changing this interface won't cause regressions again?
Comment 5 Andreas Schwab 2022-03-23 13:43:34 UTC
How do you know that they cannot be activated?
Comment 6 Adhemerval Zanella 2022-03-23 13:48:12 UTC
Because they are not listed under /sys/devices/system/cpu/cpu*.
Comment 7 Adhemerval Zanella 2022-03-23 13:49:14 UTC
But I am more interested in whether it would trigger any regressions, as we hit some with recent changes.
Comment 8 Andreas Schwab 2022-03-23 14:11:22 UTC
They are not listed because they aren't currently present.  But they can be hotplugged any time (in theory).
Comment 9 Adhemerval Zanella 2022-03-23 14:33:31 UTC
And that is my question: does it make sense to report the configured cpu number on a system where hotplug might not be possible? And are the programs that use this interface aware of it?
Comment 10 Andreas Schwab 2022-03-23 14:42:19 UTC
How do you know if hotplug is not possible?
Comment 11 Adhemerval Zanella 2022-03-23 14:48:21 UTC
That's why I put 'might not be possible'. Do you think we should make this change?
Comment 12 Andreas Schwab 2022-03-23 15:15:27 UTC
The same question: how do you know that?  We can only use the information that the system provides.
Comment 13 Robert O'Callahan 2022-03-23 18:43:48 UTC
Applications that allocate an array indexed by rseq cpu_id or getcpu() or similar need to size it to accommodate all possible CPUs. On Linux they have to use /sys/devices/system/cpu/possible, regardless of how accurate that is. (I don't think there's any better source for this information, and if there is, the kernel should use it to make /sys/devices/system/cpu/possible more accurate.)

It's not a big problem if the number of CPUs from /sys/devices/system/cpu/possible is larger than necessary. That will only waste a small amount of memory. The kernel internally is already sizing its per-cpu structures based on cpu_possible_mask.

The only real question is what sysconf(_SC_NPROCESSORS_CONF) should actually mean. If it's intended to be an upper bound for CPU IDs, then it needs to use  /sys/devices/system/cpu/possible. If it isn't, then I don't know what it's useful for, but at least it should have a big warning that it is not an upper bound for CPU IDs.
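
To make the upper-bound concern concrete, here is a small sketch of a per-CPU counter array with a bounds check. The type and function names (`percpu_counters`, `percpu_init`, `percpu_add`, `percpu_free`) are hypothetical, and the `cpu` argument stands in for whatever ID the application obtains, for example from the rseq `cpu_id` field or getcpu().

```c
#include <stdlib.h>

struct percpu_counters {
    long ncpus;            /* exclusive upper bound on valid CPU IDs */
    unsigned long *count;  /* one slot per CPU ID below ncpus */
};

/* Allocate CPU_BOUND zeroed slots.  Returns 0 on success, -1 on error.  */
int percpu_init(struct percpu_counters *pc, long cpu_bound)
{
    if (cpu_bound <= 0)
        return -1;
    pc->count = calloc((size_t) cpu_bound, sizeof *pc->count);
    if (pc->count == NULL)
        return -1;
    pc->ncpus = cpu_bound;
    return 0;
}

/* Add N to the slot for CPU.  Returns -1 instead of writing out of
   bounds when CPU is not covered by the array.  */
int percpu_add(struct percpu_counters *pc, long cpu, unsigned long n)
{
    if (cpu < 0 || cpu >= pc->ncpus)
        return -1;
    pc->count[cpu] += n;
    return 0;
}

/* Release the array and reset the bound.  */
void percpu_free(struct percpu_counters *pc)
{
    free(pc->count);
    pc->count = NULL;
    pc->ncpus = 0;
}
```

With the figures from comment 1, an array initialized with a bound of 24 rejects CPU ID 30, while one initialized with 32 accepts it at the cost of eight unused slots. Without the check in `percpu_add`, the first case would be an out-of-bounds write.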
Comment 14 Robert O'Callahan 2022-03-23 18:52:48 UTC
And if you choose to not use /sys/devices/system/cpu/possible for _SC_NPROCESSORS_CONF, then someone needs to explain what _SC_NPROCESSORS_CONF is useful for, because I have no idea.
Comment 15 Adhemerval Zanella 2022-05-16 17:36:21 UTC
Fixed on 2.36.