Bug 19981 - sparc64: Test tst-cond10.out fails with '/bin/sh: 2: Cannot fork'
Status: RESOLVED INVALID
Alias: None
Product: glibc
Classification: Unclassified
Component: nptl
Version: 2.23
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-04-22 15:16 UTC by John Paul Adrian Glaubitz
Modified: 2016-04-28 14:54 UTC

See Also:
Host:
Target: sparc*-*-*
Build:
Last reconfirmed:
fweimer: security-


Description John Paul Adrian Glaubitz 2016-04-22 15:16:55 UTC
Hi!

On git master, building and running 'make check' on sparc64 fails with:

gcc -nostdlib -nostartfiles -o /usr/src/glibc/build/nptl/tst-cond10    -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both /usr/src/glibc/build/csu/crt1.o /usr/src/glibc/build/csu/crti.o `gcc  --print-file-name=crtbegin.o` /usr/src/glibc/build/nptl/tst-cond10.o /usr/src/glibc/build/nptl/libpthread.so /usr/src/glibc/build/nptl/libpthread_nonshared.a  -Wl,-dynamic-linker=/lib64/ld-linux.so.2 -Wl,-rpath-link=/usr/src/glibc/build:/usr/src/glibc/build/math:/usr/src/glibc/build/elf:/usr/src/glibc/build/dlfcn:/usr/src/glibc/build/nss:/usr/src/glibc/build/nis:/usr/src/glibc/build/rt:/usr/src/glibc/build/resolv:/usr/src/glibc/build/crypt:/usr/src/glibc/build/mathvec:/usr/src/glibc/build/nptl /usr/src/glibc/build/libc.so.6 /usr/src/glibc/build/libc_nonshared.a -Wl,--as-needed /usr/src/glibc/build/elf/ld.so -Wl,--no-as-needed -lgcc -Wl,--as-needed -lgcc_s  -Wl,--no-as-needed `gcc  --print-file-name=crtend.o` /usr/src/glibc/build/csu/crtn.o
env GCONV_PATH=/usr/src/glibc/build/iconvdata LOCPATH=/usr/src/glibc/build/localedata LC_ALL=C   /usr/src/glibc/build/elf/ld-linux.so.2 --library-path /usr/src/glibc/build:/usr/src/glibc/build/math:/usr/src/glibc/build/elf:/usr/src/glibc/build/dlfcn:/usr/src/glibc/build/nss:/usr/src/glibc/build/nis:/usr/src/glibc/build/rt:/usr/src/glibc/build/resolv:/usr/src/glibc/build/crypt:/usr/src/glibc/build/mathvec:/usr/src/glibc/build/nptl /usr/src/glibc/build/nptl/tst-cond10  > /usr/src/glibc/build/nptl/tst-cond10.out; \
../scripts/evaluate-test.sh nptl/tst-cond10 $? false false > /usr/src/glibc/build/nptl/tst-cond10.test-result
/bin/sh: 2: Cannot fork
../Rules:198: recipe for target '/usr/src/glibc/build/nptl/tst-cond10.out' failed
make[2]: *** [/usr/src/glibc/build/nptl/tst-cond10.out] Error 2
make[2]: Leaving directory '/usr/src/glibc/nptl'
Makefile:214: recipe for target 'nptl/tests' failed
make[1]: *** [nptl/tests] Error 2
make[1]: Leaving directory '/usr/src/glibc'
Makefile:9: recipe for target 'check' failed
make: *** [check] Error 2

Built and tested with:

$ mkdir build
$ cd build
$ export CFLAGS='-pipe -O2 -g -mcpu=ultrasparc'
$ ../configure --prefix=/usr
$ make
$ make check

Tested on Debian/sparc64, kernel 4.5.1. The same issue can be observed on the buildds where glibc fails to build as well [1].

Cheers,
Adrian

> [1] https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=sparc64&ver=2.23-0experimental2&stamp=1460783801
Comment 1 Andreas Schwab 2016-04-22 18:27:21 UTC
That is most likely a local resource shortage that has nothing to do with glibc.
Comment 2 John Paul Adrian Glaubitz 2016-04-23 19:57:32 UTC
So, this seems to be a bug in the kernel or the LDOM software.

On the LDOM machine, this bash command triggers the same behavior:

$ eval `for x in $(seq 600) ; do echo "sleep 1 &"; done`
[1] 3699
[2] 3700
[3] 3701
[4] 3702
[5] 3703
[6] 3704
[7] 3705
[8] 3706
[9] 3707
[10] 3708
[11] 3709
[12] 3710
[13] 3711
(...)
[505] 4203
[506] 4204
[507] 4205
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: No child processes
[508] 4208
[509] 4209
[510] 4210
(...)
[588] 4288
[589] 4289
[590] 4290
[591] 4291
[592] 4292
[593] 4293
[594] 4294
[595] 4295
[596] 4296
[597] 4297
[598] 4298
[599] 4299
[600] 4300

The same command works fine on my Sun Blade 100 SPARC machine, which runs the same version of Debian on bare metal with a single CPU and a single core. The system above is an LDOM with 64 GiB RAM and 32 virtual CPUs.
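"Resource temporarily unavailable" is EAGAIN, which fork() returns when a limit on the number of processes or threads is reached (RLIMIT_NPROC, the kernel-wide pid_max and threads-max, or a pids cgroup limit). A minimal sketch for dumping the first three on a Linux system follows; the function name fork_limits is hypothetical, and some shells may not support "ulimit -u", in which case "unknown" is printed:

```shell
# Print the common non-cgroup limits that can make fork() fail with EAGAIN.
# fork_limits is a hypothetical helper name, not part of any existing tool.
fork_limits() {
    # RLIMIT_NPROC of the current shell; fall back if the shell lacks -u.
    printf 'nproc (ulimit -u): %s\n' "$(ulimit -u 2>/dev/null || echo unknown)"

    # Kernel-wide limits on PIDs and threads.
    for f in /proc/sys/kernel/pid_max /proc/sys/kernel/threads-max; do
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$f")"
        else
            printf '%s: unreadable\n' "$f"
        fi
    done
}
```

If all three values are far above the number of jobs started, a cgroup limit is the next thing worth checking.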

Adrian
Comment 3 Anatoly Pugachev 2016-04-28 12:26:43 UTC
As we found out yesterday, it was a Debian sid systemd/cgroups limit of at most 512 PIDs allowed for processes (per user):

mator@deb4g:~$ grep pid /proc/$$/cgroup 
6:pids:/system.slice/ssh.service

mator@deb4g:~$ cgget -n -g pids /system.slice/ssh.service
pids.max: 512
pids.current: 10
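
The same check can be sketched without cgget by reading the cgroup files directly. This assumes the cgroup v1 layout shown above, with the pids controller mounted at /sys/fs/cgroup/pids and listed on its own line in /proc/<pid>/cgroup. The function names and their optional arguments are hypothetical:

```shell
# Print the pids cgroup path from a /proc/<pid>/cgroup style file.
# Assumes cgroup v1 lines such as "6:pids:/system.slice/ssh.service".
pids_cgroup_path() {
    cgroup_file=${1:-/proc/self/cgroup}
    # Field 2 is the controller name, field 3 the cgroup path.
    awk -F: '$2 == "pids" { print $3; exit }' "$cgroup_file"
}

# Print pids.max and pids.current for that cgroup.
# $1: mount point of the pids controller (assumed default below)
# $2: cgroup membership file to parse
show_pids_limit() {
    root=${1:-/sys/fs/cgroup/pids}
    path=$(pids_cgroup_path "${2:-/proc/self/cgroup}")
    if [ -z "$path" ]; then
        echo "no pids cgroup found" >&2
        return 1
    fi
    for f in pids.max pids.current; do
        if [ -r "$root$path/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$root$path/$f")"
        else
            printf '%s: unreadable\n' "$f"
        fi
    done
}
```

Called without arguments, show_pids_limit inspects the calling shell's own cgroup. On a cgroup v2 system there is no separate pids line in /proc/self/cgroup, so the sketch reports that no pids cgroup was found.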

Changing it to a greater value fixed the (bash fork) error:

(runtime):

# echo 2048 > /sys/fs/cgroup/pids/system.slice/ssh.service/pids.max

(permanent; don't forget to restart systemd and the ssh service):

mator@deb4g:~$ grep TasksMax /etc/systemd/system.conf
DefaultTasksMax=2048
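
To confirm which value is actually set, rather than a commented-out default, something like the following sketch could be used. The function name default_tasks_max is hypothetical, and the sketch only looks at a single file, so drop-ins under /etc/systemd/system.conf.d/ are ignored:

```shell
# Print the DefaultTasksMax value set in a systemd manager config file.
# Commented-out lines (e.g. "#DefaultTasksMax=512") are skipped, and the
# last assignment in the file is the one reported.
# default_tasks_max is a hypothetical helper name.
default_tasks_max() {
    conf=${1:-/etc/systemd/system.conf}
    sed -n 's/^[[:space:]]*DefaultTasksMax[[:space:]]*=[[:space:]]*//p' "$conf" |
        tail -n 1
}
```

Empty output means the setting is not overridden in that file. For the restart, I am assuming "systemctl daemon-reexec" followed by "systemctl restart ssh" is what is meant; the new limit should then show up in pids.max for ssh.service.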


I suggest increasing the systemd/cgroups compile-time default of 512 PIDs per user to some greater value, for example 2048.
Comment 4 John Paul Adrian Glaubitz 2016-04-28 14:54:41 UTC
(In reply to Anatoly Pugachev from comment #3)
> As we found out yesterday, it was a Debian sid systemd/cgroups limit of at
> most 512 PIDs allowed for processes (per user):

Yes, after adjusting DefaultTasksMax in /etc/systemd/system.conf, the problem goes away. The testsuite still fails, however, but that belongs in a different bug report.

Adrian