SIGKILL may no longer work after many SIGCONT/SIGSTOP signals

Takashi Yano takashi.yano@nifty.ne.jp
Tue Nov 19 09:21:52 GMT 2024


On Tue, 12 Nov 2024 10:53:58 +0100
Christian Franke wrote:
> Found with 'stress-ng --cpu-sched' from current stress-ng upstream HEAD:
> 
> Testcase (attached):
> 
> $ gcc -O2 -o manysignals manysignals.c
> 
> $ ./manysignals
> fork() = 1833
> ...
> fork() = 1848
> ...
> kill(1833, 17)
> ...
> kill(1848, 17)
> kill(1833, 9)
> ...
> kill(1848, 9)
> waitpid(1833, ., 0)
> 
> 
> Run this in second terminal:
> 
> $ watch "ps | sed -n '1p;/manysignals/{/sed/d;p}'"
> 
> If 'S' appears in the first column, the child processes have likely
> reached the final SIGSTOP state. This takes some time. The parent
> process may still hang in the first waitpid() but should not.
> 
> If the parent process is aborted with ^C, child processes may be stopped
> or left behind. Occasionally a child process that cannot be stopped by
> Cygwin (kill -9) is left behind.
> 
> Tested with ancient (i7-2600K) and more recent (i7-14700K) CPU :-)
> 
> 
> Unrelated to the above, but related to 'stress-ng --cpu-sched' which 
> uses sched_get/setscheduler():
> 
> - sched_getscheduler() always returns SCHED_FIFO. As far as I understand
> Linux sched(7), this is a non-preemptive real-time policy. The
> preemptive SCHED_RR would possibly be a more reasonable value.
> Unfortunately SCHED_OTHER cannot be used because it would require
> ignoring the priority.
> 
> - sched_setscheduler() always fails with ENOSYS. IMO it should allow
> setting 'param->sched_priority' if 'policy' is equal to the value
> returned by sched_getscheduler().

Thanks for the report and the test case. I'm now looking into
the issue. Please wait a while.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

