Hang or crash after multiple SIGILL or SIGSEGV and siglongjmp
Takashi Yano
takashi.yano@nifty.ne.jp
Fri May 2 11:25:37 GMT 2025
On Tue, 25 Mar 2025 14:38:35 +0100
Christian Franke wrote:
> Found because 'stress-ng --priv-instr ...' hangs and then requires
> '/bin/kill --force ...':
>
> Testcase with
> [PATCH v2] Cygwin: signal: Copy context to alternate stack in the
> SA_ONSTACK case
> already applied:
>
> $ uname -r
> 3.7.0-dev-16-g2ef1a37e7823-dirty.x86_64
>
> $ cat loopsigill.c
> #include <setjmp.h>
> #include <signal.h>
> #include <stdio.h>
> #include <unistd.h>
>
> static volatile sig_atomic_t sigcnt;
> static sigjmp_buf sjb;
>
> static void sighandler(int sig)
> {
>   (void)sig;
>   ++sigcnt;
>   siglongjmp(sjb, 1);
>   write(1, "[FAIL]\n", 7);
> }
>
> int main()
> {
>   signal(SIGILL, sighandler);
>   printf("pid=%d\n", (int)getpid());
>
>   while (sigsetjmp(sjb, 1))
>     ;
>
>   // loop:
>   if (sigcnt < 10 || !(sigcnt % 1000))
>     printf("%06d\n", sigcnt);
>   if (sigcnt >= 100000)
>     return 42;
>   asm volatile ("invd"); // goto loop;
>
>   return 13; // NOT REACHED
> }
>
> $ gcc -o loopsigill loopsigill.c
>
> $ ./loopsigill # may succeed ...
> pid=122
> 000000
> 000001
> ...
> 099000
> 100000
>
> $ echo $?
> 42
>
> $ ./loopsigill # ... or crash silently ...
> pid=130
> 000000
> 000001
> ...
> 026000
> 027000
>
> $ echo $?
> 0
>
> $ ./loopsigill # ... or hang
> pid=135
> 000000
> 000001
> ...
> 037000
> 038000
> [requires '/bin/kill --force ...']
>
> $ strace -o trace.log ./loopsigill # run '/bin/kill --force ...' ASAP!
> pid=142
> 000000
> [always hangs after first signal and fills trace.log quickly]
>
> $ less trace.log
> ...
> 25 25501 [main] loopsigill 142 write: 7 = write(1, 0xA00017710, 7)
> --- Process 6856 (pid: 142), exception c0000096 at 00000001004011b9
> 142 25643 [main] loopsigill 142 exception::handle: In cygwin_except_handler exception 0xC0000096 at 0x1004011B9 sp 0x7FFFFCBE0
> 26 25669 [main] loopsigill 142 exception::handle: In cygwin_except_handler signal 4 at 0x1004011B9
> 38 25707 [main] loopsigill 142 break_here: break here
> --- Process 6856 (pid: 142), exception c0000096 at 00000001004011b9
> --- Process 6856 (pid: 142), exception c0000096 at 00000001004011b9
> ... likely repeated until disk is full or time_t wraps around...
> --- Process 6856 (pid: 142), exception c0000096 at 00000001004011b9
>
>
> Problem also occurs
> - without the mentioned patch,
> - with get/setcontext() instead of sig*jmp(),
> - with nullptr access and SIGSEGV handler,
> - with Cygwin 3.5.7-1.
>
> I agree that this is not a common use case :-)
Thanks for the report, and sorry for keeping you waiting so long.
I have finally been able to fix the issue and will push the patch shortly.
--
Takashi Yano <takashi.yano@nifty.ne.jp>