With 2.31, the number of testsuite failures on Linux/sparc64 has dropped dramatically to just three. Two of the remaining failures are nptl/tst-mutex8-static and nptl/tst-mutexpi8-static; for a full log see:

https://buildd.debian.org/status/fetch.php?pkg=glibc&arch=sparc64&ver=2.31-0experimental0&stamp=1584003885&raw=0

Can these failures be safely ignored, or do they indicate a larger problem?
The issue seems to be that libgcc is stuck in an infinite loop trying to unwind the canceled thread:

(gdb) thread apply all bt

Thread 3 (LWP 421806):
#0  binary_search_single_encoding_fdes (pc=0x110343 <kill+35>, ob=0x2e) at /home/azanella/toolchain/src/gcc/libgcc/unwind-dw2-fde.c:936
#1  search_object (ob=ob@entry=0x2a9c18 <object>, pc=pc@entry=0x110343 <kill+35>) at /home/azanella/toolchain/src/gcc/libgcc/unwind-dw2-fde.c:1005
#2  0x0000000000183dc8 in _Unwind_Find_registered_FDE (bases=0xfff8000100806448, pc=0x110343 <kill+35>) at /home/azanella/toolchain/src/gcc/libgcc/unwind-dw2-fde.c:1054
#3  _Unwind_Find_FDE (pc=0x110343 <kill+35>, bases=bases@entry=0xfff8000100806448) at /home/azanella/toolchain/src/gcc/libgcc/unwind-dw2-fde-dip.c:458
#4  0x000000000017fd54 in uw_frame_state_for (context=context@entry=0xfff80001008060f0, fs=fs@entry=0xfff8000100805570) at /home/azanella/toolchain/src/gcc/libgcc/unwind-dw2.c:1249
#5  0x00000000001816dc in _Unwind_ForcedUnwind_Phase2 (exc=exc@entry=0xfff8000100807d70, context=context@entry=0xfff80001008060f0) at /home/azanella/toolchain/src/gcc/libgcc/unwind.inc:155
#6  0x0000000000181d04 in _Unwind_ForcedUnwind (exc=0xfff8000100807d70, stop=stop@entry=0x10a7a0 <unwind_stop>, stop_argument=stop_argument@entry=0xfff8000100806a20) at /home/azanella/toolchain/src/gcc/libgcc/unwind.inc:207
#7  0x000000000010a8e8 in __pthread_unwind (buf=0xfff8000100806a20) at unwind.c:121
#8  0x00000000001097d0 in __do_cancel () at ./pthreadP.h:311
#9  sigcancel_handler (sig=<optimized out>, si=0xfff8000100806700, ctx=0xfff8000100806700) at nptl-init.c:162
#10 <signal handler called>
#11 0x000000000010709c in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x2a9c7c <c+44>) at ../sysdeps/nptl/futex-internal.h:183
#12 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7feffffeaf8, cond=0x2a9c50 <c>) at pthread_cond_wait.c:508
#13 __pthread_cond_wait (cond=cond@entry=0x2a9c50 <c>, mutex=0x7feffffeaf8) at pthread_cond_wait.c:638
#14 0x0000000000101114 in tf (arg=0x1) at ../sysdeps/pthread/tst-mutex8.c:74
#15 0x0000000000103a78 in start_thread (arg=0xfff8000100807900) at pthread_create.c:473
#16 0x000000000013666c in __thread_start () at ../sysdeps/unix/sysv/linux/sparc/sparc64/clone.S:77
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 1 (LWP 421802):
#0  0x0000000000104ee4 in __pthread_clockjoin_ex (threadid=14, thread_return=0xe, clockid=<optimized out>, abstime=0xe, block=<optimized out>) at pthread_join_common.c:145
#1  0x0000000000000016 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

The other issues (nptl/tst-cond8-static, nptl/tst-cancel24-static) seem to follow the same pattern. I am not sure whether this is a code-generation issue (since the dynamically linked test does not fail) or a missing directive.

I thought it might be related to b33e946fbb1659d2c5937 (sparc: Move sigreturn stub to assembly), with a missing CFI directive confusing the libgcc unwinder. I tried a C implementation built with -fexceptions and -fasynchronous-unwind-tables, but it didn't change the outcome.
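For anyone who wants to poke at this outside the testsuite, the pattern in the backtrace (a thread canceled while blocked in pthread_cond_wait, so that cancellation has to unwind through the signal frame) can be sketched in a few lines of C. This is only a rough sketch and not the code from tst-mutex8.c: the names run_cancel_reproducer, worker, cleanup, and the variables are made up for illustration, and I am assuming that a static build is what triggers the problem, as the failing tests suggest.

```c
#include <pthread.h>
#include <stddef.h>

/* All names below are hypothetical; this is not tst-mutex8.c. */
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready = PTHREAD_COND_INITIALIZER; /* worker -> main */
static pthread_cond_t c = PTHREAD_COND_INITIALIZER;     /* never signaled */
static int waiting;     /* set by the worker while it holds m */
static int cleanup_ran; /* set by the cancellation cleanup handler */

static void cleanup(void *arg)
{
    /* pthread_cond_wait reacquires the mutex before the handlers run,
       so the handler is responsible for releasing it. */
    cleanup_ran = 1;
    pthread_mutex_unlock(arg);
}

static void *worker(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&m);
    pthread_cleanup_push(cleanup, &m);
    waiting = 1;
    pthread_cond_signal(&ready);
    for (;;)
        pthread_cond_wait(&c, &m); /* cancellation point; blocks forever */
    pthread_cleanup_pop(0);        /* not reached, pairs with the push */
    return NULL;
}

/* Returns 0 if the worker was canceled and its cleanup handler ran. */
int run_cancel_reproducer(void)
{
    pthread_t th;
    void *res;

    pthread_mutex_lock(&m);
    if (pthread_create(&th, NULL, worker, NULL) != 0) {
        pthread_mutex_unlock(&m);
        return 1;
    }
    /* Once this loop exits we hold m again, which means the worker has
       released it inside pthread_cond_wait (&c, &m). */
    while (!waiting)
        pthread_cond_wait(&ready, &m);
    pthread_mutex_unlock(&m);

    if (pthread_cancel(th) != 0)
        return 1;
    if (pthread_join(th, &res) != 0)
        return 1;
    return (res == PTHREAD_CANCELED && cleanup_ran) ? 0 : 1;
}
```

Calling run_cancel_reproducer from main and linking with -static -pthread should give a small stand-in for the failing tests. If the unwinder loops as in the backtrace above, I would expect pthread_join not to return, but I have not verified that this sketch reproduces the failure on sparc64.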
It is the same issue as BZ#31244, where the rewrite done by b33e946fbb1659d2c5937c4dd756a7c49a132dff was not fully correct regarding CFI annotation. I will send a fix similar to the one proposed for the sparc32 issue:

diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/sigreturn_stub.S b/sysdeps/unix/sysv/linux/sparc/sparc64/sigreturn_stub.S
index 12af289375..3134337e25 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/sigreturn_stub.S
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/sigreturn_stub.S
@@ -23,7 +23,10 @@
    [1] https://lkml.org/lkml/2016/5/27/465  */
 
-ENTRY (__rt_sigreturn_stub)
+	nop
+	nop
+
+ENTRY_NOCFI (__rt_sigreturn_stub)
 	mov	__NR_rt_sigreturn, %g1
 	ta	0x6d
-END (__rt_sigreturn_stub)
+END_NOCFI (__rt_sigreturn_stub)

It fixes the regressions I saw on sparc64:

FAIL: nptl/tst-cancel24-static
FAIL: nptl/tst-cond8-static
FAIL: nptl/tst-mutex8-static
FAIL: nptl/tst-mutexpi8-static
FAIL: nptl/tst-mutexpi9

*** This bug has been marked as a duplicate of bug 31244 ***