The newly introduced test gdb.threads/threadcrash.exp has revealed a pre-existing issue on 32-bit Arm. Using "info threads" on a regular corefile gives the following thread list:

info threads
  Id   Target Id                         Frame
* 1    Thread 0xf7dbe7e0 (LWP 476389) 0x00830cea in crash_function () at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:381
  2    Thread 0xf7c6f3a0 (LWP 476390) do_spin_task (location=NORMAL) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  3    Thread 0xf746e3a0 (LWP 476391) do_spin_task (location=SIGNAL_HANDLER) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  4    Thread 0xf6c6d3a0 (LWP 476392) do_spin_task (location=SIGNAL_ALT_STACK) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  5    Thread 0xf52fe3a0 (LWP 476395) __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  6    Thread 0xf646c3a0 (LWP 476393) __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  7    Thread 0xf5aff3a0 (LWP 476394) __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46

Whereas the same command when loading a gcore yields:

info threads
  Id   Target Id         Frame
* 1    LWP 476440 0x00400cea in crash_function () at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:381
  2    LWP 476442 do_spin_task (location=NORMAL) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  3    LWP 476443 do_spin_task (location=SIGNAL_HANDLER) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  4    LWP 476444 do_spin_task (location=SIGNAL_ALT_STACK) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  5    LWP 476445 0xf7eadb04 in ?? ()
  6    LWP 476446 0xf7eadb04 in ?? ()
  7    LWP 476447 0xf7eadb04 in ?? ()

Notice that the threads are in the same order, but with the gcore threads 5, 6 and 7 all fail to unwind.

This can be seen in the Linaro CI bug GNU-1120 [1], even though the main focus of that bug is unrelated to this.

I'm opening this mostly as a paper trail so that a KFAIL can be added to the test, but it should probably be fixed at some point.

[1] https://linaro.atlassian.net/browse/GNU-1120
The rest of the gcore section of gdb.threads/threadcrash.exp has even worse results. The log for "thread apply all backtrace" is as follows:

thread apply all backtrace

Thread 7 (LWP 776476):
#0  0xf7eadb04 in ?? ()
#1  0xf7f13a7e in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 6 (LWP 776475):
#0  0xf7eadb04 in ?? ()
#1  0xf7f13a7e in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 5 (LWP 776474):
#0  0xf7eadb04 in ?? ()
#1  0xf7f13a7e in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 4 (LWP 776473):
#0  do_spin_task (location=SIGNAL_ALT_STACK) at /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
#1  0x00400a5e in signal_handler (signo=10) at /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:241
#2  <signal handler called>
#3  0xf7eadb06 in ?? ()
#4  0xf7eed292 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 3 (LWP 776472):
#0  do_spin_task (location=SIGNAL_HANDLER) at /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
#1  0x00400a5e in signal_handler (signo=10) at /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:241
#2  <signal handler called>
#3  0xf7eadb06 in ?? ()
#4  0xf7eed292 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

It seems that either the writing or the reading of gcores is thoroughly broken.
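For triaging logs like the one above, a small script can list the threads whose innermost frame could not be symbolized. This is only a sketch and not part of the GDB testsuite: the function name and the sample text (with shortened source paths) are made up for illustration, and "in ?? ()" on frame #0 is assumed to be a good enough marker for a failed unwind.

```python
import re
from typing import List

# Matches the per-thread header printed by "thread apply all backtrace".
THREAD_RE = re.compile(r"^Thread (\d+) \(LWP (\d+)\):")
# Matches a frame line such as "#0  0xf7eadb04 in ?? ()".
FRAME_RE = re.compile(r"^#(\d+)\s+(.*)$")


def unresolved_innermost_frames(log: str) -> List[int]:
    """Return the ids of threads whose frame #0 shows "in ?? ()"."""
    broken = []
    current = None
    for raw in log.splitlines():
        # Tolerate literal "^M" markers or real carriage returns from test logs.
        line = raw.replace("^M", "").strip()
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            continue
        frame = FRAME_RE.match(line)
        if (frame and current is not None
                and frame.group(1) == "0" and "in ?? ()" in frame.group(2)):
            broken.append(current)
    return broken


# Hypothetical, abbreviated sample in the same shape as the log above.
sample = """\
Thread 7 (LWP 776476):
#0  0xf7eadb04 in ?? ()
#1  0xf7f13a7e in ?? ()

Thread 4 (LWP 776473):
#0  do_spin_task (location=SIGNAL_ALT_STACK) at threadcrash.c:139
#1  0x00400a5e in signal_handler (signo=10) at threadcrash.c:241
#2  <signal handler called>
#3  0xf7eadb06 in ?? ()
"""

print(unresolved_innermost_frames(sample))
```

Only frame #0 is checked, so thread 4 in the sample is not reported even though its outer frames are unresolved; checking every frame would be a one-line change if that is the behaviour of interest.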
I'm currently looking into bug #31254 which has similar symptoms. I narrowed the problem in that bug down to the "arm exidx" unwinder, which uses C++ exception tables to unwind frames. I think there's a memory corruption issue with one of its data structures. I'm hoping to have more information tomorrow.
(In reply to Thiago Jung Bauermann from comment #2)
> I'm currently looking into bug #31254 which has similar symptoms.

Tom de Vries posted a fix to that bug, but even with it applied I can still reproduce this problem, so they are different issues.
I can't reproduce this problem anymore, so I ran git bisect, which found that commit 9c0aa4c53104 ("Fix disabling of year 2038 support on 32-bit hosts by default") fixed the bug.
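The comment does not say how the bisect was driven. One possible setup for finding a fixing commit is to start with inverted terms (git bisect start --term-old=broken --term-new=fixed) and let git bisect run call a predicate on each step. The sketch below shows only the predicate logic. The function names are hypothetical, the "in ?? ()" check on frame #0 is an assumed heuristic, and rebuilding GDB and capturing the backtrace from the gcore at each step is left out.

```python
def bug_present(gdb_output: str) -> bool:
    """Assumed heuristic: the bug shows up as an unsymbolized frame #0."""
    return any(
        line.lstrip().startswith("#0") and "in ?? ()" in line
        for line in gdb_output.splitlines()
    )


def bisect_exit_code(gdb_output: str) -> int:
    # git bisect run treats exit code 0 as the "old" term and 1 as the "new"
    # term. With old=broken and new=fixed, a commit that still shows the bug
    # must therefore return 0, and a commit where it is gone must return 1.
    return 0 if bug_present(gdb_output) else 1


# Hypothetical outputs for a broken and a fixed build.
broken_log = "Thread 7 (LWP 776476):\n#0  0xf7eadb04 in ?? ()\n"
fixed_log = "Thread 7 (LWP 776476):\n#0  __libc_do_syscall () at libc-do-syscall.S:46\n"

print(bisect_exit_code(broken_log), bisect_exit_code(fixed_log))
```

A wrapper script would pass the captured GDB output to bisect_exit_code and exit with its result; a step where GDB fails to build could exit with 125 so that git bisect skips that commit.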
Fixed.