Bug 31294 - gcores do not work in 32-bit arm targets
Summary: gcores do not work in 32-bit arm targets
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: corefiles
Version: HEAD
Importance: P2 normal
Target Milestone: 15.1
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-01-25 15:25 UTC by Guinevere Larsen
Modified: 2024-03-12 07:04 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Description Guinevere Larsen 2024-01-25 15:25:00 UTC
The newly introduced test gdb.threads/threadcrash.exp has revealed a pre-existing issue on 32-bit Arm.

Using "info threads" on a regular corefile gives the following thread list:

info threads
  Id   Target Id                      Frame 
* 1    Thread 0xf7dbe7e0 (LWP 476389) 0x00830cea in crash_function () at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:381
  2    Thread 0xf7c6f3a0 (LWP 476390) do_spin_task (location=NORMAL) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  3    Thread 0xf746e3a0 (LWP 476391) do_spin_task (location=SIGNAL_HANDLER) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  4    Thread 0xf6c6d3a0 (LWP 476392) do_spin_task (location=SIGNAL_ALT_STACK) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  5    Thread 0xf52fe3a0 (LWP 476395) __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  6    Thread 0xf646c3a0 (LWP 476393) __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  7    Thread 0xf5aff3a0 (LWP 476394) __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46

Whereas the same command when loading a gcore yields:

info threads
  Id   Target Id         Frame 
* 1    LWP 476440        0x00400cea in crash_function () at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:381
  2    LWP 476442        do_spin_task (location=NORMAL) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  3    LWP 476443        do_spin_task (location=SIGNAL_HANDLER) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  4    LWP 476444        do_spin_task (location=SIGNAL_ALT_STACK) at /home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
  5    LWP 476445        0xf7eadb04 in ?? ()
  6    LWP 476446        0xf7eadb04 in ?? ()
  7    LWP 476447        0xf7eadb04 in ?? ()

Notice that the threads are in the same order in both lists, but with the gcore, threads 5, 6 and 7 all fail to unwind. This can be seen in the Linaro CI bug GNU-1120[1], even though the main focus of that bug is unrelated to this.

I'm opening this mostly as a paper trail to add a KFAIL to the test, but it should probably be fixed at some point.

[1] https://linaro.atlassian.net/browse/GNU-1120
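
The comparison of the two thread lists above can be automated. The following is only a minimal sketch, not part of the GDB testsuite: it assumes the "info threads" output has been captured as text, and the helper name unresolved_threads and the sample (with source paths abbreviated) are hypothetical. It reports the thread Ids whose innermost frame GDB could not resolve to a symbol.

```python
import re

# One row of GDB's "info threads" table: an optional "*" marking the
# current thread, the numeric thread Id, then the rest of the row.
ROW = re.compile(r"^\*?\s*(\d+)\s+(.*)$")


def unresolved_threads(info_threads_output):
    """Return the Ids of threads whose innermost frame ends in "in ?? ()".

    Hypothetical helper for illustration; it only inspects text and does
    not talk to GDB.
    """
    ids = []
    for line in info_threads_output.splitlines():
        m = ROW.match(line)
        if m and m.group(2).rstrip().endswith("in ?? ()"):
            ids.append(int(m.group(1)))
    return ids


# Abbreviated sample modelled on the gcore output in this report.
sample = """\
  Id   Target Id         Frame
* 1    LWP 476440        0x00400cea in crash_function () at threadcrash.c:381
  2    LWP 476442        do_spin_task (location=NORMAL) at threadcrash.c:139
  5    LWP 476445        0xf7eadb04 in ?? ()
  6    LWP 476446        0xf7eadb04 in ?? ()
  7    LWP 476447        0xf7eadb04 in ?? ()
"""

print(unresolved_threads(sample))
```

An empty result for the regular corefile and a non-empty one for the gcore would correspond to the symptom described here.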
Comment 1 Guinevere Larsen 2024-01-30 14:33:24 UTC
The rest of the gcore section of gdb.threads/threadcrash.exp has even worse results. The log for "thread apply all backtrace" is as follows:

thread apply all backtrace

Thread 7 (LWP 776476):
#0  0xf7eadb04 in ?? ()
#1  0xf7f13a7e in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 6 (LWP 776475):
#0  0xf7eadb04 in ?? ()
#1  0xf7f13a7e in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 5 (LWP 776474):
#0  0xf7eadb04 in ?? ()
#1  0xf7f13a7e in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 4 (LWP 776473):
#0  do_spin_task (location=SIGNAL_ALT_STACK) at /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
#1  0x00400a5e in signal_handler (signo=10) at /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:241
#2  <signal handler called>
#3  0xf7eadb06 in ?? ()
#4  0xf7eed292 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 3 (LWP 776472):
#0  do_spin_task (location=SIGNAL_HANDLER) at /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:139
#1  0x00400a5e in signal_handler (signo=10) at /home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gdb.git~master/gdb/testsuite/gdb.threads/threadcrash.c:241
#2  <signal handler called>
#3  0xf7eadb06 in ?? ()
#4  0xf7eed292 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

It seems that either the writing or the reading of gcores is thoroughly broken.
Comment 2 Thiago Jung Bauermann 2024-01-31 03:54:13 UTC
I'm currently looking into bug #31254 which has similar symptoms.

I narrowed the problem in that bug down to the "arm exidx" unwinder, which uses C++ exception tables to unwind frames. I think there's a memory corruption issue with one of its data structures. I'm hoping to have more information tomorrow.
Comment 3 Thiago Jung Bauermann 2024-02-01 19:12:23 UTC
(In reply to Thiago Jung Bauermann from comment #2)
> I'm currently looking into bug #31254 which has similar symptoms.

Tom de Vries posted a fix to that bug, but even with it applied I can still reproduce this problem, so they are different issues.
Comment 4 Thiago Jung Bauermann 2024-03-11 23:01:06 UTC
I can't reproduce this problem anymore, so I ran git bisect, which found that commit 9c0aa4c53104 ("Fix disabling of year 2038 support on 32-bit hosts by default") fixed the bug.
Comment 5 Tom de Vries 2024-03-12 07:04:29 UTC
Fixed.