Bug 23732 - Test failures on sparc64-linux-gnu
Summary: Test failures on sparc64-linux-gnu
Product: elfutils
Component: general
Assignee: Jose E. Marchesi
Reported: 2018-10-02 16:50 UTC by Frank Schaefer
Modified: 2023-01-18 14:41 UTC (History)
Last reconfirmed: 2018-10-03


Description Frank Schaefer 2018-10-02 16:50:36 UTC
elfutils-0.174 (and at least as far back as 0.170) is failing three tests for me:

   elfutils 0.174: tests/test-suite.log

# TOTAL: 202
# PASS:  194
# SKIP:  5
# XFAIL: 0
# FAIL:  3
# XPASS: 0
# ERROR: 0


FAIL: run-strip-nothing.sh

/usr/src/elfutils-0.174/src/elfcmp: a.out strip.out differ: section [5] '.dynsym' header
*** failed strip.out different from a.out
FAIL run-strip-nothing.sh (exit status: 255)

FAIL: run-backtrace-native.sh

0x10000000000   0x10000104000   /usr/src/elfutils-0.174/tests/backtrace-child
0xffff800100000000      0xffff800100126000      /lib/sparc64-linux-gnu/ld-2.27.so
0xffff800100128000      0xffff800100244000      /lib/sparc64-linux-gnu/libpthread-2.27.so
0xffff80010047a000      0xffff80010047c000      [vdso: 91062]
0xffff80010047c000      0xffff8001006e8000      /lib/sparc64-linux-gnu/libc-2.27.so
TID 91062:
# 0 0xffff80010013beb0          raise
TID 91067:
# 0 0xffff80010013beb0          raise
/usr/src/elfutils-0.174/tests/backtrace: dwfl_thread_getframes: No DWARF information found
/usr/src/elfutils-0.174/tests/backtrace: dwfl_thread_getframes: No DWARF information found
backtrace: backtrace.c:81: callback_verify: Assertion `seen_main' failed.
./test-subr.sh: line 84: 91057 Aborted                 (core dumped) LD_LIBRARY_PATH="${built_library_path}${LD_LIBRARY_PATH:+:}$LD_LIBRARY_PATH" $VALGRIND_CMD "$@"
backtrace-child: no main
FAIL run-backtrace-native.sh (exit status: 1)

SKIP: run-backtrace-data.sh

/usr/src/elfutils-0.174/tests/backtrace-data: Unwinding not supported for this architecture
data: arch not supported
SKIP run-backtrace-data.sh (exit status: 77)
FAIL: run-backtrace-dwarf.sh

0xffff8001004d88d4      raise
/usr/src/elfutils-0.174/tests/backtrace-dwarf: dwfl_thread_getframes: No DWARF information found
dwarf: no main
FAIL run-backtrace-dwarf.sh (exit status: 1)

SKIP: run-backtrace-native-biarch.sh

biarch testing disabled
SKIP run-backtrace-native-biarch.sh (exit status: 77)

SKIP: run-backtrace-native-core.sh

No match found.
-- Notice: 2 systemd-coredump@.service units are running, output may be incomplete.
No match found.
-- Notice: 1 systemd-coredump@.service unit is running, output may be incomplete.
No match found.
-- Notice: 1 systemd-coredump@.service unit is running, output may be incomplete.
No core.91173 file generated
SKIP run-backtrace-native-core.sh (exit status: 77)

SKIP: run-backtrace-native-core-biarch.sh

biarch testing disabled
SKIP run-backtrace-native-core-biarch.sh (exit status: 77)

SKIP: run-lfs-symbols.sh

LFS testing is irrelevent on this system
SKIP run-lfs-symbols.sh (exit status: 77)

Testsuite summary for elfutils 0.174
# TOTAL: 202
# PASS:  194
# SKIP:  5
# XFAIL: 0
# FAIL:  3
# XPASS: 0
# ERROR: 0

(run-backtrace-native-core-biarch.sh was also failing, but I've put that on the back burner for now while I dig into the other failures.  I can turn it back on if someone wants the failure output.)

run-strip-nothing.sh appears to rearrange the ELF section contents without actually deleting anything obvious, but whatever it changes is enough to upset elfcmp.  I actually examined the .dynsym section as best I knew how (with binutils "readelf --dyn-syms"), and the resulting dump looked identical for both a.out and strip.out.
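
Note that elfcmp complains about the section *header* of '.dynsym', not its contents, and "readelf --dyn-syms" only dumps the contents. A minimal sketch of a field-by-field header comparison follows, assuming the standard 64-byte Elf64_Shdr layout and big-endian data as on sparc64. The helper names parse_shdr and diff_shdr are made up for illustration, and locating the raw header bytes in each file (via e_shoff and e_shentsize) is left out.

```python
import struct

# Elf64_Shdr fields in on-disk order (ELF-64 object file format).
SHDR_FIELDS = ("sh_name", "sh_type", "sh_flags", "sh_addr", "sh_offset",
               "sh_size", "sh_link", "sh_info", "sh_addralign", "sh_entsize")
SHDR_FORMAT = "IIQQQQIIQQ"  # 2*4 + 4*8 + 2*4 + 2*8 = 64 bytes


def parse_shdr(raw, big_endian=True):
    """Decode one raw 64-byte Elf64_Shdr into a dict of field values."""
    prefix = ">" if big_endian else "<"
    values = struct.unpack(prefix + SHDR_FORMAT, raw[:64])
    return dict(zip(SHDR_FIELDS, values))


def diff_shdr(a, b):
    """Return {field: (value_in_a, value_in_b)} for every differing field."""
    return {name: (a[name], b[name])
            for name in SHDR_FIELDS if a[name] != b[name]}
```

Feeding the '.dynsym' header bytes of a.out and strip.out through diff_shdr(parse_shdr(x), parse_shdr(y)) would name the field that differs. "readelf -S" prints the same header fields (Link, Info, Align) if you prefer to compare by eye.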

run-backtrace-native.sh is pretty obvious: it doesn't see main() in the backtrace, probably because it can't actually *get* the full backtrace without DWARF info.  The run-backtrace-dwarf.sh failure can probably be traced back to the same root cause, and I assume the related biarch test was failing for the same reason.

For background, this is on a mostly-vanilla 4.16.18 kernel, gcc-8.2.0, binutils-2.31.1, glibc-2.27, with SPARC T2000 CPUs.  Typical build flags are "-O2 -m64 -mcpu=v9 -mtune=v9", although I've tried different optimization levels and different (or absent) -mcpu/-mtune flags, with no apparent difference.  gdb is not installed yet, because I'm still in the process of bootstrapping a distro build from scratch.  I can supply the raw test-suite.log on request.
Comment 1 Jose E. Marchesi 2018-10-03 12:33:47 UTC
Hi Frank.

Of the failures you report, I can only reproduce the failure in run-strip-nothing.sh.  Looking at it.

Regarding the backtrace failures, they are probably due to the fact that the debugging information for glibc is not available on your system.

Comment 2 Jose E. Marchesi 2018-10-04 10:19:17 UTC
All right, this turned out to be a BFD bug, not an elfutils bug.

Fixed with the binutils commit below.

commit 6d0a6093c5fe82eb4c2b67d3d10fa44eeb0bc98b
Author: Jose E. Marchesi <jose.marchesi@oracle.com>
Date:   Thu Oct 4 02:12:48 2018 -0700

    bfd,sparc: fix the .dynsym sh_index when stripping all symbols in ld

    The SPARC ELF BFD backend uses a hack in order to accommodate the
    STT_REGISTER symbols mandated by the SPARC V9 ABI for 64-bit objects.
    The hack works as follows:

    - Early in `size_dynamic_symbols', it adds the dynamic STT_REGISTER
      symbols and the corresponding DT_SPARC_REGISTER tags if needed,
      i.e. if the input object has been annotated by the assembler to use
      any of the global registers requiring annotations by the ABI.

      The STT_REGISTER symbols are not local, but nevertheless they are
      added to the end of the dynlocal linked list (eek, yes) to be fixed
      "later".  This is done so the symbols are emitted in the symtab.

    - Consequently, when the `sh_info' field of the .dynsym section is
      calculated in `bfd_elf_final_link' to be `local_dynsymcount + 1', it
      may have the wrong value, since the real first global symbol is the
      first STT_REGISTER symbol.

    - However, this temporary inconsistency is fixed in the
      `elf64_sparc_output_arch_syms' backend hook: the sh_index is
      adjusted to its rightful value.  So all is well and good.

    However the 2015 changeset

    commit 8539e4e89eb4c54bb6668582cd709765a3803588
    Author: Alan Modra <amodra@gmail.com>
    Date:   Thu Jan 15 19:42:59 2015 +1030

        Fix ARM fail of gap test

        ld-elf/gap test was failing due to the ARM backend attempting to output
        arch symbols when ld -s (strip all symbols) is in force.  This patch
        stops that happening and tidies the code a little.

    made the `elf_backend_output_arch_syms' backend hook to not be called
    when all symbols are to be stripped.  This resulted in an incorrect
    sh_index for .dynsym when a link is performed with -s (strip_all), in
    64-bit sparc ELF objects.

    This patch moves the sh_index adjusting code from the target
    `output_arch_syms' to `finish_dynamic_sections'.  It also removes the
    strip_all check from `elf64_sparc_output_arch_syms', as the function
    is no longer called in that case.

    Tested in sparc64-linux-gnu and sparc-linux-gnu.
    No regressions observed.

    2018-10-04  Jose E. Marchesi  <jose.marchesi@oracle.com>

            * elf64-sparc.c (elf64_sparc_output_arch_syms): Do not correct the
            impact of STT_REGISTER symbols in the dynsym sh_index here...
            * elfxx-sparc.c (_bfd_sparc_elf_finish_dynamic_sections): ... but
            do it here.
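
To illustrate the invariant the commit message relies on: for a symbol table section, the ELF gABI defines `sh_info' as one greater than the index of the last local (STB_LOCAL) symbol. The sketch below is not BFD code. It is a small model, with made-up helper names (make_st_info, expected_sh_info), that derives that value from the st_info bytes of a symbol table in which global STT_REGISTER entries follow the real locals.

```python
# Symbol binding and type constants from the ELF gABI and the SPARC psABI.
STB_LOCAL = 0
STB_GLOBAL = 1
STT_NOTYPE = 0
STT_FUNC = 2
STT_SECTION = 3
STT_SPARC_REGISTER = 13


def make_st_info(bind, sym_type):
    """Pack binding and type the way ELF stores them in st_info."""
    return (bind << 4) | (sym_type & 0xF)


def expected_sh_info(st_infos):
    """Return one greater than the index of the last STB_LOCAL symbol."""
    last_local = -1
    for index, info in enumerate(st_infos):
        if info >> 4 == STB_LOCAL:
            last_local = index
    return last_local + 1


# Hypothetical .dynsym: null symbol, one section symbol, two global
# register symbols, then an ordinary global function.
dynsym = [
    make_st_info(STB_LOCAL, STT_NOTYPE),
    make_st_info(STB_LOCAL, STT_SECTION),
    make_st_info(STB_GLOBAL, STT_SPARC_REGISTER),
    make_st_info(STB_GLOBAL, STT_SPARC_REGISTER),
    make_st_info(STB_GLOBAL, STT_FUNC),
]
```

In this made-up table expected_sh_info(dynsym) points at the first STT_REGISTER entry. A count that treated the register symbols as locals would point past them, which is the kind of mismatch described above.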