Bug 26177 - eu-unstrip -n -k fails on kernels 4.14-4.19
Status: RESOLVED FIXED
Alias: None
Product: elfutils
Classification: Unclassified
Component: libdw
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-06-26 21:37 UTC by Vitaly Chikunov
Modified: 2020-06-28 13:46 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed: 2020-06-27 00:00:00


Description Vitaly Chikunov 2020-06-26 21:37:05 UTC
Error behavior:

 4.19.127-rt-alt2.rt54:~# eu-unstrip -n -k
 eu-unstrip: cannot load kernel symbols: Exec format error

Bug investigation:

In kernels from 4.14 up to 4.19 (which are still maintained stable versions) there is a __entry_SYSCALL_64_trampoline symbol. In /proc/kallsyms it is located just before the module symbols, and its address is in a much lower address range than the rest of the kernel. Example:

 4.19.127-rt54:~# grep -2 -m2 -i ' [rt] ' /proc/kallsyms
 000000000009af40 A rt_uncached_list
 000000000009af80 A __per_cpu_end
 ffffffff81000000 T startup_64 <- this will be kernel 'start' address
 ffffffff81000000 T _stext
 ffffffff81000000 T _text
 ffffffff81000030 T secondary_startup_64

 4.19.127-rt54:~# grep -2 __entry_SYSCALL_64_trampoline /proc/kallsyms
 ffffffff8282c000 B _end
 ffffffff8282c000 B __brk_limit
 fffffe0000006000 t __entry_SYSCALL_64_trampoline
 fffffe0000032000 t __entry_SYSCALL_64_trampoline
 fffffe000005e000 t __entry_SYSCALL_64_trampoline
 fffffe000008a000 t __entry_SYSCALL_64_trampoline
 fffffe00000b6000 t __entry_SYSCALL_64_trampoline
 fffffe00000e2000 t __entry_SYSCALL_64_trampoline
 fffffe000010e000 t __entry_SYSCALL_64_trampoline
 fffffe000013a000 t __entry_SYSCALL_64_trampoline
 fffffe0000166000 t __entry_SYSCALL_64_trampoline
 fffffe0000192000 t __entry_SYSCALL_64_trampoline
 fffffe00001be000 t __entry_SYSCALL_64_trampoline
 fffffe00001ea000 t __entry_SYSCALL_64_trampoline
 fffffe0000216000 t __entry_SYSCALL_64_trampoline
 fffffe0000242000 t __entry_SYSCALL_64_trampoline
 fffffe000026e000 t __entry_SYSCALL_64_trampoline
 fffffe000029a000 t __entry_SYSCALL_64_trampoline
 fffffe00002c6000 t __entry_SYSCALL_64_trampoline
 fffffe00002f2000 t __entry_SYSCALL_64_trampoline
 fffffe000031e000 t __entry_SYSCALL_64_trampoline
 fffffe000034a000 t __entry_SYSCALL_64_trampoline
 fffffe0000376000 t __entry_SYSCALL_64_trampoline
 fffffe00003a2000 t __entry_SYSCALL_64_trampoline
 fffffe00003ce000 t __entry_SYSCALL_64_trampoline
 fffffe00003fa000 t __entry_SYSCALL_64_trampoline
 fffffe0000426000 t __entry_SYSCALL_64_trampoline
 fffffe0000452000 t __entry_SYSCALL_64_trampoline
 fffffe000047e000 t __entry_SYSCALL_64_trampoline
 fffffe00004aa000 t __entry_SYSCALL_64_trampoline
 fffffe00004d6000 t __entry_SYSCALL_64_trampoline
 fffffe0000502000 t __entry_SYSCALL_64_trampoline
 fffffe000052e000 t __entry_SYSCALL_64_trampoline
 fffffe000055a000 t __entry_SYSCALL_64_trampoline
 ffffffffc06ec024 r _note_6      [amd64_edac_mod]
 ffffffffc06ec040 r __ksymtab_amd64_get_dram_hole_info   [amd64_edac_mod]

The root of the problem is in libdwfl/linux-kernel-modules.c::intuit_kernel_bounds().

It calls read_address() until that returns false, which happens when a module symbol is detected (determined by the presence of ']' as the last character of the line). The last successfully read address is assumed to be the 'end' address of the kernel space.

In my example ffffffff8282c000 should be the 'end' address, but with __entry_SYSCALL_64_trampoline present the 'end' address is assumed to be fffffe000055a000. This makes intuit_kernel_bounds() return ENOEXEC, which is reported by eu-unstrip.
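
The effect can be reproduced outside libdwfl with a small model of the scan. This is only a sketch, not the elfutils code: scan_end_old and sample are hypothetical names, the lines are abbreviated from the listing above, and the final start < end comparison is an assumed stand-in for whatever sanity check makes intuit_kernel_bounds() give up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the scan described above: every non-module
   line overwrites *end, so the last line before the modules wins.  */
bool
scan_end_old (const char *const *lines, size_t nlines,
              uint64_t start, uint64_t *end)
{
  assert (lines != NULL && end != NULL);
  *end = start;
  for (size_t i = 0; i < nlines; i++)
    {
      size_t len = strlen (lines[i]);
      /* Module symbols end in "]\n"; that is where the scan stops.  */
      if (len < 2 || lines[i][len - 2] == ']')
        break;
      *end = strtoull (lines[i], NULL, 16);
    }
  /* Assumed sanity check: the kernel range must be non-empty.  */
  return start < *end;
}

/* Abbreviated /proc/kallsyms content taken from the example above.  */
const char *const sample[] = {
  "ffffffff81000000 T _text\n",
  "ffffffff8282c000 B _end\n",
  "fffffe000055a000 t __entry_SYSCALL_64_trampoline\n",
  "ffffffffc06ec024 r _note_6\t[amd64_edac_mod]\n",
};
```

Called with the four sample lines and a start of 0xffffffff81000000, the model ends up with the trampoline address as 'end', which lies below 'start', so the check fails.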

Possible fix:

Just ignore __entry_SYSCALL_64_trampoline in read_address(). Example patch:

diff --git libdwfl/linux-kernel-modules.c libdwfl/linux-kernel-modules.c
index 84a05f28..8c01ce13 100644
--- libdwfl/linux-kernel-modules.c
+++ libdwfl/linux-kernel-modules.c
@@ -502,12 +502,18 @@ struct read_address_state {
   const char *type;
 };

+#define ENTRY_TRAMPOLINE_NAME "__entry_SYSCALL_64_trampoline\n"
+
 static inline bool
 read_address (struct read_address_state *state, Dwarf_Addr *addr)
 {
   if ((state->n = getline (&state->line, &state->linesz, state->f)) < 1 ||
       state->line[state->n - 2] == ']')
     return false;
+  if (state->n > sizeof(ENTRY_TRAMPOLINE_NAME) &&
+      !strcmp(ENTRY_TRAMPOLINE_NAME,
+             state->line + state->n - sizeof(ENTRY_TRAMPOLINE_NAME) + 1))
+    return false;
   *addr = strtoull (state->line, &state->p, 16);
   state->p += strspn (state->p, " \t");
   state->type = strsep (&state->p, " \t\n");

After the suggested fix is applied:

 4.19.127-rt-alt2.rt54:~/src/elfutils# LD_LIBRARY_PATH=libdw src/unstrip -n -k
 0xffffffff81000000+0x182c000 50fc00d319e70ef32e231476c0a636d9622f7764@0xffffffff81a031e4 /usr/lib/debug/lib/modules/4.19.127-rt-alt2.rt54/vmlinux /usr/lib/debug/usr/lib/debug/lib/modules/4.19.127-rt-alt2.rt54/vmlinux.debug kernel
 0xffffffffc06e9000+0x9000 7192935572c1296e7e4f5bc3de374c3eb21d1643@0xffffffffc06ec010 /lib/modules/4.19.127-rt-alt2.rt54/kernel/drivers/edac/amd64_edac_mod.ko.xz /usr/lib/debug/lib/modules/4.19.127-rt-alt2.rt54/kernel/drivers/edac/amd64_edac_mod.ko.debug amd64_edac_mod
 ...
Comment 1 Mark Wielaard 2020-06-27 23:07:48 UTC
Thanks for the analysis. These __entry_SYSCALL_64_trampoline symbols are really odd. Your suggested fix works, but I would like to not depend on the specific symbol name if at all possible.

As you say, the problem is that they come after the last kernel address, but before the module addresses. And they are smaller than the start address we found.

The following patch, which checks that each address read is not smaller than the start address we found, seems to fix it for me. Does it work for you?

diff --git a/libdwfl/linux-kernel-modules.c b/libdwfl/linux-kernel-modules.c
index 84a05f28..5e6cf275 100644
--- a/libdwfl/linux-kernel-modules.c
+++ b/libdwfl/linux-kernel-modules.c
@@ -538,10 +538,14 @@ intuit_kernel_bounds (Dwarf_Addr *start, Dwarf_Addr *end, Dwarf_Addr *notes)
 
   if (result == 0)
     {
+      Dwarf_Addr addr;
       *end = *start;
-      while (read_address (&state, end))
-       if (*notes == 0 && !strcmp (state.p, "__start_notes\n"))
-         *notes = *end;
+      while (read_address (&state, &addr) && addr >= *start)
+       {
+         *end = addr;
+         if (*notes == 0 && !strcmp (state.p, "__start_notes\n"))
+           *notes = *end;
+       }
 
       Dwarf_Addr round_kernel = sysconf (_SC_PAGESIZE);
       *start &= -(Dwarf_Addr) round_kernel;
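
For illustration, the loop condition in the patch above can be modelled on a few in-memory lines. This is a rough sketch rather than the libdwfl implementation: scan_end_fixed and kallsyms_sample are hypothetical names, and parsing is reduced to strtoull plus the ']' test for module lines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the patched loop: stop at the first module
   line or at the first address below start, and keep the last
   accepted address as the end.  */
uint64_t
scan_end_fixed (const char *const *lines, size_t nlines, uint64_t start)
{
  assert (lines != NULL);
  uint64_t end = start;
  for (size_t i = 0; i < nlines; i++)
    {
      size_t len = strlen (lines[i]);
      if (len < 2 || lines[i][len - 2] == ']')
        break;                  /* module symbol: end of kernel part */
      uint64_t addr = strtoull (lines[i], NULL, 16);
      if (addr < start)
        break;                  /* address went backwards: stop here */
      end = addr;
    }
  return end;
}

/* Abbreviated /proc/kallsyms content from the report.  */
const char *const kallsyms_sample[] = {
  "ffffffff81000000 T _text\n",
  "ffffffff8282c000 B _end\n",
  "fffffe0000006000 t __entry_SYSCALL_64_trampoline\n",
  "ffffffffc06ec024 r _note_6\t[amd64_edac_mod]\n",
};
```

With a start of 0xffffffff81000000 the model stops at the first trampoline line and returns 0xffffffff8282c000, without looking at any symbol name.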
Comment 2 Vitaly Chikunov 2020-06-27 23:12:40 UTC
Yes, your fix works for me too. Thanks.
Comment 3 Mark Wielaard 2020-06-28 13:35:57 UTC
Thanks for the report, analysis and testing.

commit eff30a6dabe52ac77ee5c6a0d31853fc8e3aeadb
Author: Mark Wielaard <mark@klomp.org>
Date:   Sun Jun 28 15:27:25 2020 +0200

    libdwfl: read_address should use increasing address in intuit_kernel_bounds
    
    In kernels from 4.14 up to 4.19 in /proc/kallsyms there are special
    __entry_SYSCALL_64_trampoline symbols. The problem is that they come
    after the last kernel address, but before the module addresses.
    And they are (much) smaller than the start address we found. This
    confuses intuit_kernel_bounds and makes it fail.
    
    Make sure to check read_address returns an increasing address when
    searching for the end.
    
    https://sourceware.org/bugzilla/show_bug.cgi?id=26177
    
    Reported-by: Vitaly Chikunov <vt@altlinux.org>
    Signed-off-by: Mark Wielaard <mark@klomp.org>
Comment 4 Vitaly Chikunov 2020-06-28 13:46:03 UTC
Thanks much!