Error behavior:

  4.19.127-rt-alt2.rt54:~# eu-unstrip -n -k
  eu-unstrip: cannot load kernel symbols: Exec format error

Bug investigation:

In kernels from 4.14 up to 4.19 (which are still maintained stable versions) there is a __entry_SYSCALL_64_trampoline symbol. In /proc/kallsyms it is located just before the module symbols, and its address is much lower than the rest of the kernel.

Example:

  4.19.127-rt54:~# grep -2 -m2 -i ' [rt] ' /proc/kallsyms
  000000000009af40 A rt_uncached_list
  000000000009af80 A __per_cpu_end
  ffffffff81000000 T startup_64    <- this will be kernel 'start' address
  ffffffff81000000 T _stext
  ffffffff81000000 T _text
  ffffffff81000030 T secondary_startup_64

  4.19.127-rt54:~# grep -2 __entry_SYSCALL_64_trampoline /proc/kallsyms
  ffffffff8282c000 B _end
  ffffffff8282c000 B __brk_limit
  fffffe0000006000 t __entry_SYSCALL_64_trampoline
  fffffe0000032000 t __entry_SYSCALL_64_trampoline
  fffffe000005e000 t __entry_SYSCALL_64_trampoline
  fffffe000008a000 t __entry_SYSCALL_64_trampoline
  fffffe00000b6000 t __entry_SYSCALL_64_trampoline
  fffffe00000e2000 t __entry_SYSCALL_64_trampoline
  fffffe000010e000 t __entry_SYSCALL_64_trampoline
  fffffe000013a000 t __entry_SYSCALL_64_trampoline
  fffffe0000166000 t __entry_SYSCALL_64_trampoline
  fffffe0000192000 t __entry_SYSCALL_64_trampoline
  fffffe00001be000 t __entry_SYSCALL_64_trampoline
  fffffe00001ea000 t __entry_SYSCALL_64_trampoline
  fffffe0000216000 t __entry_SYSCALL_64_trampoline
  fffffe0000242000 t __entry_SYSCALL_64_trampoline
  fffffe000026e000 t __entry_SYSCALL_64_trampoline
  fffffe000029a000 t __entry_SYSCALL_64_trampoline
  fffffe00002c6000 t __entry_SYSCALL_64_trampoline
  fffffe00002f2000 t __entry_SYSCALL_64_trampoline
  fffffe000031e000 t __entry_SYSCALL_64_trampoline
  fffffe000034a000 t __entry_SYSCALL_64_trampoline
  fffffe0000376000 t __entry_SYSCALL_64_trampoline
  fffffe00003a2000 t __entry_SYSCALL_64_trampoline
  fffffe00003ce000 t __entry_SYSCALL_64_trampoline
  fffffe00003fa000 t __entry_SYSCALL_64_trampoline
  fffffe0000426000 t __entry_SYSCALL_64_trampoline
  fffffe0000452000 t __entry_SYSCALL_64_trampoline
  fffffe000047e000 t __entry_SYSCALL_64_trampoline
  fffffe00004aa000 t __entry_SYSCALL_64_trampoline
  fffffe00004d6000 t __entry_SYSCALL_64_trampoline
  fffffe0000502000 t __entry_SYSCALL_64_trampoline
  fffffe000052e000 t __entry_SYSCALL_64_trampoline
  fffffe000055a000 t __entry_SYSCALL_64_trampoline
  ffffffffc06ec024 r _note_6 [amd64_edac_mod]
  ffffffffc06ec040 r __ksymtab_amd64_get_dram_hole_info [amd64_edac_mod]

The root of the problem is in libdwfl/linux-kernel-modules.c::intuit_kernel_bounds(). It calls read_address() until false is returned, which happens when a module symbol is detected (determined by the presence of ']' as the last character of the line). The last successfully read address is assumed to be the 'end' address of the kernel space. In my example ffffffff8282c000 should be the 'end' address, but with __entry_SYSCALL_64_trampoline present the 'end' address is assumed to be fffffe000055a000, which is below the 'start' address. This makes intuit_kernel_bounds() return ENOEXEC, which is reported by eu-unstrip.

Possible fix: Just ignore __entry_SYSCALL_64_trampoline in read_address().
Example patch:

diff --git libdwfl/linux-kernel-modules.c libdwfl/linux-kernel-modules.c
index 84a05f28..8c01ce13 100644
--- libdwfl/linux-kernel-modules.c
+++ libdwfl/linux-kernel-modules.c
@@ -502,12 +502,18 @@ struct read_address_state {
   const char *type;
 };
 
+#define ENTRY_TRAMPOLINE_NAME "__entry_SYSCALL_64_trampoline\n"
+
 static inline bool
 read_address (struct read_address_state *state, Dwarf_Addr *addr)
 {
   if ((state->n = getline (&state->line, &state->linesz, state->f)) < 1 ||
       state->line[state->n - 2] == ']')
     return false;
+  if (state->n > sizeof(ENTRY_TRAMPOLINE_NAME) &&
+      !strcmp(ENTRY_TRAMPOLINE_NAME,
+              state->line + state->n - sizeof(ENTRY_TRAMPOLINE_NAME) + 1))
+    return false;
   *addr = strtoull (state->line, &state->p, 16);
   state->p += strspn (state->p, " \t");
   state->type = strsep (&state->p, " \t\n");

After the suggested fix is applied:

  4.19.127-rt-alt2.rt54:~/src/elfutils# LD_LIBRARY_PATH=libdw src/unstrip -n -k
  0xffffffff81000000+0x182c000 50fc00d319e70ef32e231476c0a636d9622f7764@0xffffffff81a031e4 /usr/lib/debug/lib/modules/4.19.127-rt-alt2.rt54/vmlinux /usr/lib/debug/usr/lib/debug/lib/modules/4.19.127-rt-alt2.rt54/vmlinux.debug kernel
  0xffffffffc06e9000+0x9000 7192935572c1296e7e4f5bc3de374c3eb21d1643@0xffffffffc06ec010 /lib/modules/4.19.127-rt-alt2.rt54/kernel/drivers/edac/amd64_edac_mod.ko.xz /usr/lib/debug/lib/modules/4.19.127-rt-alt2.rt54/kernel/drivers/edac/amd64_edac_mod.ko.debug amd64_edac_mod
  ...
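The pointer arithmetic in the name check of the example patch is easy to misread, so here is a minimal standalone sketch of the same suffix comparison. The helper name is_entry_trampoline is hypothetical and not part of elfutils. It assumes, as the patch does, that the line still carries the trailing newline kept by getline().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Symbol name as it appears at the end of a /proc/kallsyms line,
   including the trailing newline that getline() keeps.  */
#define ENTRY_TRAMPOLINE_NAME "__entry_SYSCALL_64_trampoline\n"

/* Hypothetical helper (not part of elfutils): return true if LINE,
   which is N bytes long including its newline, ends in the
   trampoline symbol name.  */
static bool
is_entry_trampoline (const char *line, size_t n)
{
  /* sizeof counts the terminating NUL, so the string length is
     sizeof - 1.  */
  const size_t name_len = sizeof (ENTRY_TRAMPOLINE_NAME) - 1;

  /* A real line also has an address and a type before the name.  */
  if (n <= name_len)
    return false;

  /* Compare the last name_len bytes of the line with the name.  */
  return strcmp (ENTRY_TRAMPOLINE_NAME, line + n - name_len) == 0;
}
```

Calling is_entry_trampoline (line, strlen (line)) on "fffffe0000006000 t __entry_SYSCALL_64_trampoline\n" matches, while "ffffffff8282c000 B _end\n" does not. Note that a pure suffix match would also accept a longer symbol name that merely ends in the same string.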
Thanks for the analysis. These __entry_SYSCALL_64_trampoline symbols are really odd.

Your suggested fix works, but I would like to not depend on the specific symbol name if at all possible. As you say, the problem is that they come after the last kernel address, but before the module addresses. And they are smaller than the start address we found.

The following change, which checks that the address read is not smaller than the start address we found, seems to fix it for me. Does it work for you?

diff --git a/libdwfl/linux-kernel-modules.c b/libdwfl/linux-kernel-modules.c
index 84a05f28..5e6cf275 100644
--- a/libdwfl/linux-kernel-modules.c
+++ b/libdwfl/linux-kernel-modules.c
@@ -538,10 +538,14 @@ intuit_kernel_bounds (Dwarf_Addr *start, Dwarf_Addr *end, Dwarf_Addr *notes)
 
   if (result == 0)
     {
+      Dwarf_Addr addr;
       *end = *start;
-      while (read_address (&state, end))
-        if (*notes == 0 && !strcmp (state.p, "__start_notes\n"))
-          *notes = *end;
+      while (read_address (&state, &addr) && addr >= *start)
+        {
+          *end = addr;
+          if (*notes == 0 && !strcmp (state.p, "__start_notes\n"))
+            *notes = *end;
+        }
 
       Dwarf_Addr round_kernel = sysconf (_SC_PAGESIZE);
       *start &= -(Dwarf_Addr) round_kernel;
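To make the effect of the addr >= *start check concrete, the following is a small self-contained sketch that runs the old and the new loop shape over a few lines modelled on the /proc/kallsyms excerpt from the report. It is not the libdwfl code: parse_kernel_line and scan_kernel_end are hypothetical stand-ins, the sample array is abridged, and the start address is passed in by the caller instead of being discovered by a scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Abridged sample modelled on the /proc/kallsyms excerpt in the report.  */
static const char *const sample_kallsyms[] = {
  "ffffffff81000000 T startup_64\n",
  "ffffffff81000000 T _stext\n",
  "ffffffff8282c000 B _end\n",
  "ffffffff8282c000 B __brk_limit\n",
  "fffffe0000006000 t __entry_SYSCALL_64_trampoline\n",
  "fffffe000055a000 t __entry_SYSCALL_64_trampoline\n",
  "ffffffffc06ec024 r _note_6\t[amd64_edac_mod]\n",
};

/* Hypothetical stand-in for read_address: parse the leading hex
   address, and return false on a module line (one ending in "]\n").  */
static bool
parse_kernel_line (const char *line, uint64_t *addr)
{
  size_t n = strlen (line);
  if (n < 2 || line[n - 2] == ']')
    return false;
  *addr = strtoull (line, NULL, 16);
  return true;
}

/* Return the last kernel address seen.  With CHECK_START set, stop at
   the first address below START, as in the second patch; without it,
   keep going until the first module line, as in the old loop.  */
static uint64_t
scan_kernel_end (const char *const *lines, size_t nlines,
                 uint64_t start, bool check_start)
{
  uint64_t end = start;
  uint64_t addr;
  for (size_t i = 0; i < nlines && parse_kernel_line (lines[i], &addr); i++)
    {
      if (check_start && addr < start)
        break;
      end = addr;
    }
  return end;
}
```

With start = 0xffffffff81000000, the unchecked scan ends on 0xfffffe000055a000, which lies below start, while the checked scan stops at 0xffffffff8282c000, the address of _end.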
Yes, your fix works for me too. Thanks.
Thanks for the report, analysis and testing.

commit eff30a6dabe52ac77ee5c6a0d31853fc8e3aeadb
Author: Mark Wielaard <mark@klomp.org>
Date:   Sun Jun 28 15:27:25 2020 +0200

    libdwfl: read_address should use increasing address in intuit_kernel_bounds

    In kernels from 4.14 up to 4.19 in /proc/kallsyms there are special
    __entry_SYSCALL_64_trampoline symbols. The problem is that they come
    after the last kernel address, but before the module addresses.
    And they are (much) smaller than the start address we found. This
    confuses intuit_kernel_bounds and makes it fail.

    Make sure to check read_address returns an increasing address when
    searching for the end.

    https://sourceware.org/bugzilla/show_bug.cgi?id=26177

    Reported-by: Vitaly Chikunov <vt@altlinux.org>
    Signed-off-by: Mark Wielaard <mark@klomp.org>
Thanks much!