Hi Jessica!
On Fri, 09 Aug, 17:49, Jessica Yu wrote:
As described in commit ad8c2531fb9, the format of the Linux kernel
ksymtab changed in v4.19 to use relative references instead of absolute
references. This changes the type of relocations emitted for ksymtab
sections to be place-relative 32-bit relocations instead of absolute
relocations. One side-effect of this is that libdwfl will not relocate
the ksymtab sections due to the PC-relative relocations. This breaks
load_kernel_symbol_table() for kernel modules because it only reads in
zeros from the unrelocated ksymtab section and is subsequently unable to
determine what exported symbols it refers to. Since a vmlinux binary is
already fully linked and relocated (and therefore we can read its
ksymtab section just fine), this problem is only relevant to Linux
kernel modules.
To work around this, we utilize the ksymtab relocation sections to
determine which symbols the ksymtab entries refer to. We do this by
inspecting each relocation's r_info field for the symbol table index and
from there we are able to read each symbol's value and subsequently add
that to the set of exported symbols.
In addition, for Linux kernel modules, we can utilize relocation types
to implement a new heuristic to determine the ksymtab format we have.
The presence of PC-relative relocations suggest the new v4.19 format,
and absolute relocation types suggest the old pre v4.19 format.
* include/abg-ir.h (elf_symbol::{elf_symbol, create}): Take new
symbol value and shndx parameters.
(elf_symbol::{get_value, get_shndx}): Declare new accessors.
* src/abg-ir.cc (elf_symbol::priv::{value_, shndx_}): New data members.
(elf_symbol::priv::priv): Adjust.
(elf_symbol::elf_symbol): Take new value and shndx parameters.
(elf_symbol::create): Likewise.
(elf_symbol::{get_value, get_shndx}): Define new accessors.
* src/abg-reader.cc (build_elf_symbol): Adjust.
* src/abg-dwarf-reader.cc (lookup_symbol_from_sysv_hash_tab)
(lookup_symbol_from_gnu_hash_tab)
(lookup_symbol_from_symtab): Adjust.
(read_context::{ksymtab_reloc_section_,
ksymtab_gpl_reloc_section_,
ksymtab_strings_section_}): New data members.
(read_context::read_context): Initialize ksymtab_reloc_section_,
ksymtab_gpl_reloc_section_, ksymtab_strings_section_.
(read_context::{find_ksymtab_reloc_section,
find_ksymtab_gpl_reloc_section, find_ksymtab_strings_section,
find_any_ksymtab_reloc_section, get_ksymtab_format_module,
populate_symbol_map_from_ksymtab,
populate_symbol_map_from_ksymtab_reloc,
is_linux_kernel_module}): New member functions.
(read_context::load_kernel_symbol_table): Adjust to call either
populate_symbol_map_from_ksymtab{_reloc,} depending on ksymtab
format.
(read_context::get_ksymtab_format): Adjust to call
get_ksymtab_format_module for linux kernel modules.
(read_context::lookup_elf_symbol_from_index): Adjust.
(create_default_var_sym, create_default_fn_sym): Adjust.
Signed-off-by: Jessica Yu <jeyu@kernel.org>
---
include/abg-ir.h | 10 ++
src/abg-dwarf-reader.cc | 391 +++++++++++++++++++++++++++++++++++++++++-------
src/abg-ir.cc | 38 ++++-
src/abg-reader.cc | 2 +
4 files changed, 388 insertions(+), 53 deletions(-)
diff --git a/include/abg-ir.h b/include/abg-ir.h
index 170a844..d26f472 100644
--- a/include/abg-ir.h
+++ b/include/abg-ir.h
@@ -796,6 +796,8 @@ private:
elf_symbol(const environment* e,
size_t i,
size_t s,
+ uint64_t v,
+ uint32_t x,
I would like to see some comments here to describe the purpose of the
struct's members. I know there weren't any before, but maybe that is a
good chance.
Also, I am not entirely sure why these are all one-character names, but
it makes it hard to come from x to "section header index". Maybe we
should give them more meaningful names.
const string& n,
type t,
binding b,
@@ -818,6 +820,8 @@ public:
create(const environment* e,
size_t i,
size_t s,
+ uint64_t v,
+ uint32_t x,
const string& n,
type t,
binding b,
@@ -838,6 +842,12 @@ public:
void
set_index(size_t);
+ uint64_t
+ get_value() const;
+
+ uint32_t
+ get_shndx() const;
+
const string&
get_name() const;
diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 47af7e3..835bf3d 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -1724,6 +1724,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env,
GElf_Sym symbol;
const char* sym_name_str;
size_t sym_size;
+ uint64_t sym_value;
+ uint32_t sym_shndx;
elf_symbol::type sym_type;
elf_symbol::binding sym_binding;
elf_symbol::visibility sym_visibility;
@@ -1743,6 +1745,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env,
sym_visibility =
stv_to_elf_symbol_visibility(GELF_ST_VISIBILITY(symbol.st_other));
sym_size = symbol.st_size;
+ sym_value = symbol.st_value;
+ sym_shndx = symbol.st_shndx;
elf_symbol::version ver;
if (get_version_for_symbol(elf_handle, symbol_index,
/*get_def_version=*/true, ver))
@@ -1751,6 +1755,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env,
elf_symbol::create(env,
symbol_index,
sym_size,
+ sym_value,
+ sym_shndx,
sym_name_str,
sym_type,
sym_binding,
@@ -2028,7 +2034,8 @@ lookup_symbol_from_gnu_hash_tab(const environment* env,
ABG_ASSERT(!ver.str().empty());
elf_symbol_sptr symbol_found =
- elf_symbol::create(env, i, symbol.st_size, sym_name_str,
+ elf_symbol::create(env, i, symbol.st_size, symbol.st_value,
+ symbol.st_shndx, sym_name_str,
sym_type, sym_binding,
symbol.st_shndx != SHN_UNDEF,
symbol.st_shndx == SHN_COMMON,
@@ -2181,6 +2188,7 @@ lookup_symbol_from_symtab(const environment* env,
ABG_ASSERT(!ver.str().empty());
elf_symbol_sptr symbol_found =
elf_symbol::create(env, i, sym->st_size,
+ sym->st_value, sym->st_shndx,
name_str, sym_type,
sym_binding, sym_is_defined,
sym_is_common, ver, sym_visibility);
@@ -3081,7 +3089,10 @@ public:
/// to symbols exported using the EXPORT_SYMBOL_GPL macro from the
/// linux kernel.
Elf_Scn* ksymtab_section_;
+ Elf_Scn* ksymtab_reloc_section_;
Elf_Scn* ksymtab_gpl_section_;
+ Elf_Scn* ksymtab_gpl_reloc_section_;
+ Elf_Scn* ksymtab_strings_section_;
Elf_Scn* versym_section_;
Elf_Scn* verdef_section_;
Elf_Scn* verneed_section_;
@@ -3277,7 +3288,10 @@ public:
nb_ksymtab_entries_ = 0;
nb_ksymtab_gpl_entries_ = 0;
ksymtab_section_ = 0;
+ ksymtab_reloc_section_ = 0;
ksymtab_gpl_section_ = 0;
+ ksymtab_gpl_reloc_section_ = 0;
+ ksymtab_strings_section_ = 0;
versym_section_ = 0;
verdef_section_ = 0;
verneed_section_ = 0;
@@ -6079,6 +6093,23 @@ public:
return ksymtab_section_;
}
+ /// Return the .rel{a,}__ksymtab section of a linux kernel ELF file (either
+ /// a vmlinux binary or a kernel module).
+ ///
+ /// @return the .rel{a,}__ksymtab section if found, nil otherwise.
+ Elf_Scn*
+ find_ksymtab_reloc_section() const
+ {
+ if (!ksymtab_reloc_section_)
+ {
+ Elf_Scn *sec = find_section(elf_handle(), ".rela__ksymtab", SHT_RELA);
+ if (!sec)
+ sec = find_section(elf_handle(), ".rel__ksymtab", SHT_REL);
+ const_cast<read_context*>(this)->ksymtab_reloc_section_ = sec;
+ }
+ return ksymtab_reloc_section_;
+ }
+
/// Return the __ksymtab_gpl section of a linux kernel ELF file
/// (either a vmlinux binary or a kernel module).
///
@@ -6092,6 +6123,36 @@ public:
return ksymtab_gpl_section_;
}
+ /// Return the .rel{a,}__ksymtab_gpl section of a linux kernel ELF file
+ /// (either a vmlinux binary or a kernel module).
+ ///
+ /// @return the .rel{a,}__ksymtab_gpl section if found, nil otherwise.
+ Elf_Scn*
+ find_ksymtab_gpl_reloc_section() const
+ {
+ if (!ksymtab_gpl_reloc_section_)
+ {
+ Elf_Scn *sec = find_section(elf_handle(), ".rela__ksymtab_gpl", SHT_RELA);
+ if (!sec)
+ sec = find_section(elf_handle(), ".rel__ksymtab_gpl", SHT_REL);
+ const_cast<read_context*>(this)->ksymtab_gpl_reloc_section_ = sec;
+ }
+ return ksymtab_gpl_reloc_section_;
+ }
+
+ /// Return the __ksymtab_strings section of a linux kernel ELF file
+ /// (either a vmlinux binary or a kernel module).
+ ///
+ /// @return the __ksymtab_strings section if found, nil otherwise.
+ Elf_Scn*
+ find_ksymtab_strings_section() const
+ {
+ if (!ksymtab_strings_section_)
+ const_cast<read_context*>(this)->ksymtab_strings_section_ =
+ find_section(elf_handle(), "__ksymtab_strings", SHT_PROGBITS);
+ return ksymtab_strings_section_;
+ }
+
We should find a way to deduplicate these functions. Not necessarily in
this patch, but in a followup. Most of them only differ in the section
name and whether they check for an alternative. Something generic like
struct SectionQuery{ string name, int type };
Elf_Scn* find_section(const vector<SectionQuery>&);
should be sufficient to forward from all the other variants to one
implementation.
/// Return either a __ksymtab or a __ksymtab_gpl section, in case
/// only the __ksymtab_gpl exists.
///
@@ -6106,6 +6167,19 @@ public:
return result;
}
+ /// Return either a .rel{a,}__ksymtab or a .rel{a,}__ksymtab_gpl section
+ ///
+ /// @return the .rel{a,}__ksymtab section if it exists, or the
+ /// .rel{a,}__ksymtab_gpl; or NULL if neither is found.
+ Elf_Scn*
+ find_any_ksymtab_reloc_section() const
+ {
+ Elf_Scn *result = find_ksymtab_reloc_section();
+ if (!result)
+ result = find_ksymtab_gpl_reloc_section();
+ return result;
+ }
+
/// Return the SHT_GNU_versym, SHT_GNU_verdef and SHT_GNU_verneed
/// sections that are involved in symbol versionning.
///
@@ -6283,7 +6357,8 @@ public:
stv_to_elf_symbol_visibility(GELF_ST_VISIBILITY(s->st_other));
elf_symbol_sptr sym =
- elf_symbol::create(env(), symbol_index, s->st_size, name_str,
+ elf_symbol::create(env(), symbol_index, s->st_size,
+ s->st_value, s->st_shndx, name_str,
stt_to_elf_symbol_type(GELF_ST_TYPE(s->st_info)),
stb_to_elf_symbol_binding(GELF_ST_BIND(s->st_info)),
sym_is_defined, sym_is_common, ver, vis);
@@ -7435,6 +7510,64 @@ public:
return symbol;
}
+ /// Determine the format of the __ksymtab and __ksymtab_gpl
+ /// sections of Linux kernel modules.
+ ///
+ /// This is important because we need the know the format of these
s/need the/need to/ ^^^
+ /// sections to be able to read from them.
+ ///
+ /// @return the format the __ksymtab[_gpl] sections.
+ enum ksymtab_format
+ get_ksymtab_format_module() const
+ {
+ Elf_Scn *section = find_any_ksymtab_reloc_section();
+
+ if (!section)
+ return UNDEFINED_KSYMTAB_FORMAT;
+
+ // Libdwfl has a weird quirk where, in the process of obtaining an Elf
+ // descriptor via dwfl_module_getelf(), it will apply all relocations it
+ // knows how to and it will zero a relocation after applying it. If the
will zero the relocation info after
+ // .rela__ksymtab* section contained only simple (absolute) relocations,
+ // they will have been all applied and sh_size will be 0. For arches that
+ // support relative ksymtabs, simple relocations only appear in pre-4.19
+ // kernel modules.
+ GElf_Shdr section_mem;
+ GElf_Shdr *section_shdr = gelf_getshdr(section, §ion_mem);
+ if (section_shdr->sh_size == 0)
+ return PRE_V4_19_KSYMTAB_FORMAT;
+ /* SHT_REL sections not supported yet */
+ if (section_shdr->sh_type == SHT_REL)
+ return UNDEFINED_KSYMTAB_FORMAT;
+
+ // If we still have a normal non-zeroed relocation section, we can guess
+ // what format the ksymtab is in depending on what types of relocs it
+ // contains.
+ GElf_Rela rela;
+ Elf_Data *rela_section_data = elf_rawdata(section, 0);
+ gelf_getrela(rela_section_data, 0, &rela);
+
+ // Sigh, I dislike the arch-dependent code here, but this seems to be a
+ // reliable heuristic for kernel modules for now. Relative ksymtabs only
+ // supported on x86 and arm64 as of v4.19.
+ ksymtab_format format;
+ switch (GELF_R_TYPE(rela.r_info))
+ {
+ case R_X86_64_64: // Same as R_386_32
add 'fallthrough' to the comments, else younger compilers will warn
about the potentially unintentional fallthrough.
+ case R_AARCH64_ABS64:
+ format = PRE_V4_19_KSYMTAB_FORMAT;
+ break;
+ case R_X86_64_PC32: // Same as R_386_PC32
+ case R_AARCH64_PREL32:
+ format = V4_19_KSYMTAB_FORMAT;
+ break;
+ default:
+ format = UNDEFINED_KSYMTAB_FORMAT;
We should probably emit a warning in this case. Are there more
architectures that have PREL32 enabled as of now that are not yet
covered here?