[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/2] dwarf-reader: add support for symbol namespaces in ksymtab entries



Hello,

So after the chat you and I had on IRC I have finally committed this
patch to master with a slight change.  Please read below for details.

[Matthias Maennich <maennich@google.com> a écrit:]

>
> [...]
>
>
>> +		if (try_reading_first_ksymtab_entry(
>> +			ksymtab, format == V4_19_KSYMTAB_FORMAT, entries))
>> +		  {
>

[Dodji Seketeli <dodji@seketeli.org> a écrit:]

> If some suppression specification has 'removed' the symbol that is designated
> by the first entry in the ksymtab section at 'entries' offset, then this
> will yield an empty elf_symbol_sptr intance, and the result of this "if"
> will be false, even though we could successfully read the first ksymtab
> entry there.
>
> This is because in try_reading_first_ksymtab_entry, we use
> lookup_elf_symbol_from_address(symbol_address) which looks up an elf
> symbol from the elf symbol map that we *built* by reading the .symtab
> symbol table.
>
> I think that if you are going to use try_reading_first_ksymtab_entry
> like this, then maybe it should rather use a new function that looks up
> symbols (by address) by walking the actual .symbol section, like what
> lookup_native_elf_symbol_from_index does.
>
> And this makes me think that this patch and its predecessor in the
> series should probably go in at the same time, once we resolve this
> discussion.  What do you think?

Actually, try_reading_first_ksymtab_entry is just factorizing the code
that is there already.

This problem of the first symbol that we try reading from the ksymtab
section being suppressed by a suppression specification was already
there in the pre-existing code that try_reading_first_ksymtab_entry is
factorizing out.  The thing is that the problem was not apparent back
then (before some recent patches of mine in master) because suppression
specifications could *NOT* drop ELF symbols at that time.

So I am committing this patch as is, and I'll come up with a generic
solution to address the issue regardless of this patch.

[...]

>> +	size_t nb_entries = 0;
>> +	if (kind == KERNEL_SYMBOL_TABLE_KIND_KSYMTAB)
>> +	  nb_entries = get_nb_ksymtab_entries();
>> +	else if (kind == KERNEL_SYMBOL_TABLE_KIND_KSYMTAB_GPL)
>> +	  nb_entries = get_nb_ksymtab_entries();
>
> I think here, what you want is:
>
>     nb_entries = get_nb_ksymtab_gpl_entries();

I have fixed this in the version of the patch that I am committing.

Please find the patch that was committed below.

Thank you for working on this.

Cheers,

>From 75637035282b2ff2907d4c560c9b242be9c6d244 Mon Sep 17 00:00:00 2001
From: Matthias Maennich <maennich@google.com>
Date: Mon, 21 Oct 2019 17:09:50 +0100
Subject: [PATCH] dwarf-reader: add support for symbol namespaces in ksymtab
 entries

Kernel v5.4 introduces Symbol Namespaces [1]. That changes the layout of
ksymtab entries in Kernel binaries. In particular, the kernel_symbol
entry gains a new member to represent the namespace. That change affects
binaries that have position relative relocations (we name that format
here V4_19_KSYMTAB_FORMAT) as well as those that don't
(PRE_V4_19_KSYMTAB_FORMAT). In any case there is an additional entry
that has the same size as the previous entries.

Since we iterate over the ksymtab entries to collect them, we need to
determine the correct size of these entries even though we do not
grab the namespace for ABI analysis purposes at this time.

In order to determine the size, we attempt to find the beginning of the
next entry by trying to read symbols with an increasing offset. Once we
succeed, we have the offset and therefore the size of one entry.

Since try_reading_first_ksymtab_entry() does already everything we need
to attempt to read a symbol from a beginning of a ksymtab, we only
needed to teach it to operate on an offset to read the potential second
entry.

'load_kernel_symbol_table' was determining the number of entries
unconditionally, even when we do have the unsupported case of a ksymtab
with relocations. Hence only load when needed.

	* src/abg-dwarf-reader.cc
	(read_context::try_reading_first_ksymtab_entry): Add
	symbol_offset parameter.
	(read_context::get_ksymtab_entry_size): Add support for variable
	size ksymtab entries due to symbol namespaces.
	(load_kernel_symbol_table): only load nb_entries when needed

[1] https://lore.kernel.org/lkml/20190906103235.197072-1-maennich@google.com/

Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
---
 src/abg-dwarf-reader.cc | 107 +++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 93 insertions(+), 14 deletions(-)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 278cb5c..6f8f5be 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -7599,12 +7599,23 @@ public:
   /// section header (i.e. the symbol position as we are reading the first
   /// symbol).
   ///
+  /// @param section the ksymtab section to consider.
+  ///
+  /// @param position_relative_relocations if true, then consider that
+  /// the section designated by @p section contains position-relative
+  /// relocated symbol addresses.
+  ///
+  ///@param symbol_offset if different from zero
+  /// If symbol_offset is != 0, adjust the position we consider the section
+  /// start. That is useful to read the ksymtab with a slight offset.
+  ///
   /// @return the symbol resulting from the lookup of the symbol address we
   /// got from reading the first entry of the ksymtab or null if no such entry
   /// could be found.
   elf_symbol_sptr
   try_reading_first_ksymtab_entry(Elf_Scn* section,
-				  bool position_relative_relocations) const
+				  bool position_relative_relocations,
+				  int  symbol_offset = 0) const
   {
     // this function does not support relocatable ksymtab entries (as for
     // example in kernel modules). Hence assert here on not having any
@@ -7624,6 +7635,9 @@ public:
     else
       symbol_value_size = architecture_word_size();
 
+    const int read_offset = (symbol_offset * symbol_value_size);
+    bytes += read_offset;
+
     if (position_relative_relocations)
       {
 	int32_t offset = 0;
@@ -7631,7 +7645,11 @@ public:
 						is_big_endian, offset));
 	GElf_Shdr section_header;
 	gelf_getshdr(section, &section_header);
-	symbol_address = offset + section_header.sh_addr;
+	// the actual symbol address is relative to its position. Since we do
+	// not know the position, we take the beginning of the section, add the
+	// read_offset that we might have and finally apply the offset we
+	// read from the section.
+	symbol_address = section_header.sh_addr + read_offset + offset;
       }
     else
       ABG_ASSERT(read_int_from_array_of_bytes(bytes, symbol_value_size,
@@ -7821,8 +7839,64 @@ public:
   get_ksymtab_entry_size() const
   {
     if (ksymtab_entry_size_ == 0)
-      // The entry size if 2 *  symbol_value_size.
-      ksymtab_entry_size_ = 2 * get_ksymtab_symbol_value_size();
+      {
+	const unsigned char symbol_size = get_ksymtab_symbol_value_size();
+	Elf_Scn*	    ksymtab = find_any_ksymtab_section();
+	if (ksymtab)
+	  {
+	    GElf_Shdr ksymtab_shdr;
+	    gelf_getshdr(ksymtab, &ksymtab_shdr);
+
+	    // ksymtab entries have the following layout
+	    //
+	    // struct {
+	    //  T symbol_address;  // .symtab entry
+	    //  T name_address;    // .strtab entry
+	    // }
+	    //
+	    // with T being a suitable type to represent the absolute,
+	    // relocatable or position relative value of the address. T's size
+	    // is determined by get_ksymtab_symbol_value_size().
+	    //
+	    // Since Kernel v5.4, the entries have the following layout
+	    //
+	    // struct {
+	    //  T symbol_address;  // .symtab entry
+	    //  T name_address;    // .strtab entry
+	    //  T namespace;       // .strtab entry
+	    // }
+	    //
+	    // To determine the ksymtab entry size, find the next entry that
+	    // refers to a valid .symtab entry. The offset to that one is what
+	    // we are searching for.
+	    for (unsigned entries = 2; entries <= 3; ++entries)
+	      {
+		const unsigned candidate_size = entries * symbol_size;
+
+		// if there is exactly one entry, section size == entry size
+		// (this looks like an optimization, but in fact it prevents
+		// from illegal reads if there is actually only one entry)
+		if (ksymtab_shdr.sh_size == candidate_size)
+		  {
+		    ksymtab_entry_size_ = candidate_size;
+		    break;
+		  }
+
+		// otherwise check whether the symbol following the candidate
+		// number of entries is a valid ELF symbol. For that we read
+		// the ksymtab with the given offset and if the symbol is
+		// valid, we found our entry size.
+		const ksymtab_format format = get_ksymtab_format();
+		if (try_reading_first_ksymtab_entry
+		    (ksymtab, format == V4_19_KSYMTAB_FORMAT, entries))
+		  {
+		    ksymtab_entry_size_ = candidate_size;
+		    break;
+		  }
+	      }
+	    ABG_ASSERT(ksymtab_entry_size_ != 0);
+	  }
+      }
 
     return ksymtab_entry_size_;
   }
@@ -8118,7 +8192,6 @@ public:
   bool
   load_kernel_symbol_table(kernel_symbol_table_kind kind)
   {
-    size_t nb_entries = 0;
     Elf_Scn *section = 0, *reloc_section = 0;
     address_set_sptr linux_exported_fns_set, linux_exported_vars_set;
 
@@ -8129,23 +8202,18 @@ public:
       case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB:
 	section = find_ksymtab_section();
 	reloc_section = find_ksymtab_reloc_section();
-	nb_entries = get_nb_ksymtab_entries();
 	linux_exported_fns_set = create_or_get_linux_exported_fn_syms();
 	linux_exported_vars_set = create_or_get_linux_exported_var_syms();
 	break;
       case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB_GPL:
 	section = find_ksymtab_gpl_section();
 	reloc_section = find_ksymtab_gpl_reloc_section();
-	nb_entries = get_nb_ksymtab_gpl_entries();
 	linux_exported_fns_set = create_or_get_linux_exported_gpl_fn_syms();
 	linux_exported_vars_set = create_or_get_linux_exported_gpl_var_syms();
 	break;
       }
 
-    if (!linux_exported_vars_set
-	|| !linux_exported_fns_set
-	|| !section
-	|| !nb_entries)
+    if (!linux_exported_vars_set || !linux_exported_fns_set || !section)
       return false;
 
     ksymtab_format format = get_ksymtab_format();
@@ -8157,9 +8225,20 @@ public:
     // apply those relocations and we're safe to read the relocation section to
     // determine which exported symbols are in the ksymtab.
     if (!reloc_section || format == PRE_V4_19_KSYMTAB_FORMAT)
-      return populate_symbol_map_from_ksymtab(section, linux_exported_fns_set,
-                                              linux_exported_vars_set,
-					      nb_entries);
+      {
+	size_t nb_entries = 0;
+	if (kind == KERNEL_SYMBOL_TABLE_KIND_KSYMTAB)
+	  nb_entries = get_nb_ksymtab_entries();
+	else if (kind == KERNEL_SYMBOL_TABLE_KIND_KSYMTAB_GPL)
+	  nb_entries = get_nb_ksymtab_gpl_entries();
+
+	if (!nb_entries)
+	  return false;
+
+	return populate_symbol_map_from_ksymtab(
+	    section, linux_exported_fns_set, linux_exported_vars_set,
+	    nb_entries);
+      }
     else
       return populate_symbol_map_from_ksymtab_reloc(reloc_section,
                                                     linux_exported_fns_set,
-- 
1.8.3.1

-- 
		Dodji