[PATCH 2/2] add Linux kernel symbol namespace support

Matthias Maennich maennich@google.com
Tue Mar 15 11:25:49 GMT 2022


On Mon, Mar 14, 2022 at 06:13:12PM +0000, Giuliano Procida wrote:
>Bug 28954 - add Linux Kernel symbol namespace support
>
>This commit adds Linux Kernel symbol namespace support roughly as
>discussed.
>
>Each symbol can be exported to a specified named namespace or left in
>the global (nameless) namespace. The latter is explicitly represented
>as the empty string, at least when it comes to the value of
>__kstrtabns_FOO symbols, but the common interpretation is that such
>symbols are not exported to a namespace.
>
>I would rather not make this assumption in more than one place in the
>code and would also prefer to protect against __kstrtabns_FOO for
>globally export FOO disappearing with a future change to the export
>macros.
>
>So the code here represents namespace as optional<string> and treats
>empty strings found in __ksymtab_strings as missing.
>
>	* include/abg-ir.h (elf_symbol::elf_symbol): Add ns argument.
>	(elf_symbol::create): Add ns argument.
>	(elf_symbol::get_namespace): Declare new function.
>	(elf_symbol::set_namespace): Declare new function.
>	and set_namespace.
>	* src/abg-comp-filter.cc (namespace_changed): Define new
>	helper functions.
>	(categorize_harmful_diff_node): Also call namespace_changed().
>	* src/abg-ir.cc (elf_symbol::priv): Add namespace_ member.
>	(elf_symbol::priv::priv): Add namespace_ to initialisers.
>	(elf_symbol::elf_symbol): Take new ns argument and pass it to
>	priv constructor.
>	(elf_symbol::create): Take new ns argument and pass it to
>	elf_symbol constructor.
>	(elf_symbol::get_namespace): Define new function.
>	(elf_symbol::set_namespace): Define new function.
>	* src/abg-reader.cc (build_elf_symbol): If ns attribute is
>	present, set symbol namespace.
>	* src/abg-reporter-priv.cc (maybe_report_diff_for_symbol): If
>	symbol namespaces differ, report this.
>	* src/abg-symtab-reader.cc (symtab::load): Try to get
>	__ksymtab_strings metadata and data. Use this to look up
>	__kstrtabns_FOO namespace entries. Set symbol namespaces where
>	found.
>	* src/abg-writer.cc (write_elf_symbol): Emit ns attribute, if
>	symbol has a namespace.
>
>Signed-off-by: Giuliano Procida <gprocida@google.com>

Thanks for working on that! I think this is conceptually the right thing
to do to allow compatibility with different kernel versions and
configurations. I just had some minor comments. With those addressed,
feel free to add:

Reviewed-by: Matthias Maennich <maennich@google.com>

>---
> include/abg-ir.h         |  9 ++++++++
> src/abg-comp-filter.cc   | 39 +++++++++++++++++++++++++++++++-
> src/abg-ir.cc            | 27 +++++++++++++++++++++-
> src/abg-reader.cc        |  7 ++++++
> src/abg-reporter-priv.cc | 12 ++++++++++
> src/abg-symtab-reader.cc | 49 ++++++++++++++++++++++++++++++++++++++++
> src/abg-writer.cc        |  4 ++++
> 7 files changed, 145 insertions(+), 2 deletions(-)
>
>diff --git a/include/abg-ir.h b/include/abg-ir.h
>index a2f4e1a7..7c42bea7 100644
>--- a/include/abg-ir.h
>+++ b/include/abg-ir.h
>@@ -21,6 +21,7 @@
> #include <functional>
> #include <set>
> #include <unordered_map>
>+#include "abg-cxx-compat.h"
> #include "abg-fwd.h"
> #include "abg-hash.h"
> #include "abg-traverse.h"
>@@ -921,6 +922,7 @@ private:
> 	     visibility		vi,
> 	     bool		is_in_ksymtab = false,
> 	     uint64_t		crc = 0,
>+	     const abg_compat::optional<std::string>&	ns = {},

When implementing this in the kernel itself, I also got trapped in
naming inconsistencies, ns ... NS ... namespace. I believe I went with
consistently calling it 'namespace'. I suggest to not abbreviate with
'ns'.

> 	     bool		is_suppressed = false);
>
>   elf_symbol(const elf_symbol&);
>@@ -946,6 +948,7 @@ public:
> 	 visibility	    vi,
> 	 bool		    is_in_ksymtab = false,
> 	 uint64_t	    crc = 0,
>+	 const abg_compat::optional<std::string>&	ns = {},
> 	 bool		    is_suppressed = false);
>
>   const environment*
>@@ -1023,6 +1026,12 @@ public:
>   void
>   set_crc(uint64_t crc);
>
>+  const abg_compat::optional<std::string>&
>+  get_namespace() const;
>+
>+  void
>+  set_namespace(const abg_compat::optional<std::string>& ns);
>+
>   bool
>   is_suppressed() const;
>
>diff --git a/src/abg-comp-filter.cc b/src/abg-comp-filter.cc
>index 56251274..6ee31e35 100644
>--- a/src/abg-comp-filter.cc
>+++ b/src/abg-comp-filter.cc
>@@ -254,6 +254,42 @@ crc_changed(const diff* diff)
>   return false;
> }
>
>+/// Test if there was a function or variable namespace change.
>+///
>+/// @param f the first function or variable to consider.
>+///
>+/// @param s the second function or variable to consider.
>+///
>+/// @return true if the test is positive, false otherwise.
>+template <typename function_or_var_decl_sptr>
>+static bool
>+namespace_changed(const function_or_var_decl_sptr& f,
>+		  const function_or_var_decl_sptr& s)
>+{
>+  const auto symbol_f  = f->get_symbol(), symbol_s = s->get_symbol();

const auto& to avoid some refcounts on the symbol_sptr.

>+  if (!symbol_f || !symbol_s)
>+    return false;
>+  return symbol_f->get_namespace() != symbol_s->get_namespace();
>+}
>+
>+/// test if the current diff tree node carries a namespace change in
       ^
       Test
>+/// either a function or a variable.
>+///
>+/// @param diff the diff tree node to consider.
>+///
>+/// @return true if the test is positive, false otherwise.
>+static bool
>+namespace_changed(const diff* diff)
>+{
>+  if (const function_decl_diff* d =
>+	dynamic_cast<const function_decl_diff*>(diff))
>+    return namespace_changed(d->first_function_decl(),
>+			     d->second_function_decl());
>+  if (const var_diff* d = dynamic_cast<const var_diff*>(diff))
>+    return namespace_changed(d->first_var(), d->second_var());
>+  return false;
>+}
>+
> /// Test if there was a function name change, but there there was no
> /// change in name of the underlying symbol.  IOW, if the name of a
> /// function changed, but the symbol of the new function is equal to
>@@ -1781,7 +1817,8 @@ categorize_harmful_diff_node(diff *d, bool pre)
> 	      || non_static_data_member_added_or_removed(d)
> 	      || base_classes_added_or_removed(d)
> 	      || has_harmful_enum_change(d)
>-	      || crc_changed(d)))
>+	      || crc_changed(d)
>+	      || namespace_changed(d)))
> 	category |= SIZE_OR_OFFSET_CHANGE_CATEGORY;
>
>       if (has_virtual_mem_fn_change(d))
>diff --git a/src/abg-ir.cc b/src/abg-ir.cc
>index 0ef5e8b2..c510bc55 100644
>--- a/src/abg-ir.cc
>+++ b/src/abg-ir.cc
>@@ -1728,6 +1728,7 @@ struct elf_symbol::priv
>   bool			is_common_;
>   bool			is_in_ksymtab_;
>   uint64_t		crc_;
>+  abg_compat::optional<std::string>	namespace_;
>   bool			is_suppressed_;
>   elf_symbol_wptr	main_symbol_;
>   elf_symbol_wptr	next_alias_;
>@@ -1745,6 +1746,7 @@ struct elf_symbol::priv
>       is_common_(false),
>       is_in_ksymtab_(false),
>       crc_(0),
>+      namespace_(),
>       is_suppressed_(false)
>   {}
>
>@@ -1760,6 +1762,7 @@ struct elf_symbol::priv
>        elf_symbol::visibility	  vi,
>        bool			  is_in_ksymtab,
>        uint64_t			  crc,
>+       const abg_compat::optional<std::string>&	ns,
>        bool			  is_suppressed)
>     : env_(e),
>       index_(i),
>@@ -1773,6 +1776,7 @@ struct elf_symbol::priv
>       is_common_(c),
>       is_in_ksymtab_(is_in_ksymtab),
>       crc_(crc),
>+      namespace_(ns),
>       is_suppressed_(is_suppressed)
>   {
>     if (!is_common_)
>@@ -1818,6 +1822,8 @@ elf_symbol::elf_symbol()
> /// @param vi the visibility of the symbol.
> ///
> /// @param crc the CRC (modversions) value of Linux Kernel symbols
>+///
>+/// @param ns the namespace of Linux Kernel symbols, if any
> elf_symbol::elf_symbol(const environment* e,
> 		       size_t		  i,
> 		       size_t		  s,
>@@ -1830,6 +1836,7 @@ elf_symbol::elf_symbol(const environment* e,
> 		       visibility	  vi,
> 		       bool		  is_in_ksymtab,
> 		       uint64_t		  crc,
>+		       const abg_compat::optional<std::string>&	ns,
> 		       bool		  is_suppressed)
>   : priv_(new priv(e,
> 		   i,
>@@ -1843,6 +1850,7 @@ elf_symbol::elf_symbol(const environment* e,
> 		   vi,
> 		   is_in_ksymtab,
> 		   crc,
>+		   ns,
> 		   is_suppressed))
> {}
>
>@@ -1886,6 +1894,8 @@ elf_symbol::create()
> ///
> /// @param crc the CRC (modversions) value of Linux Kernel symbols
> ///
>+/// @param ns the namespace of Linux Kernel symbols, if any
>+///
> /// @return a (smart) pointer to a newly created instance of @ref
> /// elf_symbol.
> elf_symbol_sptr
>@@ -1901,10 +1911,11 @@ elf_symbol::create(const environment* e,
> 		   visibility	      vi,
> 		   bool		      is_in_ksymtab,
> 		   uint64_t	      crc,
>+		   const abg_compat::optional<std::string>&	ns,
> 		   bool		      is_suppressed)
> {
>   elf_symbol_sptr sym(new elf_symbol(e, i, s, n, t, b, d, c, ve, vi,
>-				     is_in_ksymtab, crc, is_suppressed));
>+				     is_in_ksymtab, crc, ns, is_suppressed));
>   sym->priv_->main_symbol_ = sym;
>   return sym;
> }
>@@ -2147,6 +2158,20 @@ void
> elf_symbol::set_crc(uint64_t crc)
> {priv_->crc_ = crc;}
>
>+/// Getter of the 'namespace' property.
>+///
>+/// @return the namespace for Linux Kernel symbols, if any
>+const abg_compat::optional<std::string>&
>+elf_symbol::get_namespace() const
>+{return priv_->namespace_;}
>+
>+/// Setter of the 'namespace' property.
>+///
>+/// @param ns the new namespace for Linux Kernel symbols, if any
>+void
>+elf_symbol::set_namespace(const abg_compat::optional<std::string>& ns)

While I like the std::optional in the data structure, I am not sure I
like it exposed to the interface. Is there ever a case where
set_namespace is called with an unset optional? And if so, would that
not be a better interface to offer ::unset_namespace ?

>+{priv_->namespace_ = ns;}
>+
> /// Getter for the 'is-suppressed' property.
> ///
> /// @return true iff the current symbol has been suppressed by a
>diff --git a/src/abg-reader.cc b/src/abg-reader.cc
>index 31885692..55b7348d 100644
>--- a/src/abg-reader.cc
>+++ b/src/abg-reader.cc
>@@ -3269,6 +3269,13 @@ build_elf_symbol(read_context& ctxt, const xmlNodePtr node,
>   if (crc != 0)
>     e->set_crc(crc);
>
>+  if (xml_char_sptr s = XML_NODE_GET_ATTRIBUTE(node, "ns"))

In particular, I would call the attribute 'namespace' in the XML unless
we want to save bytes here.

>+    {
>+      std::string ns;
>+      xml::xml_char_sptr_to_string(s, ns);
>+      e->set_namespace({ns});
>+    }
>+
>   return e;
> }
>
>diff --git a/src/abg-reporter-priv.cc b/src/abg-reporter-priv.cc
>index 7012f5dc..dd40690c 100644
>--- a/src/abg-reporter-priv.cc
>+++ b/src/abg-reporter-priv.cc
>@@ -1158,6 +1158,18 @@ maybe_report_diff_for_symbol(const elf_symbol_sptr&	symbol1,
> 	  << std::noshowbase << std::dec
> 	  << "\n";
>     }
>+
>+  const abg_compat::optional<std::string>& ns1 = symbol1->get_namespace();
>+  const abg_compat::optional<std::string>& ns2 = symbol2->get_namespace();
>+  if (ns1 != ns2)
>+    {
>+      const std::string none = "(none)";
>+      out << indent << "namespace changed from "
>+	  << (ns1.has_value() ? ns1.value() : none)
>+	  << " to "
>+	  << (ns2.has_value() ? ns2.value() : none)
>+	  << "\n";
>+    }
> }
>
> /// For a given symbol, emit a string made of its name and version.
>diff --git a/src/abg-symtab-reader.cc b/src/abg-symtab-reader.cc
>index b42ce87d..589f3703 100644
>--- a/src/abg-symtab-reader.cc
>+++ b/src/abg-symtab-reader.cc
>@@ -232,9 +232,33 @@ symtab::load_(Elf*	       elf_handle,
>       return false;
>     }
>
>+  // The __kstrtab_strings sections is basically an ELF strtab but does not
>+  // support elf_strptr lookups. A single call to elf_getdata gives a handle to
>+  // section data.
>+  //
>+  // The value of a __kstrtabns_FOO (or other similar) symbol is an address
>+  // within the __kstrtab_strings section. To look up the string value, we need
>+  // to translate from load address to section offset by subtracting the base
>+  // address of the section.
>+  Elf_Scn* strings_section = elf_helpers::find_ksymtab_strings_section(elf_handle);
>+  size_t strings_offset = 0;
>+  const char* strings_data = nullptr;
>+  size_t strings_size = 0;
>+  if (strings_section)
>+    {
>+      GElf_Shdr strings_sheader;
>+      gelf_getshdr(strings_section, &strings_sheader);
>+      strings_offset = strings_sheader.sh_addr;
>+      Elf_Data* data = elf_getdata(strings_section, nullptr);
>+      ABG_ASSERT(data->d_off == 0);
>+      strings_data = reinterpret_cast<const char *>(data->d_buf);
>+      strings_size = data->d_size;
>+    }
>+
>   const bool is_kernel = elf_helpers::is_linux_kernel(elf_handle);
>   std::unordered_set<std::string> exported_kernel_symbols;
>   std::unordered_map<std::string, uint64_t> crc_values;
>+  std::unordered_map<std::string, std::string> namespaces;
>
>   const bool is_arm32 = elf_helpers::architecture_is_arm32(elf_handle);
>   const bool is_ppc64 = elf_helpers::architecture_is_ppc64(elf_handle);
>@@ -288,6 +312,20 @@ symtab::load_(Elf*	       elf_handle,
> 	  ABG_ASSERT(crc_values.emplace(name.substr(6), sym->st_value).second);
> 	  continue;
> 	}
>+      if (strings_section && is_kernel && name.rfind("__kstrtabns_", 0) == 0)
>+	{
>+	  // This symbol lives in the __ksymtab_strings section but st_value is
>+	  // a load address so we need to subtract an offset before looking it
>+	  // up in that section.
>+	  const size_t offset = sym->st_value - strings_offset;
>+	  if (offset < strings_size)
>+	    {
>+	      const char* ns = strings_data + offset;
>+	      if (*ns)
>+		ABG_ASSERT(namespaces.emplace(name.substr(12), ns).second);

    constexpr auto name_offset = sizeof("__kstrtabns_");
    ABG_ASSERT(namespaces.emplace(name.substr(name_offset), ns).second);

Also, consider limiting the string length to the section end to not
accidentally read beyond.

Cheers,
Matthias


>+	    }
>+	  continue;
>+	}
>
>       // filter out uninteresting entries and only keep functions/variables for
>       // now. The rest might be interesting in the future though.
>@@ -402,6 +440,17 @@ symtab::load_(Elf*	       elf_handle,
> 	symbol->set_crc(crc_entry.second);
>     }
>
>+  // Now add the namespaces
>+  for (const auto& namespace_entry : namespaces)
>+    {
>+      const auto r = name_symbol_map_.find(namespace_entry.first);
>+      if (r == name_symbol_map_.end())
>+	continue;
>+
>+      for (const auto& symbol : r->second)
>+	symbol->set_namespace({namespace_entry.second});
>+    }
>+
>   // sort the symbols for deterministic output
>   std::sort(symbols_.begin(), symbols_.end(), symbol_sort);
>
>diff --git a/src/abg-writer.cc b/src/abg-writer.cc
>index 7802128d..4dfa72a6 100644
>--- a/src/abg-writer.cc
>+++ b/src/abg-writer.cc
>@@ -3136,6 +3136,10 @@ write_elf_symbol(const elf_symbol_sptr&	sym,
>       << std::hex << std::showbase << sym->get_crc() << "'"
>       << std::dec << std::noshowbase;
>
>+  if (sym->get_namespace().has_value())
>+    o << " ns='"
>+      << sym->get_namespace().value() << "'";
>+
>   o << "/>\n";
>
>   return true;
>-- 
>2.35.1.723.g4982287a31-goog
>


More information about the Libabigail mailing list