[PATCH v4] Detect ld.so and libc.so version inconsistency during startup

Carlos O'Donell carlos@redhat.com
Wed Aug 24 15:15:38 GMT 2022


On 8/24/22 07:09, Florian Weimer wrote:
> The files NEWS, include/link.h, and sysdeps/generic/ldsodefs.h
> contribute to the version fingerprint used for detection.  The
> fingerprint can be further refined using the --with-extra-version-id
> configure argument.
> 
> _dl_call_libc_early_init is replaced with _dl_lookup_libc_early_init.
> The new function is used store a pointer to libc.so's
> __libc_early_init function in the libc_map_early_init member of the
> ld.so namespace structure.  This function pointer can then be called
> directly, so the separate invocation function is no longer needed.
> 
> The versioned symbol lookup needs the symbol versioning data
> structures, so the initialization of libc_map and libc_map_early_init
> is now done from _dl_check_map_versions, after this information
> becomes available.  (_dl_map_object_from_fd does not set this up
> in time, so the initialization code had to be moved from there.)
> This means that the separate initialization code can be removed from
> dl_main because _dl_check_map_versions covers all maps, including
> the initial executable loaded by the kernel.  The lookup still happens
> before relocation and the invocation of IFUNC resolvers, so IFUNC
> resolvers are protected from ABI mismatch.
> 
> The __libc_early_init function pointer is not protected because
> so little code runs between the pointer write and the invocation
> (only dynamic linker code and IFUNC resolvers).

LGTM.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> ---
> v4: Drop build-time dependency on glibcelf for now.  I will try to post
>     a reworked glibcelf later this week.  Then we can eliminate the code
>     duplication.
> 
>  INSTALL                                            | 10 +++
>  Makerules                                          | 14 ++++
>  NEWS                                               |  7 +-
>  config.make.in                                     |  1 +
>  configure                                          | 11 +++
>  configure.ac                                       |  5 ++
>  elf/Makefile                                       |  2 +-
>  elf/Versions                                       |  4 +-
>  elf/dl-load.c                                      |  9 ---
>  ...bc-early-init.c => dl-lookup_libc_early_init.c} | 23 +++---
>  elf/dl-open.c                                      |  4 +-
>  elf/dl-version.c                                   | 18 +++++
>  elf/libc-early-init.h                              | 21 +++--
>  elf/rtld.c                                         | 12 +--
>  manual/install.texi                                |  9 +++
>  scripts/libc_early_init_name.py                    | 89 ++++++++++++++++++++++
>  sysdeps/generic/ldsodefs.h                         |  4 +
>  17 files changed, 198 insertions(+), 45 deletions(-)
> 
> diff --git a/INSTALL b/INSTALL
> index 659f75a97f..6470cd3d25 100644
> --- a/INSTALL
> +++ b/INSTALL
> @@ -120,6 +120,16 @@ if 'CFLAGS' is specified it must enable optimization.  For example:
>       compiler flags which target a later instruction set architecture
>       (ISA).
>  
> +'--with-extra-version-id=STRING'
> +     Use STRING as part of the fingerprint that is used by the dynamic
> +     linker to detect an incompatible version of 'libc.so'.  For
> +     example, STRING could be the full package version and release
> +     string used by a distribution build of the GNU C Library.  This
> +     way, concurrent process creation during a package update will fail
> +     with an error message, _error while loading shared libraries:
> +     /lib64/libc.so.6: ld.so/libc.so mismatch detected (upgrade in
> +     progress?)_, rather than crashing mysteriously.
> +
>  '--with-timeoutfactor=NUM'
>       Specify an integer NUM to scale the timeout of test programs.  This
>       factor can be changed at run time using 'TIMEOUTFACTOR' environment
> diff --git a/Makerules b/Makerules
> index d1e139d03c..756c1f181c 100644
> --- a/Makerules
> +++ b/Makerules
> @@ -112,6 +112,20 @@ before-compile := $(common-objpfx)first-versions.h \
>  		  $(common-objpfx)ldbl-compat-choose.h $(before-compile)
>  $(common-objpfx)first-versions.h: $(common-objpfx)versions.stmp
>  $(common-objpfx)ldbl-compat-choose.h: $(common-objpfx)versions.stmp
> +
> +# libc_early_init_name.h provides the actual name of the
> +# __libc_early_init function.  It is used as a protocol version marker
> +# between ld.so and libc.so
> +before-compile := $(common-objpfx)libc_early_init_name.h $(before-compile)
> +libc_early_init_name-deps = \
> +  $(..)NEWS $(..)sysdeps/generic/ldsodefs.h $(..)include/link.h
> +$(common-objpfx)libc_early_init_name.h: $(..)scripts/libc_early_init_name.py \
> +  $(common-objpfx)config.make $(libc_early_init_name-deps)
> +	$(PYTHON) $(..)scripts/libc_early_init_name.py \
> +	  --output=$@T \
> +	  --extra-version-id="$(extra-version-id)" \
> +	  $(libc_early_init_name-deps)
> +	$(move-if-change) $@T $@
>  endif # avoid-generated
>  endif # $(build-shared) = yes
>  
> diff --git a/NEWS b/NEWS
> index f9bef48a8f..9d3c8c5ed8 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -9,7 +9,12 @@ Version 2.37
>  
>  Major new features:
>  
> -  [Add new features here]
> +* The dynamic loader now prints an error message, "ld.so/libc.so
> +  mismatch detected (upgrade in progress?)" if it detects that the
> +  version of libc.so it loaded comes from a different build of glibc.
> +  The new configure option --with-extra-version-id can be used to
> +  specify an arbitrary string that affects the computation of the
> +  version fingerprint.
>  
>  Deprecated and removed features, and other changes affecting compatibility:
>  
> diff --git a/config.make.in b/config.make.in
> index d7c416cbea..ecaffbfd4b 100644
> --- a/config.make.in
> +++ b/config.make.in
> @@ -98,6 +98,7 @@ build-hardcoded-path-in-tests= @hardcoded_path_in_tests@
>  build-pt-chown = @build_pt_chown@
>  have-tunables = @have_tunables@
>  pthread-in-libc = @pthread_in_libc@
> +extra-version-id = @extra_version_id@
>  
>  # Build tools.
>  CC = @CC@
> diff --git a/configure b/configure
> index ff2c406b3b..c576f9f133 100755
> --- a/configure
> +++ b/configure
> @@ -760,6 +760,7 @@ with_headers
>  with_default_link
>  with_nonshared_cflags
>  with_rtld_early_cflags
> +with_extra_version_id
>  with_timeoutfactor
>  enable_sanity_checks
>  enable_shared
> @@ -1481,6 +1482,9 @@ Optional Packages:
>                            build nonshared libraries with additional CFLAGS
>    --with-rtld-early-cflags=CFLAGS
>                            build early initialization with additional CFLAGS
> +  --extra-version-id=STRING
> +                          specify an extra version string to use in internal
> +                          ABI checks
>    --with-timeoutfactor=NUM
>                            specify an integer to scale the timeout
>    --with-cpu=CPU          select code for CPU variant
> @@ -3397,6 +3401,13 @@ fi
>  
>  
>  
> +# Check whether --with-extra-version-id was given.
> +if test "${with_extra_version_id+set}" = set; then :
> +  withval=$with_extra_version_id; extra_version_id="$withval"
> +fi
> +
> +
> +
>  # Check whether --with-timeoutfactor was given.
>  if test "${with_timeoutfactor+set}" = set; then :
>    withval=$with_timeoutfactor; timeoutfactor=$withval
> diff --git a/configure.ac b/configure.ac
> index eb5bc6a131..68baeee4d7 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -169,6 +169,11 @@ AC_ARG_WITH([rtld-early-cflags],
>  	    [rtld_early_cflags=])
>  AC_SUBST(rtld_early_cflags)
>  
> +AC_ARG_WITH([extra-version-id],
> +	    AS_HELP_STRING([--extra-version-id=STRING],
> +			   [specify an extra version string to use in internal ABI checks]),
> +	    [extra_version_id="$withval"])
> +
>  AC_ARG_WITH([timeoutfactor],
>  	    AS_HELP_STRING([--with-timeoutfactor=NUM],
>  			   [specify an integer to scale the timeout]),
> diff --git a/elf/Makefile b/elf/Makefile
> index 3928a08787..bc68150a37 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -52,7 +52,6 @@ routines = \
>  # The core dynamic linking functions are in libc for the static and
>  # profiled libraries.
>  dl-routines = \
> -  dl-call-libc-early-init \
>    dl-close \
>    dl-debug \
>    dl-debug-symbols \
> @@ -65,6 +64,7 @@ dl-routines = \
>    dl-load \
>    dl-lookup \
>    dl-lookup-direct \
> +  dl-lookup_libc_early_init \
>    dl-minimal-malloc \
>    dl-misc \
>    dl-object \
> diff --git a/elf/Versions b/elf/Versions
> index a9ff278de7..6260c0fe03 100644
> --- a/elf/Versions
> +++ b/elf/Versions
> @@ -29,8 +29,8 @@ libc {
>      __placeholder_only_for_empty_version_map;
>    }
>    GLIBC_PRIVATE {
> -    # functions used in other libraries
> -    __libc_early_init;
> +    # A pattern is needed here because the suffix is dynamically generated.
> +    __libc_early_init_*;
>  
>      # Internal error handling support.  Interposes the functions in ld.so.
>      _dl_signal_exception; _dl_catch_exception;
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index 1ad0868dad..00e08b5500 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -31,7 +31,6 @@
>  #include <sys/param.h>
>  #include <sys/stat.h>
>  #include <sys/types.h>
> -#include <gnu/lib-names.h>
>  
>  /* Type for the buffer we put the ELF header and hopefully the program
>     header.  This buffer does not really have to be too large.  In most
> @@ -1466,14 +1465,6 @@ cannot enable executable stack as shared object requires");
>      add_name_to_object (l, ((const char *) D_PTR (l, l_info[DT_STRTAB])
>  			    + l->l_info[DT_SONAME]->d_un.d_val));
>  
> -  /* If we have newly loaded libc.so, update the namespace
> -     description.  */
> -  if (GL(dl_ns)[nsid].libc_map == NULL
> -      && l->l_info[DT_SONAME] != NULL
> -      && strcmp (((const char *) D_PTR (l, l_info[DT_STRTAB])
> -		  + l->l_info[DT_SONAME]->d_un.d_val), LIBC_SO) == 0)
> -    GL(dl_ns)[nsid].libc_map = l;
> -
>    /* _dl_close can only eventually undo the module ID assignment (via
>       remove_slotinfo) if this function returns a pointer to a link
>       map.  Therefore, delay this step until all possibilities for
> diff --git a/elf/dl-call-libc-early-init.c b/elf/dl-lookup_libc_early_init.c
> similarity index 66%
> rename from elf/dl-call-libc-early-init.c
> rename to elf/dl-lookup_libc_early_init.c
> index ee9860e3ab..64bc287a05 100644
> --- a/elf/dl-call-libc-early-init.c
> +++ b/elf/dl-lookup_libc_early_init.c
> @@ -1,4 +1,4 @@
> -/* Invoke the early initialization function in libc.so.
> +/* Find the address of the __libc_early_init function.
>     Copyright (C) 2020-2022 Free Software Foundation, Inc.
>     This file is part of the GNU C Library.
>  
> @@ -16,26 +16,21 @@
>     License along with the GNU C Library; if not, see
>     <https://www.gnu.org/licenses/>.  */
>  
> -#include <assert.h>
>  #include <ldsodefs.h>
>  #include <libc-early-init.h>
>  #include <link.h>
>  #include <stddef.h>
>  
> -void
> -_dl_call_libc_early_init (struct link_map *libc_map, _Bool initial)
> +__typeof (__libc_early_init) *
> +_dl_lookup_libc_early_init (struct link_map *libc_map)
>  {
> -  /* There is nothing to do if we did not actually load libc.so.  */
> -  if (libc_map == NULL)
> -    return;
> -
>    const ElfW(Sym) *sym
> -    = _dl_lookup_direct (libc_map, "__libc_early_init",
> -                         0x069682ac, /* dl_new_hash output.  */
> +    = _dl_lookup_direct (libc_map, LIBC_EARLY_INIT_NAME_STRING,
> +                         LIBC_EARLY_INIT_GNU_HASH,
>                           "GLIBC_PRIVATE",
>                           0x0963cf85); /* _dl_elf_hash output.  */
> -  assert (sym != NULL);
> -  __typeof (__libc_early_init) *early_init
> -    = DL_SYMBOL_ADDRESS (libc_map, sym);
> -  early_init (initial);
> +  if (sym == NULL)
> +    _dl_signal_error (0, libc_map->l_name, NULL, "\
> +ld.so/libc.so mismatch detected (upgrade in progress?)");
> +  return DL_SYMBOL_ADDRESS (libc_map, sym);
>  }
> diff --git a/elf/dl-open.c b/elf/dl-open.c
> index a23e65926b..dcc24130fe 100644
> --- a/elf/dl-open.c
> +++ b/elf/dl-open.c
> @@ -760,8 +760,8 @@ dl_open_worker_begin (void *a)
>    if (!args->libc_already_loaded)
>      {
>        /* dlopen cannot be used to load an initial libc by design.  */
> -      struct link_map *libc_map = GL(dl_ns)[args->nsid].libc_map;
> -      _dl_call_libc_early_init (libc_map, false);
> +      if (GL(dl_ns)[args->nsid].libc_map != NULL)
> +	GL(dl_ns)[args->nsid].libc_map_early_init (false);
>      }
>  
>    args->worker_continue = true;
> diff --git a/elf/dl-version.c b/elf/dl-version.c
> index cda0889209..d9ec44eed6 100644
> --- a/elf/dl-version.c
> +++ b/elf/dl-version.c
> @@ -23,6 +23,8 @@
>  #include <string.h>
>  #include <ldsodefs.h>
>  #include <_itoa.h>
> +#include <gnu/lib-names.h>
> +#include <libc-early-init.h>
>  
>  #include <assert.h>
>  
> @@ -359,6 +361,22 @@ _dl_check_map_versions (struct link_map *map, int verbose, int trace_mode)
>  	}
>      }
>  
> +  /* Detect a libc.so loaded into this namespace.  The
> +     __libc_early_init lookup below means that we have to do this
> +     after parsing the version data.  */
> +  if (GL(dl_ns)[map->l_ns].libc_map == NULL
> +      && map->l_info[DT_SONAME] != NULL
> +      && strcmp (((const char *) D_PTR (map, l_info[DT_STRTAB])
> +		  + map->l_info[DT_SONAME]->d_un.d_val), LIBC_SO) == 0)
> +    {
> +      /* Look up this symbol early to trigger a mismatch error before
> +	 relocation (which may call IFUNC resolvers, and those can
> +	 have an internal ABI dependency).  */
> +      GL(dl_ns)[map->l_ns].libc_map_early_init
> +	= _dl_lookup_libc_early_init (map);
> +      GL(dl_ns)[map->l_ns].libc_map = map;
> +    }
> +
>    /* When there is a DT_VERNEED entry with libc.so on DT_NEEDED, issue
>       an error if there is a DT_RELR entry without GLIBC_ABI_DT_RELR
>       dependency.  */
> diff --git a/elf/libc-early-init.h b/elf/libc-early-init.h
> index a8edfadfb0..ac8c204bc7 100644
> --- a/elf/libc-early-init.h
> +++ b/elf/libc-early-init.h
> @@ -19,13 +19,10 @@
>  #ifndef _LIBC_EARLY_INIT_H
>  #define _LIBC_EARLY_INIT_H
>  
> +#include <libc_early_init_name.h>
> +
>  struct link_map;
>  
> -/* If LIBC_MAP is not NULL, look up the __libc_early_init symbol in it
> -   and call this function, with INITIAL as the argument.  */
> -void _dl_call_libc_early_init (struct link_map *libc_map, _Bool initial)
> -  attribute_hidden;
> -
>  /* In the shared case, this function is defined in libc.so and invoked
>     from ld.so (or on the fist static dlopen) after complete relocation
>     of a new loaded libc.so, but before user-defined ELF constructors
> @@ -33,6 +30,18 @@ void _dl_call_libc_early_init (struct link_map *libc_map, _Bool initial)
>     startup code.  If INITIAL is true, the libc being initialized is
>     the libc for the main program.  INITIAL is false for libcs loaded
>     for audit modules, dlmopen, and static dlopen.  */
> -void __libc_early_init (_Bool initial);
> +void __libc_early_init (_Bool initial)
> +#ifdef SHARED
> +/* Redirect to the actual implementation name.  */
> +  __asm__ (LIBC_EARLY_INIT_NAME_STRING)
> +#endif
> +  ;
> +
> +/* Attempts to find the appropriately named __libc_early_init function
> +   in LIBC_MAP.  On lookup failure, an exception is signaled,
> +   indicating an ld.so/libc.so mismatch.  */
> +__typeof (__libc_early_init) *_dl_lookup_libc_early_init (struct link_map *
> +                                                          libc_map)
> +  attribute_hidden;
>  
>  #endif /* _LIBC_EARLY_INIT_H */
> diff --git a/elf/rtld.c b/elf/rtld.c
> index cbbaf4a331..910075c37f 100644
> --- a/elf/rtld.c
> +++ b/elf/rtld.c
> @@ -1707,15 +1707,6 @@ dl_main (const ElfW(Phdr) *phdr,
>        /* Extract the contents of the dynamic section for easy access.  */
>        elf_get_dynamic_info (main_map, false, false);
>  
> -      /* If the main map is libc.so, update the base namespace to
> -	 refer to this map.  If libc.so is loaded later, this happens
> -	 in _dl_map_object_from_fd.  */
> -      if (main_map->l_info[DT_SONAME] != NULL
> -	  && (strcmp (((const char *) D_PTR (main_map, l_info[DT_STRTAB])
> -		      + main_map->l_info[DT_SONAME]->d_un.d_val), LIBC_SO)
> -	      == 0))
> -	GL(dl_ns)[LM_ID_BASE].libc_map = main_map;
> -
>        /* Set up our cache of pointers into the hash table.  */
>        _dl_setup_hash (main_map);
>      }
> @@ -2386,7 +2377,8 @@ dl_main (const ElfW(Phdr) *phdr,
>    /* Relocation is complete.  Perform early libc initialization.  This
>       is the initial libc, even if audit modules have been loaded with
>       other libcs.  */
> -  _dl_call_libc_early_init (GL(dl_ns)[LM_ID_BASE].libc_map, true);
> +  if (GL(dl_ns)[LM_ID_BASE].libc_map != NULL)
> +    GL(dl_ns)[LM_ID_BASE].libc_map_early_init (true);
>  
>    /* Do any necessary cleanups for the startup OS interface code.
>       We do these now so that no calls are made after rtld re-relocation
> diff --git a/manual/install.texi b/manual/install.texi
> index c775005581..6d43599a47 100644
> --- a/manual/install.texi
> +++ b/manual/install.texi
> @@ -144,6 +144,15 @@ dynamic linker diagnostics to run on CPUs which are not compatible with
>  the rest of @theglibc{}, for example, due to compiler flags which target
>  a later instruction set architecture (ISA).
>  
> +@item --with-extra-version-id=@var{string}
> +Use @var{string} as part of the fingerprint that is used by the dynamic
> +linker to detect an incompatible version of @file{libc.so}.  For
> +example, @var{string} could be the full package version and release
> +string used by a distribution build of @theglibc{}.  This way,
> +concurrent process creation during a package update will fail with an
> +error message, @emph{ld.so/libc.so mismatch detected (upgrade in
> +progress?)}, rather than crashing mysteriously.
> +
>  @item --with-timeoutfactor=@var{NUM}
>  Specify an integer @var{NUM} to scale the timeout of test programs.
>  This factor can be changed at run time using @env{TIMEOUTFACTOR}
> diff --git a/scripts/libc_early_init_name.py b/scripts/libc_early_init_name.py
> new file mode 100644
> index 0000000000..a56c2008f3
> --- /dev/null
> +++ b/scripts/libc_early_init_name.py
> @@ -0,0 +1,89 @@
> +#!/usr/bin/python3
> +# Compute the hash-based name of the __libc_early_init function.
> +# Copyright (C) 2022 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +#
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <https://www.gnu.org/licenses/>.
> +
> +"""Compute the name of the __libc_early_init function, which is used
> +as a protocol version marker between ld.so and libc.so.
> +
> +The name contains a hash suffix, and the hash changes if certain key
> +files in the source tree change.  Distributions can also configure
> +with --with-extra-version-id, to make the computed hash dependent on
> +the package version.
> +
> +"""
> +
> +import argparse
> +import hashlib
> +import os
> +import string
> +import sys
> +
> +def gnu_hash(s):
> +    """Computes the GNU hash of the string."""
> +    h = 5381
> +    for ch in s:
> +        if type(ch) is not int:
> +            ch = ord(ch)
> +        h = (h * 33 + ch) & 0xffffffff
> +    return h

OK.

> +
> +# Parse the command line.
> +parser = argparse.ArgumentParser(description=__doc__)
> +parser.add_argument('--output', metavar='PATH',
> +                    help='path to header file this tool generates')
> +parser.add_argument('--extra-version-id', metavar='ID',
> +                    help='extra string to influence hash computation')
> +parser.add_argument('inputs', metavar='PATH', nargs='*',
> +                    help='files whose contents influences the generated hash')
> +opts = parser.parse_args()
> +
> +# Obtain the blobs that affect the generated hash.
> +blobs = [(opts.extra_version_id or '').encode('UTF-8')]
> +for path in opts.inputs:
> +    with open(path, 'rb') as inp:
> +        blobs.append(inp.read())
> +
> +# Hash the file boundaries.
> +md = hashlib.sha256()
> +md.update(repr([len(blob) for blob in blobs]).encode('UTF-8'))
> +
> +# And then hash the file contents.  Do not hash the paths, to avoid
> +# impacting reproducibility.
> +for blob in blobs:
> +    md.update(blob)
> +
> +# These are the bits used to compute the suffix.
> +derived_bits = int.from_bytes(md.digest(), byteorder='big', signed=False)
> +
> +# These digits are used in the suffix (should result in base-62 encoding).
> +# They must be valid in C identifiers.
> +digits = string.digits + string.ascii_letters
> +
> +# Generate eight digits as a suffix.  They should provide enough
> +# uniqueness (47.6 bits).
> +name = '__libc_early_init_'
> +for n in range(8):
> +    name += digits[derived_bits % len(digits)]
> +    derived_bits //= len(digits)
> +
> +# Write the output file.
> +with open(opts.output, 'w') if opts.output else sys.stdout as out:
> +    out.write('#define LIBC_EARLY_INIT_NAME {}\n'.format(name))
> +    out.write('#define LIBC_EARLY_INIT_NAME_STRING "{}"\n'.format(name))
> +    out.write('#define LIBC_EARLY_INIT_GNU_HASH {}\n'.format(
> +        gnu_hash(name)))
> diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
> index 050a3032de..275dbc95ce 100644
> --- a/sysdeps/generic/ldsodefs.h
> +++ b/sysdeps/generic/ldsodefs.h
> @@ -333,6 +333,10 @@ struct rtld_global
>         its link map.  */
>      struct link_map *libc_map;
>  
> +    /* __libc_early_init function in libc_map.  Initialized at the
> +       same time as libc_map.  */
> +    void (*libc_map_early_init) (_Bool);
> +
>      /* Search table for unique objects.  */
>      struct unique_sym_table
>      {
> 
> base-commit: 06e4033c83276ed349d315bfbf651be56c3e2954
> 


-- 
Cheers,
Carlos.



More information about the Libc-alpha mailing list