[PATCH v4] Detect ld.so and libc.so version inconsistency during startup
Carlos O'Donell
carlos@redhat.com
Wed Aug 24 15:15:38 GMT 2022
On 8/24/22 07:09, Florian Weimer wrote:
> The files NEWS, include/link.h, and sysdeps/generic/ldsodefs.h
> contribute to the version fingerprint used for detection. The
> fingerprint can be further refined using the --with-extra-version-id
> configure argument.
>
> _dl_call_libc_early_init is replaced with _dl_lookup_libc_early_init.
> The new function is used to store a pointer to libc.so's
> __libc_early_init function in the libc_map_early_init member of the
> ld.so namespace structure. This function pointer can then be called
> directly, so the separate invocation function is no longer needed.
>
> The versioned symbol lookup needs the symbol versioning data
> structures, so the initialization of libc_map and libc_map_early_init
> is now done from _dl_check_map_versions, after this information
> becomes available. (_dl_map_object_from_fd does not set this up
> in time, so the initialization code had to be moved from there.)
> This means that the separate initialization code can be removed from
> dl_main because _dl_check_map_versions covers all maps, including
> the initial executable loaded by the kernel. The lookup still happens
> before relocation and the invocation of IFUNC resolvers, so IFUNC
> resolvers are protected from ABI mismatch.
>
> The __libc_early_init function pointer is not protected because
> so little code runs between the pointer write and the invocation
> (only dynamic linker code and IFUNC resolvers).
LGTM.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
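
For anyone following the thread, the handshake described above can be
modelled in a few lines of Python. This is only an illustrative sketch,
not glibc code: the dictionary standing in for libc.so's exported
symbols, the lookup_libc_early_init and make_libc helpers, and the
example suffixes are all hypothetical.

```python
# Toy model of the ld.so/libc.so handshake; not glibc code.
# All helper names and suffixes below are hypothetical.

MISMATCH = "ld.so/libc.so mismatch detected (upgrade in progress?)"


class LoaderError(Exception):
    """Stands in for the exception raised via _dl_signal_error."""


def lookup_libc_early_init(libc_exports, expected_name):
    """Return the early-init callable, or fail before relocation would run."""
    try:
        return libc_exports[expected_name]
    except KeyError:
        raise LoaderError(MISMATCH) from None


def make_libc(suffix, log):
    """Build a fake libc.so export table whose init symbol carries SUFFIX."""
    def early_init(initial):
        log.append(("early_init", initial))
    return {"__libc_early_init_" + suffix: early_init}


log = []
# Matching builds: the pointer is looked up once and called later.
early_init = lookup_libc_early_init(make_libc("AAAAAAAA", log),
                                    "__libc_early_init_AAAAAAAA")
early_init(True)

# Mismatched builds: the lookup itself reports the problem.
try:
    lookup_libc_early_init(make_libc("BBBBBBBB", log),
                           "__libc_early_init_AAAAAAAA")
    outcome = "loaded"
except LoaderError as exc:
    outcome = str(exc)
print(log, outcome)
```

The point of the model is that the failure surfaces at lookup time, so
nothing from the mismatched libc has been called yet.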
> ---
> v4: Drop build-time dependency on glibcelf for now. I will try to post
> a reworked glibcelf later this week. Then we can eliminate the code
> duplication.
>
> INSTALL | 10 +++
> Makerules | 14 ++++
> NEWS | 7 +-
> config.make.in | 1 +
> configure | 11 +++
> configure.ac | 5 ++
> elf/Makefile | 2 +-
> elf/Versions | 4 +-
> elf/dl-load.c | 9 ---
> ...bc-early-init.c => dl-lookup_libc_early_init.c} | 23 +++---
> elf/dl-open.c | 4 +-
> elf/dl-version.c | 18 +++++
> elf/libc-early-init.h | 21 +++--
> elf/rtld.c | 12 +--
> manual/install.texi | 9 +++
> scripts/libc_early_init_name.py | 89 ++++++++++++++++++++++
> sysdeps/generic/ldsodefs.h | 4 +
> 17 files changed, 198 insertions(+), 45 deletions(-)
>
> diff --git a/INSTALL b/INSTALL
> index 659f75a97f..6470cd3d25 100644
> --- a/INSTALL
> +++ b/INSTALL
> @@ -120,6 +120,16 @@ if 'CFLAGS' is specified it must enable optimization. For example:
> compiler flags which target a later instruction set architecture
> (ISA).
>
> +'--with-extra-version-id=STRING'
> + Use STRING as part of the fingerprint that is used by the dynamic
> + linker to detect an incompatible version of 'libc.so'. For
> + example, STRING could be the full package version and release
> + string used by a distribution build of the GNU C Library. This
> + way, concurrent process creation during a package update will fail
> + with an error message, _error while loading shared libraries:
> + /lib64/libc.so.6: ld.so/libc.so mismatch detected (upgrade in
> + progress?)_, rather than crashing mysteriously.
> +
> '--with-timeoutfactor=NUM'
> Specify an integer NUM to scale the timeout of test programs. This
> factor can be changed at run time using 'TIMEOUTFACTOR' environment
> diff --git a/Makerules b/Makerules
> index d1e139d03c..756c1f181c 100644
> --- a/Makerules
> +++ b/Makerules
> @@ -112,6 +112,20 @@ before-compile := $(common-objpfx)first-versions.h \
> $(common-objpfx)ldbl-compat-choose.h $(before-compile)
> $(common-objpfx)first-versions.h: $(common-objpfx)versions.stmp
> $(common-objpfx)ldbl-compat-choose.h: $(common-objpfx)versions.stmp
> +
> +# libc_early_init_name.h provides the actual name of the
> +# __libc_early_init function. It is used as a protocol version marker
> +# between ld.so and libc.so
> +before-compile := $(common-objpfx)libc_early_init_name.h $(before-compile)
> +libc_early_init_name-deps = \
> + $(..)NEWS $(..)sysdeps/generic/ldsodefs.h $(..)include/link.h
> +$(common-objpfx)libc_early_init_name.h: $(..)scripts/libc_early_init_name.py \
> + $(common-objpfx)config.make $(libc_early_init_name-deps)
> + $(PYTHON) $(..)scripts/libc_early_init_name.py \
> + --output=$@T \
> + --extra-version-id="$(extra-version-id)" \
> + $(libc_early_init_name-deps)
> + $(move-if-change) $@T $@
> endif # avoid-generated
> endif # $(build-shared) = yes
>
> diff --git a/NEWS b/NEWS
> index f9bef48a8f..9d3c8c5ed8 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -9,7 +9,12 @@ Version 2.37
>
> Major new features:
>
> - [Add new features here]
> +* The dynamic loader now prints an error message, "ld.so/libc.so
> + mismatch detected (upgrade in progress?)" if it detects that the
> + version of libc.so it loaded comes from a different build of glibc.
> + The new configure option --with-extra-version-id can be used to
> + specify an arbitrary string that affects the computation of the
> + version fingerprint.
>
> Deprecated and removed features, and other changes affecting compatibility:
>
> diff --git a/config.make.in b/config.make.in
> index d7c416cbea..ecaffbfd4b 100644
> --- a/config.make.in
> +++ b/config.make.in
> @@ -98,6 +98,7 @@ build-hardcoded-path-in-tests= @hardcoded_path_in_tests@
> build-pt-chown = @build_pt_chown@
> have-tunables = @have_tunables@
> pthread-in-libc = @pthread_in_libc@
> +extra-version-id = @extra_version_id@
>
> # Build tools.
> CC = @CC@
> diff --git a/configure b/configure
> index ff2c406b3b..c576f9f133 100755
> --- a/configure
> +++ b/configure
> @@ -760,6 +760,7 @@ with_headers
> with_default_link
> with_nonshared_cflags
> with_rtld_early_cflags
> +with_extra_version_id
> with_timeoutfactor
> enable_sanity_checks
> enable_shared
> @@ -1481,6 +1482,9 @@ Optional Packages:
> build nonshared libraries with additional CFLAGS
> --with-rtld-early-cflags=CFLAGS
> build early initialization with additional CFLAGS
> + --with-extra-version-id=STRING
> + specify an extra version string to use in internal
> + ABI checks
> --with-timeoutfactor=NUM
> specify an integer to scale the timeout
> --with-cpu=CPU select code for CPU variant
> @@ -3397,6 +3401,13 @@ fi
>
>
>
> +# Check whether --with-extra-version-id was given.
> +if test "${with_extra_version_id+set}" = set; then :
> + withval=$with_extra_version_id; extra_version_id="$withval"
> +fi
> +
> +
> +
> # Check whether --with-timeoutfactor was given.
> if test "${with_timeoutfactor+set}" = set; then :
> withval=$with_timeoutfactor; timeoutfactor=$withval
> diff --git a/configure.ac b/configure.ac
> index eb5bc6a131..68baeee4d7 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -169,6 +169,11 @@ AC_ARG_WITH([rtld-early-cflags],
> [rtld_early_cflags=])
> AC_SUBST(rtld_early_cflags)
>
> +AC_ARG_WITH([extra-version-id],
> + AS_HELP_STRING([--with-extra-version-id=STRING],
> + [specify an extra version string to use in internal ABI checks]),
> + [extra_version_id="$withval"])
> +
> AC_ARG_WITH([timeoutfactor],
> AS_HELP_STRING([--with-timeoutfactor=NUM],
> [specify an integer to scale the timeout]),
> diff --git a/elf/Makefile b/elf/Makefile
> index 3928a08787..bc68150a37 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -52,7 +52,6 @@ routines = \
> # The core dynamic linking functions are in libc for the static and
> # profiled libraries.
> dl-routines = \
> - dl-call-libc-early-init \
> dl-close \
> dl-debug \
> dl-debug-symbols \
> @@ -65,6 +64,7 @@ dl-routines = \
> dl-load \
> dl-lookup \
> dl-lookup-direct \
> + dl-lookup_libc_early_init \
> dl-minimal-malloc \
> dl-misc \
> dl-object \
> diff --git a/elf/Versions b/elf/Versions
> index a9ff278de7..6260c0fe03 100644
> --- a/elf/Versions
> +++ b/elf/Versions
> @@ -29,8 +29,8 @@ libc {
> __placeholder_only_for_empty_version_map;
> }
> GLIBC_PRIVATE {
> - # functions used in other libraries
> - __libc_early_init;
> + # A pattern is needed here because the suffix is dynamically generated.
> + __libc_early_init_*;
>
> # Internal error handling support. Interposes the functions in ld.so.
> _dl_signal_exception; _dl_catch_exception;
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index 1ad0868dad..00e08b5500 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -31,7 +31,6 @@
> #include <sys/param.h>
> #include <sys/stat.h>
> #include <sys/types.h>
> -#include <gnu/lib-names.h>
>
> /* Type for the buffer we put the ELF header and hopefully the program
> header. This buffer does not really have to be too large. In most
> @@ -1466,14 +1465,6 @@ cannot enable executable stack as shared object requires");
> add_name_to_object (l, ((const char *) D_PTR (l, l_info[DT_STRTAB])
> + l->l_info[DT_SONAME]->d_un.d_val));
>
> - /* If we have newly loaded libc.so, update the namespace
> - description. */
> - if (GL(dl_ns)[nsid].libc_map == NULL
> - && l->l_info[DT_SONAME] != NULL
> - && strcmp (((const char *) D_PTR (l, l_info[DT_STRTAB])
> - + l->l_info[DT_SONAME]->d_un.d_val), LIBC_SO) == 0)
> - GL(dl_ns)[nsid].libc_map = l;
> -
> /* _dl_close can only eventually undo the module ID assignment (via
> remove_slotinfo) if this function returns a pointer to a link
> map. Therefore, delay this step until all possibilities for
> diff --git a/elf/dl-call-libc-early-init.c b/elf/dl-lookup_libc_early_init.c
> similarity index 66%
> rename from elf/dl-call-libc-early-init.c
> rename to elf/dl-lookup_libc_early_init.c
> index ee9860e3ab..64bc287a05 100644
> --- a/elf/dl-call-libc-early-init.c
> +++ b/elf/dl-lookup_libc_early_init.c
> @@ -1,4 +1,4 @@
> -/* Invoke the early initialization function in libc.so.
> +/* Find the address of the __libc_early_init function.
> Copyright (C) 2020-2022 Free Software Foundation, Inc.
> This file is part of the GNU C Library.
>
> @@ -16,26 +16,21 @@
> License along with the GNU C Library; if not, see
> <https://www.gnu.org/licenses/>. */
>
> -#include <assert.h>
> #include <ldsodefs.h>
> #include <libc-early-init.h>
> #include <link.h>
> #include <stddef.h>
>
> -void
> -_dl_call_libc_early_init (struct link_map *libc_map, _Bool initial)
> +__typeof (__libc_early_init) *
> +_dl_lookup_libc_early_init (struct link_map *libc_map)
> {
> - /* There is nothing to do if we did not actually load libc.so. */
> - if (libc_map == NULL)
> - return;
> -
> const ElfW(Sym) *sym
> - = _dl_lookup_direct (libc_map, "__libc_early_init",
> - 0x069682ac, /* dl_new_hash output. */
> + = _dl_lookup_direct (libc_map, LIBC_EARLY_INIT_NAME_STRING,
> + LIBC_EARLY_INIT_GNU_HASH,
> "GLIBC_PRIVATE",
> 0x0963cf85); /* _dl_elf_hash output. */
> - assert (sym != NULL);
> - __typeof (__libc_early_init) *early_init
> - = DL_SYMBOL_ADDRESS (libc_map, sym);
> - early_init (initial);
> + if (sym == NULL)
> + _dl_signal_error (0, libc_map->l_name, NULL, "\
> +ld.so/libc.so mismatch detected (upgrade in progress?)");
> + return DL_SYMBOL_ADDRESS (libc_map, sym);
> }
> diff --git a/elf/dl-open.c b/elf/dl-open.c
> index a23e65926b..dcc24130fe 100644
> --- a/elf/dl-open.c
> +++ b/elf/dl-open.c
> @@ -760,8 +760,8 @@ dl_open_worker_begin (void *a)
> if (!args->libc_already_loaded)
> {
> /* dlopen cannot be used to load an initial libc by design. */
> - struct link_map *libc_map = GL(dl_ns)[args->nsid].libc_map;
> - _dl_call_libc_early_init (libc_map, false);
> + if (GL(dl_ns)[args->nsid].libc_map != NULL)
> + GL(dl_ns)[args->nsid].libc_map_early_init (false);
> }
>
> args->worker_continue = true;
> diff --git a/elf/dl-version.c b/elf/dl-version.c
> index cda0889209..d9ec44eed6 100644
> --- a/elf/dl-version.c
> +++ b/elf/dl-version.c
> @@ -23,6 +23,8 @@
> #include <string.h>
> #include <ldsodefs.h>
> #include <_itoa.h>
> +#include <gnu/lib-names.h>
> +#include <libc-early-init.h>
>
> #include <assert.h>
>
> @@ -359,6 +361,22 @@ _dl_check_map_versions (struct link_map *map, int verbose, int trace_mode)
> }
> }
>
> + /* Detect a libc.so loaded into this namespace. The
> + __libc_early_init lookup below means that we have to do this
> + after parsing the version data. */
> + if (GL(dl_ns)[map->l_ns].libc_map == NULL
> + && map->l_info[DT_SONAME] != NULL
> + && strcmp (((const char *) D_PTR (map, l_info[DT_STRTAB])
> + + map->l_info[DT_SONAME]->d_un.d_val), LIBC_SO) == 0)
> + {
> + /* Look up this symbol early to trigger a mismatch error before
> + relocation (which may call IFUNC resolvers, and those can
> + have an internal ABI dependency). */
> + GL(dl_ns)[map->l_ns].libc_map_early_init
> + = _dl_lookup_libc_early_init (map);
> + GL(dl_ns)[map->l_ns].libc_map = map;
> + }
> +
> /* When there is a DT_VERNEED entry with libc.so on DT_NEEDED, issue
> an error if there is a DT_RELR entry without GLIBC_ABI_DT_RELR
> dependency. */
> diff --git a/elf/libc-early-init.h b/elf/libc-early-init.h
> index a8edfadfb0..ac8c204bc7 100644
> --- a/elf/libc-early-init.h
> +++ b/elf/libc-early-init.h
> @@ -19,13 +19,10 @@
> #ifndef _LIBC_EARLY_INIT_H
> #define _LIBC_EARLY_INIT_H
>
> +#include <libc_early_init_name.h>
> +
> struct link_map;
>
> -/* If LIBC_MAP is not NULL, look up the __libc_early_init symbol in it
> - and call this function, with INITIAL as the argument. */
> -void _dl_call_libc_early_init (struct link_map *libc_map, _Bool initial)
> - attribute_hidden;
> -
> /* In the shared case, this function is defined in libc.so and invoked
> from ld.so (or on the fist static dlopen) after complete relocation
> of a new loaded libc.so, but before user-defined ELF constructors
> @@ -33,6 +30,18 @@ void _dl_call_libc_early_init (struct link_map *libc_map, _Bool initial)
> startup code. If INITIAL is true, the libc being initialized is
> the libc for the main program. INITIAL is false for libcs loaded
> for audit modules, dlmopen, and static dlopen. */
> -void __libc_early_init (_Bool initial);
> +void __libc_early_init (_Bool initial)
> +#ifdef SHARED
> +/* Redirect to the actual implementation name. */
> + __asm__ (LIBC_EARLY_INIT_NAME_STRING)
> +#endif
> + ;
> +
> +/* Attempts to find the appropriately named __libc_early_init function
> + in LIBC_MAP. On lookup failure, an exception is signaled,
> + indicating an ld.so/libc.so mismatch. */
> +__typeof (__libc_early_init) *_dl_lookup_libc_early_init (struct link_map *
> + libc_map)
> + attribute_hidden;
>
> #endif /* _LIBC_EARLY_INIT_H */
> diff --git a/elf/rtld.c b/elf/rtld.c
> index cbbaf4a331..910075c37f 100644
> --- a/elf/rtld.c
> +++ b/elf/rtld.c
> @@ -1707,15 +1707,6 @@ dl_main (const ElfW(Phdr) *phdr,
> /* Extract the contents of the dynamic section for easy access. */
> elf_get_dynamic_info (main_map, false, false);
>
> - /* If the main map is libc.so, update the base namespace to
> - refer to this map. If libc.so is loaded later, this happens
> - in _dl_map_object_from_fd. */
> - if (main_map->l_info[DT_SONAME] != NULL
> - && (strcmp (((const char *) D_PTR (main_map, l_info[DT_STRTAB])
> - + main_map->l_info[DT_SONAME]->d_un.d_val), LIBC_SO)
> - == 0))
> - GL(dl_ns)[LM_ID_BASE].libc_map = main_map;
> -
> /* Set up our cache of pointers into the hash table. */
> _dl_setup_hash (main_map);
> }
> @@ -2386,7 +2377,8 @@ dl_main (const ElfW(Phdr) *phdr,
> /* Relocation is complete. Perform early libc initialization. This
> is the initial libc, even if audit modules have been loaded with
> other libcs. */
> - _dl_call_libc_early_init (GL(dl_ns)[LM_ID_BASE].libc_map, true);
> + if (GL(dl_ns)[LM_ID_BASE].libc_map != NULL)
> + GL(dl_ns)[LM_ID_BASE].libc_map_early_init (true);
>
> /* Do any necessary cleanups for the startup OS interface code.
> We do these now so that no calls are made after rtld re-relocation
> diff --git a/manual/install.texi b/manual/install.texi
> index c775005581..6d43599a47 100644
> --- a/manual/install.texi
> +++ b/manual/install.texi
> @@ -144,6 +144,15 @@ dynamic linker diagnostics to run on CPUs which are not compatible with
> the rest of @theglibc{}, for example, due to compiler flags which target
> a later instruction set architecture (ISA).
>
> +@item --with-extra-version-id=@var{string}
> +Use @var{string} as part of the fingerprint that is used by the dynamic
> +linker to detect an incompatible version of @file{libc.so}. For
> +example, @var{string} could be the full package version and release
> +string used by a distribution build of @theglibc{}. This way,
> +concurrent process creation during a package update will fail with an
> +error message, @emph{ld.so/libc.so mismatch detected (upgrade in
> +progress?)}, rather than crashing mysteriously.
> +
> @item --with-timeoutfactor=@var{NUM}
> Specify an integer @var{NUM} to scale the timeout of test programs.
> This factor can be changed at run time using @env{TIMEOUTFACTOR}
> diff --git a/scripts/libc_early_init_name.py b/scripts/libc_early_init_name.py
> new file mode 100644
> index 0000000000..a56c2008f3
> --- /dev/null
> +++ b/scripts/libc_early_init_name.py
> @@ -0,0 +1,89 @@
> +#!/usr/bin/python3
> +# Compute the hash-based name of the __libc_early_init function.
> +# Copyright (C) 2022 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +#
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <https://www.gnu.org/licenses/>.
> +
> +"""Compute the name of the __libc_early_init function, which is used
> +as a protocol version marker between ld.so and libc.so.
> +
> +The name contains a hash suffix, and the hash changes if certain key
> +files in the source tree change. Distributions can also configure
> +with --with-extra-version-id, to make the computed hash dependent on
> +the package version.
> +
> +"""
> +
> +import argparse
> +import hashlib
> +import os
> +import string
> +import sys
> +
> +def gnu_hash(s):
> +    """Computes the GNU hash of the string."""
> +    h = 5381
> +    for ch in s:
> +        if type(ch) is not int:
> +            ch = ord(ch)
> +        h = (h * 33 + ch) & 0xffffffff
> +    return h
OK.
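
A quick standalone sanity check of this function, as a sketch: the loop
is the usual GNU hash (start at 5381, multiply by 33, add each byte,
truncate to 32 bits). The input "exit" is an arbitrary example and is
not something the patch itself uses.

```python
def gnu_hash(s):
    """Computes the GNU hash of the string (same loop as in the script)."""
    h = 5381
    for ch in s:
        if type(ch) is not int:
            ch = ord(ch)
        h = (h * 33 + ch) & 0xffffffff
    return h


# An empty name leaves the initial value untouched.
assert gnu_hash("") == 5381
# str and bytes inputs hash identically for ASCII names.
assert gnu_hash("exit") == gnu_hash(b"exit")
print(hex(gnu_hash("exit")))  # 0x7c967e3f

# For comparison with the dl_new_hash constant that the patch removes
# from elf/dl-call-libc-early-init.c.
print(hex(gnu_hash("__libc_early_init")))
```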
> +
> +# Parse the command line.
> +parser = argparse.ArgumentParser(description=__doc__)
> +parser.add_argument('--output', metavar='PATH',
> + help='path to header file this tool generates')
> +parser.add_argument('--extra-version-id', metavar='ID',
> + help='extra string to influence hash computation')
> +parser.add_argument('inputs', metavar='PATH', nargs='*',
> + help='files whose contents influence the generated hash')
> +opts = parser.parse_args()
> +
> +# Obtain the blobs that affect the generated hash.
> +blobs = [(opts.extra_version_id or '').encode('UTF-8')]
> +for path in opts.inputs:
> +    with open(path, 'rb') as inp:
> +        blobs.append(inp.read())
> +
> +# Hash the file boundaries.
> +md = hashlib.sha256()
> +md.update(repr([len(blob) for blob in blobs]).encode('UTF-8'))
> +
> +# And then hash the file contents. Do not hash the paths, to avoid
> +# impacting reproducibility.
> +for blob in blobs:
> +    md.update(blob)
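
The length prefix matters because the blobs are fed to the hash back to
back. Without it, moving bytes from the end of one input to the start
of the next would produce the same digest. A small sketch of that
effect follows; the fingerprint helper and the sample blobs are made up
for illustration.

```python
import hashlib


def fingerprint(blobs, with_lengths=True):
    """Hash BLOBS the way the script does, optionally skipping the lengths."""
    md = hashlib.sha256()
    if with_lengths:
        md.update(repr([len(blob) for blob in blobs]).encode('UTF-8'))
    for blob in blobs:
        md.update(blob)
    return md.hexdigest()


a = [b'2.37', b'-1.fc38']
b = [b'2.37-1', b'.fc38']   # same bytes overall, different split

# Plain concatenation cannot tell the two inputs apart ...
assert fingerprint(a, with_lengths=False) == fingerprint(b, with_lengths=False)
# ... but hashing the list of lengths first can.
assert fingerprint(a) != fingerprint(b)
```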
> +
> +# These are the bits used to compute the suffix.
> +derived_bits = int.from_bytes(md.digest(), byteorder='big', signed=False)
> +
> +# These digits are used in the suffix (should result in base-62 encoding).
> +# They must be valid in C identifiers.
> +digits = string.digits + string.ascii_letters
> +
> +# Generate eight digits as a suffix. They should provide enough
> +# uniqueness (47.6 bits).
> +name = '__libc_early_init_'
> +for n in range(8):
> +    name += digits[derived_bits % len(digits)]
> +    derived_bits //= len(digits)
> +
> +# Write the output file.
> +with open(opts.output, 'w') if opts.output else sys.stdout as out:
> +    out.write('#define LIBC_EARLY_INIT_NAME {}\n'.format(name))
> +    out.write('#define LIBC_EARLY_INIT_NAME_STRING "{}"\n'.format(name))
> +    out.write('#define LIBC_EARLY_INIT_GNU_HASH {}\n'.format(
> +        gnu_hash(name)))
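
To see how --with-extra-version-id feeds into the symbol name, the
derivation can be repeated outside the build. This is a sketch that
follows the same steps as the script; the early_init_name helper, the
placeholder file contents, and the "2.37-1.example" version string are
assumptions for illustration only.

```python
import hashlib
import math
import string

DIGITS = string.digits + string.ascii_letters   # 62 symbols


def early_init_name(extra_version_id, file_blobs):
    """Re-derive the symbol name using the same steps as the script."""
    blobs = [extra_version_id.encode('UTF-8')] + list(file_blobs)
    md = hashlib.sha256()
    md.update(repr([len(blob) for blob in blobs]).encode('UTF-8'))
    for blob in blobs:
        md.update(blob)
    bits = int.from_bytes(md.digest(), byteorder='big', signed=False)
    name = '__libc_early_init_'
    for _ in range(8):
        name += DIGITS[bits % len(DIGITS)]
        bits //= len(DIGITS)
    return name


# Placeholder contents; the real inputs are NEWS, ldsodefs.h and link.h.
files = [b'NEWS contents', b'ldsodefs.h contents', b'link.h contents']
plain = early_init_name('', files)
distro = early_init_name('2.37-1.example', files)

print(plain, distro)
# Eight base-62 digits carry 8 * log2(62) bits.
print(round(8 * math.log2(len(DIGITS)), 1))   # 47.6
```

Two builds from identical sources but different package releases get
different names this way, which is what makes the mismatch visible
during a package update.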
> diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
> index 050a3032de..275dbc95ce 100644
> --- a/sysdeps/generic/ldsodefs.h
> +++ b/sysdeps/generic/ldsodefs.h
> @@ -333,6 +333,10 @@ struct rtld_global
> its link map. */
> struct link_map *libc_map;
>
> + /* __libc_early_init function in libc_map. Initialized at the
> + same time as libc_map. */
> + void (*libc_map_early_init) (_Bool);
> +
> /* Search table for unique objects. */
> struct unique_sym_table
> {
>
> base-commit: 06e4033c83276ed349d315bfbf651be56c3e2954
>
--
Cheers,
Carlos.