[PATCH 2/2] Detect ld.so and libc.so version inconsistency during startup
Carlos O'Donell
carlos@redhat.com
Tue Aug 23 15:13:00 GMT 2022
On 8/19/22 06:16, Florian Weimer via Libc-alpha wrote:
> The files NEWS, include/link.h, and sysdeps/generic/ldsodefs.h
> contribute to the version fingerprint used for detection. The
> fingerprint can be further refined using the --with-extra-version-id
> configure argument.
It makes sense to use link.h, ldsodefs.h because they define the interface between
ld.so and libc. NEWS and the additional --with-extra-version-id provide further
differentiation, which allows per-distro NEVRA bumping to ensure this works
reliably during upgrade.
This solution will fail process starts during the upgrade window where there
is a mismatch and the system will need to restart those processes.
Needs a v2 to fix typos.
Needs a v2 to fix the implementation.
This solution doesn't quite work and I haven't debugged it.
Works:
[carlos@athas glibc-review]$ ./elf/ld.so --library-path /lib64 /bin/true
Fatal glibc error: ld.so/libc.so mismatch detected
Fails:
[carlos@athas glibc-review]$ ./elf/ld.so --library-path /lib64 /bin/bash
Segmentation fault (core dumped)
I would expect the same mismatch detected message. This is possibly as
difficult a problem as the early ISA detection diagnostic since it can
involve resolving early ifuncs and running that code early means running
mismatched ld.so/libc ifuncs.
LD_DEBUG=all shows:
886519: symbol=gettimeofday; lookup in file=/bin/bash [0]
886519: symbol=gettimeofday; lookup in file=/lib64/libtinfo.so.6 [0]
886519: symbol=gettimeofday; lookup in file=/lib64/libc.so.6 [0]
Segmentation fault (core dumped)
I'm not sure what is going wrong here. Early vDSO setup? Early ifunc?
> ---
> INSTALL | 9 ++++
> Makerules | 14 ++++++
> NEWS | 7 ++-
> config.make.in | 1 +
> configure | 11 +++++
> configure.ac | 5 ++
> csu/libc-start.c | 1 +
> elf/Versions | 4 +-
> elf/dl-call-libc-early-init.c | 9 ++--
> elf/libc-early-init.h | 9 +++-
> manual/install.texi | 9 ++++
> scripts/libc_early_init_name.py | 85 +++++++++++++++++++++++++++++++++
> 12 files changed, 156 insertions(+), 8 deletions(-)
> create mode 100644 scripts/libc_early_init_name.py
>
> diff --git a/INSTALL b/INSTALL
> index 659f75a97f..7ad8f96e63 100644
> --- a/INSTALL
> +++ b/INSTALL
> @@ -120,6 +120,15 @@ if 'CFLAGS' is specified it must enable optimization. For example:
> compiler flags which target a later instruction set architecture
> (ISA).
>
> +'--with-extra-version-id=STRING'
> + Use STRING as part of the fingerprint that is used by the dynamic
> + linker to detect an incompatible version of 'libc.so'. For
> + example, STRING could be the full package version and release
> + string used by a distribution build of the GNU C Library. This
> + way, concurrent process creation during a package update will faill
s/fail/fail/g
> + with an error message, _Fatal glibc error: ld.so/libc.so mismatch
> + detected_, rather than crashing mysteriously.
OK.
> +
> '--with-timeoutfactor=NUM'
> Specify an integer NUM to scale the timeout of test programs. This
> factor can be changed at run time using 'TIMEOUTFACTOR' environment
> diff --git a/Makerules b/Makerules
> index d1e139d03c..756c1f181c 100644
> --- a/Makerules
> +++ b/Makerules
> @@ -112,6 +112,20 @@ before-compile := $(common-objpfx)first-versions.h \
> $(common-objpfx)ldbl-compat-choose.h $(before-compile)
> $(common-objpfx)first-versions.h: $(common-objpfx)versions.stmp
> $(common-objpfx)ldbl-compat-choose.h: $(common-objpfx)versions.stmp
> +
> +# libc_early_init_name.h provides the actual name of the
> +# __libc_early_init function. It is used as a protocol version marker
> +# between ld.so and libc.so
OK. Good comment.
> +before-compile := $(common-objpfx)libc_early_init_name.h $(before-compile)
> +libc_early_init_name-deps = \
> + $(..)NEWS $(..)sysdeps/generic/ldsodefs.h $(..)include/link.h
> +$(common-objpfx)libc_early_init_name.h: $(..)scripts/libc_early_init_name.py \
> + $(common-objpfx)config.make $(libc_early_init_name-deps)
> + $(PYTHON) $(..)scripts/libc_early_init_name.py \
> + --output=$@T \
> + --extra-version-id="$(extra-version-id)" \
> + $(libc_early_init_name-deps)
> + $(move-if-change) $@T $@
OK.
> endif # avoid-generated
> endif # $(build-shared) = yes
>
> diff --git a/NEWS b/NEWS
> index f9bef48a8f..b8a9376e1e 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -9,7 +9,12 @@ Version 2.37
>
> Major new features:
>
> - [Add new features here]
> +* The dynamic loader now prints an error message, "Fatal glibc error:
> + ld.so/libc.so mismatch detected" if it detects that the version of
> + libc.so it loaded comes from a different build of glibc. The new
> + configure option --with-extra-version-id can be used to specify an
> + arbitrary string that affects the comptuation of the version
> + fingerprint.
OK.
>
> Deprecated and removed features, and other changes affecting compatibility:
>
> diff --git a/config.make.in b/config.make.in
> index d7c416cbea..ecaffbfd4b 100644
> --- a/config.make.in
> +++ b/config.make.in
> @@ -98,6 +98,7 @@ build-hardcoded-path-in-tests= @hardcoded_path_in_tests@
> build-pt-chown = @build_pt_chown@
> have-tunables = @have_tunables@
> pthread-in-libc = @pthread_in_libc@
> +extra-version-id = @extra_version_id@
OK.
>
> # Build tools.
> CC = @CC@
> diff --git a/configure b/configure
> index ff2c406b3b..c576f9f133 100755
> --- a/configure
> +++ b/configure
> @@ -760,6 +760,7 @@ with_headers
> with_default_link
> with_nonshared_cflags
> with_rtld_early_cflags
> +with_extra_version_id
> with_timeoutfactor
> enable_sanity_checks
> enable_shared
> @@ -1481,6 +1482,9 @@ Optional Packages:
> build nonshared libraries with additional CFLAGS
> --with-rtld-early-cflags=CFLAGS
> build early initialization with additional CFLAGS
> + --extra-version-id=STRING
> + specify an extra version string to use in internal
> + ABI checks
OK.
> --with-timeoutfactor=NUM
> specify an integer to scale the timeout
> --with-cpu=CPU select code for CPU variant
> @@ -3397,6 +3401,13 @@ fi
>
>
>
> +# Check whether --with-extra-version-id was given.
> +if test "${with_extra_version_id+set}" = set; then :
> + withval=$with_extra_version_id; extra_version_id="$withval"
> +fi
> +
> +
> +
> # Check whether --with-timeoutfactor was given.
> if test "${with_timeoutfactor+set}" = set; then :
> withval=$with_timeoutfactor; timeoutfactor=$withval
> diff --git a/configure.ac b/configure.ac
> index eb5bc6a131..68baeee4d7 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -169,6 +169,11 @@ AC_ARG_WITH([rtld-early-cflags],
> [rtld_early_cflags=])
> AC_SUBST(rtld_early_cflags)
>
> +AC_ARG_WITH([extra-version-id],
> + AS_HELP_STRING([--extra-version-id=STRING],
> + [specify an extra version string to use in internal ABI checks]),
> + [extra_version_id="$withval"])
> +
OK.
> AC_ARG_WITH([timeoutfactor],
> AS_HELP_STRING([--with-timeoutfactor=NUM],
> [specify an integer to scale the timeout]),
> diff --git a/csu/libc-start.c b/csu/libc-start.c
> index 543560f36c..522f14bbaf 100644
> --- a/csu/libc-start.c
> +++ b/csu/libc-start.c
> @@ -37,6 +37,7 @@
> #include <elf-initfini.h>
> #include <shlib-compat.h>
>
> +#include <elf/libc-early-init.h>
OK.
> #include <elf/dl-tunables.h>
>
> extern void __libc_init_first (int argc, char **argv, char **envp);
> diff --git a/elf/Versions b/elf/Versions
> index a9ff278de7..6260c0fe03 100644
> --- a/elf/Versions
> +++ b/elf/Versions
> @@ -29,8 +29,8 @@ libc {
> __placeholder_only_for_empty_version_map;
> }
> GLIBC_PRIVATE {
> - # functions used in other libraries
> - __libc_early_init;
> + # A pattern is needed here because the suffix is dynamically generated.
> + __libc_early_init_*;
OK.
I see this in the final build:
158: 0000000000148ed0 185 FUNC GLOBAL DEFAULT 10 __libc_early_init_iw1yORkW@@GLIBC_PRIVATE
>
> # Internal error handling support. Interposes the functions in ld.so.
> _dl_signal_exception; _dl_catch_exception;
> diff --git a/elf/dl-call-libc-early-init.c b/elf/dl-call-libc-early-init.c
> index ee9860e3ab..1d9436ad50 100644
> --- a/elf/dl-call-libc-early-init.c
> +++ b/elf/dl-call-libc-early-init.c
> @@ -16,7 +16,6 @@
> License along with the GNU C Library; if not, see
> <https://www.gnu.org/licenses/>. */
>
> -#include <assert.h>
> #include <ldsodefs.h>
> #include <libc-early-init.h>
> #include <link.h>
> @@ -30,11 +29,13 @@ _dl_call_libc_early_init (struct link_map *libc_map, _Bool initial)
> return;
>
> const ElfW(Sym) *sym
> - = _dl_lookup_direct (libc_map, "__libc_early_init",
> - 0x069682ac, /* dl_new_hash output. */
> + = _dl_lookup_direct (libc_map, LIBC_EARLY_INIT_NAME_STRING,
> + LIBC_EARLY_INIT_GNU_HASH,
> "GLIBC_PRIVATE",
OK.
> 0x0963cf85); /* _dl_elf_hash output. */
> - assert (sym != NULL);
> + if (sym == NULL)
> + _dl_fatal_printf (
> +"Fatal glibc error: ld.so/libc.so mismatch detected\n");
OK. Look for the new version-specific symbol in libc.
> __typeof (__libc_early_init) *early_init
> = DL_SYMBOL_ADDRESS (libc_map, sym);
> early_init (initial);
> diff --git a/elf/libc-early-init.h b/elf/libc-early-init.h
> index a8edfadfb0..6774c3e7c0 100644
> --- a/elf/libc-early-init.h
> +++ b/elf/libc-early-init.h
> @@ -19,6 +19,8 @@
> #ifndef _LIBC_EARLY_INIT_H
> #define _LIBC_EARLY_INIT_H
>
> +#include <libc_early_init_name.h>
> +
> struct link_map;
>
> /* If LIBC_MAP is not NULL, look up the __libc_early_init symbol in it
> @@ -33,6 +35,11 @@ void _dl_call_libc_early_init (struct link_map *libc_map, _Bool initial)
> startup code. If INITIAL is true, the libc being initialized is
> the libc for the main program. INITIAL is false for libcs loaded
> for audit modules, dlmopen, and static dlopen. */
> -void __libc_early_init (_Bool initial);
> +void __libc_early_init (_Bool initial)
> +#ifdef SHARED
> +/* Redirect to the actual implementation name. */
> + __asm__ (LIBC_EARLY_INIT_NAME_STRING)
OK.
> +#endif
> + ;
>
> #endif /* _LIBC_EARLY_INIT_H */
> diff --git a/manual/install.texi b/manual/install.texi
> index c775005581..8b225ab3bb 100644
> --- a/manual/install.texi
> +++ b/manual/install.texi
> @@ -144,6 +144,15 @@ dynamic linker diagnostics to run on CPUs which are not compatible with
> the rest of @theglibc{}, for example, due to compiler flags which target
> a later instruction set architecture (ISA).
>
> +@item --with-extra-version-id=@var{string}
> +Use @var{string} as part of the fingerprint that is used by the dynamic
> +linker to detect an incompatible version of @file{libc.so}. For
> +example, @var{string} could be the full package version and release
> +string used by a distribution build of @theglibc{}. This way,
> +concurrent process creation during a package update will faill with an
s/faill/fail/g
> +error message, @emph{Fatal glibc error: ld.so/libc.so mismatch detected},
> +rather than crashing mysteriously.
> +
> @item --with-timeoutfactor=@var{NUM}
> Specify an integer @var{NUM} to scale the timeout of test programs.
> This factor can be changed at run time using @env{TIMEOUTFACTOR}
> diff --git a/scripts/libc_early_init_name.py b/scripts/libc_early_init_name.py
> new file mode 100644
> index 0000000000..465cbe9dae
> --- /dev/null
> +++ b/scripts/libc_early_init_name.py
> @@ -0,0 +1,85 @@
> +#!/usr/bin/python3
> +# Compute the hash-based name of the __libc_early_init function.
> +# Copyright (C) 2022 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +#
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <https://www.gnu.org/licenses/>.
> +
> +"""Compute the name of the __libc_early_init function, which is used
> +as a protocol version marker between ld.so and libc.so.
> +
> +The name contains a hash suffix, and the hash changes if certain key
> +files in the source tree change. Distributions can also configure
> +with --with-extra-version-id, to make the computed hash dependent on
> +the package version.
OK.
> +
> +"""
> +
> +import argparse
> +import hashlib
> +import os
> +import string
> +import sys
> +
> +# Make available glibc Python modules.
> +sys.path.append(os.path.dirname(os.path.realpath(__file__)))
> +
> +import glibcelf
> +
> +# Parse the command line.
> +parser = argparse.ArgumentParser(description=__doc__)
> +parser.add_argument('--output', metavar='PATH',
> + help='path to header file this tool generates')
> +parser.add_argument('--extra-version-id', metavar='ID',
> + help='extra string to influence hash computation')
> +parser.add_argument('inputs', metavar='PATH', nargs='*',
> + help='files whose contents influences the generated hash')
> +opts = parser.parse_args()
> +
> +# Obtain the blobs that affect the generated hash.
> +blobs = [(opts.extra_version_id or '').encode('UTF-8')]
> +for path in opts.inputs:
> + with open(path, 'rb') as inp:
> + blobs.append(inp.read())
> +
> +# Hash the file boundaries.
> +md = hashlib.sha256()
> +md.update(repr([len(blob) for blob in blobs]).encode('UTF-8'))
> +
> +# And then hash the file contents. Do not hash the paths, to avoid
> +# impacting reproducibility.
> +for blob in blobs:
> + md.update(blob)
> +
> +# These are the bits used to compute the suffix.
> +derived_bits = int.from_bytes(md.digest(), byteorder='big', signed=False)
> +
> +# These digits are used in the suffix (should result in base-62 encoding).
> +# They must be valid in C identifiers.
> +digits = string.digits + string.ascii_letters
> +
> +# Generate eight digits as a suffix. They should provide enough
> +# uniqueness (47.6 bits).
> +name = '__libc_early_init_'
> +for n in range(8):
> + name += digits[derived_bits % len(digits)]
> + derived_bits //= len(digits)
> +
> +# Write the output file.
> +with open(opts.output, 'w') if opts.output else sys.stdout as out:
> + out.write('#define LIBC_EARLY_INIT_NAME {}\n'.format(name))
> + out.write('#define LIBC_EARLY_INIT_NAME_STRING "{}"\n'.format(name))
> + out.write('#define LIBC_EARLY_INIT_GNU_HASH {}\n'.format(
> + glibcelf.gnu_hash(name)))
--
Cheers,
Carlos.
More information about the Libc-alpha
mailing list