This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH][BZ 17251] Calculate RPATH $ORIGIN from absolute path
- From: Brennan Shacklett <bpshacklett at gmail dot com>
- To: libc-alpha at sourceware dot org
- Cc: Brennan Shacklett <bpshacklettbp at gmail dot com>
- Date: Tue, 26 Aug 2014 12:37:03 -0700
- Subject: Re: [PATCH][BZ 17251] Calculate RPATH $ORIGIN from absolute path
- Authentication-results: sourceware.org; auth=none
- References: <1407860209-21797-1-git-send-email-bpshacklett at gmail dot com>
On Tue, Aug 12, 2014 at 09:16:49AM -0700, Brennan Shacklett wrote:
> Bug 17251 deals with (what I believe is) the incorrect calculation of $ORIGIN
> for shared libraries. Currently $ORIGIN is calculated correctly for executables
> on linux, because readlink is called on /proc/self/exe, which means the
> resulting path is absolute and has no symlinks.
> Shared libraries with relative paths on the other hand are based
> off of appending the name / path of the library to the current working
> directory, which means if the library is a symlink, it is not followed, which
> breaks RPATH $ORIGIN in the following scenario:
>
> libone.so and libtwo.so are both in ~/lib
> libone.so needs libtwo.so, so libone.so has an RPATH of $ORIGIN.
> If I run ldd on ~/lib/libone.so, libtwo.so is found and all is good.
> If I create a symlink named ~/libone.so to ~/lib/libone.so, and run ldd on it
> libtwo is not found, because $ORIGIN for the library is calculated as ~
> instead of ~/lib.
>
> If I was to repeat the above test but instead of libone.so use an executable,
> everything would work as expected, which is why I think the shared library
> behavior is a bug.
>
> I ran into this in the real world when attempting to dynamically load a .so
> for python SWIG bindings, because the .so which python was loading was
> symlinked to the actual library directory where all of the .so's dependencies
> were located.
>
> The attached patch fixes this behavior by having realname evaluated with
> __realpath in _dl_map_object_from_fd, before _dl_new_object is called (which is
> where l_origin is assigned). This also somewhat simplifies the code in
> elf/dl-object.c, because realname is guaranteed to be an absolute path
> generated by realpath, so I was able to remove the code dealing with relative
> paths.
> I used the implementation of dl_realpath from sysdeps/tile/dl-runtime.c as the
> generic implementation, and extended it with lstat and readlink in the linux
> version. This means the bug is still present on other systems than linux (the
> generic implementation only returns an absolute path, it doesn't do anything
> with symlinks), but if there is a way to get the generic version to follow
> symlinks on all systems please let me know.
>
> I tested and ran the elf test suite on Gentoo Linux x86-64. If this change is
> wanted, I will happily write a test to go along with it.
> I don't have any sort of copyright attribution set up with the FSF, let me know
> if it is necessary for this change.
>
> Thanks,
> Brennan Shacklett
>
> 2014-08-12 Brennan Shacklett <bpshacklett@gmail.com>
>
> [BZ #17251]
> * elf/Makefile (rtld-routines): Add dl-realpath.
> * elf/dl-load.c (_dl_map_object_from_fd):
> Calculate realname with __realpath.
> * elf/dl-object.c (_dl_new_object): Remove relative path handling.
> * elf/dl-realpath.c: New file. Generic implementation of __realpath.
> * elf/tile/dl-runtime.c (_dl_after_load): Remove dl_realpath and
> change call to __realpath.
> * sysdeps/unix/sysv/linux/Makefile (sysdep-rtld-routines):
> Add dl-lxstat64.
> * sysdeps/unix/sysv/linux/dl-lxstat64.c: New file.
> * sysdeps/unix/sysv/linux/dl-realpath.c: New file. Handle symlinks
> unlike generic implementation.
> ---
> elf/Makefile | 3 +-
> elf/dl-load.c | 18 +++++
> elf/dl-object.c | 54 ++-------------
> elf/dl-realpath.c | 85 ++++++++++++++++++++++++
> sysdeps/tile/dl-runtime.c | 66 +------------------
> sysdeps/unix/sysv/linux/Makefile | 2 +-
> sysdeps/unix/sysv/linux/dl-lxstat64.c | 1 +
> sysdeps/unix/sysv/linux/dl-realpath.c | 121 ++++++++++++++++++++++++++++++++++
> 8 files changed, 236 insertions(+), 114 deletions(-)
> create mode 100644 elf/dl-realpath.c
> create mode 100644 sysdeps/unix/sysv/linux/dl-lxstat64.c
> create mode 100644 sysdeps/unix/sysv/linux/dl-realpath.c
>
> diff --git a/elf/Makefile b/elf/Makefile
> index 25012cc..c130f21 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -43,7 +43,8 @@ shared-only-routines += dl-caller
>
> # ld.so uses those routines, plus some special stuff for being the program
> # interpreter and operating independent of libc.
> -rtld-routines := rtld $(dl-routines) dl-sysdep dl-environ dl-minimal
> +rtld-routines := rtld $(dl-routines) dl-sysdep dl-environ dl-minimal \
> + dl-realpath
> all-rtld-routines = $(rtld-routines) $(sysdep-rtld-routines)
>
> CFLAGS-dl-runtime.c = -fexceptions -fasynchronous-unwind-tables
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index 016a99c..7353837 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -891,6 +891,7 @@ _dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp,
> int errval = 0;
> struct r_debug *r = _dl_debug_initialize (0, nsid);
> bool make_consistent = false;
> + char *absolutename = NULL;
>
> /* Get file information. */
> if (__glibc_unlikely (__fxstat64 (_STAT_VER, fd, &st) < 0))
> @@ -902,6 +903,23 @@ _dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp,
> lose (errval, fd, name, realname, l, errstring,
> make_consistent ? r : NULL, nsid);
> }
> + /* Find absolute pathname for object */
> + absolutename = (char *) malloc (PATH_MAX);
> + if (!absolutename)
> + {
> + errstring = N_("cannot allocate memory for absolute path");
> + goto call_lose_errno;
> + }
> +
> + if (!__realpath (realname, absolutename))
> + {
> + free (absolutename);
> + errstring = N_("cannot find absolute path of shared object");
> + goto call_lose_errno;
> + }
> +
> + free (realname);
> + realname = absolutename;
>
> /* Look again to see if the real name matched another already loaded. */
> for (l = GL(dl_ns)[nsid]._ns_loaded; l; l = l->l_next)
> diff --git a/elf/dl-object.c b/elf/dl-object.c
> index afd80a6..d93935b 100644
> --- a/elf/dl-object.c
> +++ b/elf/dl-object.c
> @@ -158,55 +158,15 @@ _dl_new_object (char *realname, const char *libname, int type,
> char *origin;
> char *cp;
>
> - if (realname[0] == '/')
> + /* It is an absolute path, calculated by realpath. Use it.
> + * But we have to make a copy since we strip out the trailing slash. */
> + assert (realname[0] == '/');
> + cp = origin = (char *) malloc (realname_len);
> + if (origin == NULL)
> {
> - /* It is an absolute path. Use it. But we have to make a
> - copy since we strip out the trailing slash. */
> - cp = origin = (char *) malloc (realname_len);
> - if (origin == NULL)
> - {
> - origin = (char *) -1;
> - goto out;
> - }
> + origin = (char *) -1;
> + goto out;
> }
> - else
> - {
> - size_t len = realname_len;
> - char *result = NULL;
> -
> - /* Get the current directory name. */
> - origin = NULL;
> - do
> - {
> - char *new_origin;
> -
> - len += 128;
> - new_origin = (char *) realloc (origin, len);
> - if (new_origin == NULL)
> - /* We exit the loop. Note that result == NULL. */
> - break;
> - origin = new_origin;
> - }
> - while ((result = __getcwd (origin, len - realname_len)) == NULL
> - && errno == ERANGE);
> -
> - if (result == NULL)
> - {
> - /* We were not able to determine the current directory.
> - Note that free(origin) is OK if origin == NULL. */
> - free (origin);
> - origin = (char *) -1;
> - goto out;
> - }
> -
> - /* Find the end of the path and see whether we have to add a
> - slash. We could use rawmemchr but this need not be
> - fast. */
> - cp = (strchr) (origin, '\0');
> - if (cp[-1] != '/')
> - *cp++ = '/';
> - }
> -
> /* Add the real file name. */
> cp = __mempcpy (cp, realname, realname_len);
>
> diff --git a/elf/dl-realpath.c b/elf/dl-realpath.c
> new file mode 100644
> index 0000000..2549bb3
> --- /dev/null
> +++ b/elf/dl-realpath.c
> @@ -0,0 +1,85 @@
> +/* Dynamic loader version of realpath.
> + Copyright (C) 2014 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <limits.h>
> +#include <ldsodefs.h>
> +
> +/* Simplified implementation of realpath(): no dynamic memory use, no lstat(),
> + no set_errno(), no valid "rpath" on error, etc. This simplifies cases
> + involving relative paths, specifically where $ORIGIN needs to be
> + calculated. For this relatively rare case, one could also imagine using
> + link_map.l_origin to avoid the getcwd() here, but the simpler code
> + here seems like a better solution. */
> +char * internal_function
> +__realpath (const char *name, char *rpath)
> +{
> + char *dest;
> + const char *start, *end;
> +
> + if (name[0] != '/')
> + {
> + if (!__getcwd (rpath, PATH_MAX))
> + return NULL;
> + dest = __rawmemchr (rpath, '\0');
> + }
> + else
> + {
> + rpath[0] = '/';
> + dest = rpath + 1;
> + }
> +
> + for (start = end = name; *start; start = end)
> + {
> + /* Skip sequence of multiple path-separators. */
> + while (*start == '/')
> + ++start;
> +
> + /* Find end of path component. */
> + for (end = start; *end && *end != '/'; ++end)
> + /* Nothing. */;
> +
> + if (end - start == 0)
> + break;
> + else if (end - start == 1 && start[0] == '.')
> + /* nothing */;
> + else if (end - start == 2 && start[0] == '.' && start[1] == '.')
> + {
> + /* Back up to previous component, ignore if at root already. */
> + if (dest > rpath + 1)
> + while ((--dest)[-1] != '/');
> + }
> + else
> + {
> + if (dest[-1] != '/')
> + *dest++ = '/';
> +
> + if (dest + (end - start) >= rpath + PATH_MAX)
> + return NULL;
> +
> + dest = __mempcpy (dest, start, end - start);
> + *dest = '\0';
> + }
> + }
> + if (dest > rpath + 1 && dest[-1] == '/')
> + --dest;
> + *dest = '\0';
> +
> + return rpath;
> +}
> diff --git a/sysdeps/tile/dl-runtime.c b/sysdeps/tile/dl-runtime.c
> index bcc00bc..45251d0 100644
> --- a/sysdeps/tile/dl-runtime.c
> +++ b/sysdeps/tile/dl-runtime.c
> @@ -29,70 +29,6 @@
> #include <arch/sim.h>
> #include <dl-unmap-segments.h>
>
> -/* Like realpath(), but simplified: no dynamic memory use, no lstat(),
> - no set_errno(), no valid "rpath" on error, etc. This handles some
> - simple cases where the simulator might not have a valid entry for
> - a loaded Elf object, in particular dlopen() with a relative path.
> - For this relatively rare case, one could also imagine using
> - link_map.l_origin to avoid the getcwd() here, but the simpler code
> - here seems like a better solution. */
> -static char *
> -dl_realpath (const char *name, char *rpath)
> -{
> - char *dest;
> - const char *start, *end;
> -
> - if (name[0] != '/')
> - {
> - if (!__getcwd (rpath, PATH_MAX))
> - return NULL;
> - dest = __rawmemchr (rpath, '\0');
> - }
> - else
> - {
> - rpath[0] = '/';
> - dest = rpath + 1;
> - }
> -
> - for (start = end = name; *start; start = end)
> - {
> - /* Skip sequence of multiple path-separators. */
> - while (*start == '/')
> - ++start;
> -
> - /* Find end of path component. */
> - for (end = start; *end && *end != '/'; ++end)
> - /* Nothing. */;
> -
> - if (end - start == 0)
> - break;
> - else if (end - start == 1 && start[0] == '.')
> - /* nothing */;
> - else if (end - start == 2 && start[0] == '.' && start[1] == '.')
> - {
> - /* Back up to previous component, ignore if at root already. */
> - if (dest > rpath + 1)
> - while ((--dest)[-1] != '/');
> - }
> - else
> - {
> - if (dest[-1] != '/')
> - *dest++ = '/';
> -
> - if (dest + (end - start) >= rpath + PATH_MAX)
> - return NULL;
> -
> - dest = __mempcpy (dest, start, end - start);
> - *dest = '\0';
> - }
> - }
> - if (dest > rpath + 1 && dest[-1] == '/')
> - --dest;
> - *dest = '\0';
> -
> - return rpath;
> -}
> -
> /* Support notifying the simulator about new objects. */
> void internal_function
> _dl_after_load (struct link_map *l)
> @@ -117,7 +53,7 @@ _dl_after_load (struct link_map *l)
> DLPUTC (':');
>
> /* Write the library path, including the terminating '\0'. */
> - path = dl_realpath (l->l_name, pathbuf) ?: l->l_name;
> + path = __realpath (l->l_name, pathbuf) ?: l->l_name;
> for (size_t i = 0;; i++)
> {
> DLPUTC (path[i]);
> diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
> index 9ad6d22..4525429 100644
> --- a/sysdeps/unix/sysv/linux/Makefile
> +++ b/sysdeps/unix/sysv/linux/Makefile
> @@ -176,7 +176,7 @@ endif
>
> ifeq ($(subdir),elf)
> sysdep-rtld-routines += dl-brk dl-sbrk dl-getcwd dl-openat64 dl-opendir \
> - dl-fxstatat64
> + dl-fxstatat64 dl-lxstat64
>
> CPPFLAGS-lddlibc4 += -DNOT_IN_libc
>
> diff --git a/sysdeps/unix/sysv/linux/dl-lxstat64.c b/sysdeps/unix/sysv/linux/dl-lxstat64.c
> new file mode 100644
> index 0000000..63e6800
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/dl-lxstat64.c
> @@ -0,0 +1 @@
> +#include <lxstat64.c>
> diff --git a/sysdeps/unix/sysv/linux/dl-realpath.c b/sysdeps/unix/sysv/linux/dl-realpath.c
> new file mode 100644
> index 0000000..91652dc
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/dl-realpath.c
> @@ -0,0 +1,121 @@
> +/* Dynamic loader version of realpath for linux.
> + Copyright (C) 2014 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <limits.h>
> +#include <ldsodefs.h>
> +#include <sys/stat.h>
> +#include <kernel-features.h>
> +#include <sysdep.h>
> +
> +/* Simplified version of realpath which extends elf/dl-realpath.c to use
> + * linux syscalls for handling symlinks */
> +char * internal_function
> +__realpath (const char *name, char *rpath)
> +{
> + char *dest;
> + char extra_buf[PATH_MAX];
> + const char *start, *end;
> + int num_links = 0;
> +
> + if (name[0] != '/')
> + {
> + if (!__getcwd (rpath, PATH_MAX))
> + return NULL;
> + dest = __rawmemchr (rpath, '\0');
> + }
> + else
> + {
> + rpath[0] = '/';
> + dest = rpath + 1;
> + }
> +
> + for (start = end = name; *start; start = end)
> + {
> + struct stat64 st;
> + int n;
> + /* Skip sequence of multiple path-separators. */
> + while (*start == '/')
> + ++start;
> +
> + /* Find end of path component. */
> + for (end = start; *end && *end != '/'; ++end)
> + /* Nothing. */;
> +
> + if (end - start == 0)
> + break;
> + else if (end - start == 1 && start[0] == '.')
> + /* nothing */;
> + else if (end - start == 2 && start[0] == '.' && start[1] == '.')
> + {
> + /* Back up to previous component, ignore if at root already. */
> + if (dest > rpath + 1)
> + while ((--dest)[-1] != '/');
> + }
> + else
> + {
> + if (dest[-1] != '/')
> + *dest++ = '/';
> +
> + if (dest + (end - start) >= rpath + PATH_MAX)
> + return NULL;
> +
> + dest = __mempcpy (dest, start, end - start);
> + *dest = '\0';
> + if (__lxstat64 (_STAT_VER, rpath, &st) < 0)
> + return NULL;
> +
> + if (S_ISLNK (st.st_mode))
> + {
> + char buf[PATH_MAX];
> + size_t len;
> +
> + INTERNAL_SYSCALL_DECL (err);
> +
> + n = INTERNAL_SYSCALL (readlink, err, 3, rpath, buf,
> + PATH_MAX - 1);
> +
> + if (n <= 0 || buf[0] == '[')
> + return NULL;
> + buf[n] = '\0';
> +
> + len = strlen (end);
> + if ((long int) (n + len) >= PATH_MAX)
> + return NULL;
> +
> + /* Careful here, end may be a pointer into extra_buf... */
> + memmove (&extra_buf[n], end, len + 1);
> + name = end = memcpy (extra_buf, buf, n);
> +
> + if (buf[0] == '/')
> + dest = rpath + 1; /* Absolute symlink */
> + else
> + /* Back up to previous component, ignore if at root already: */
> + if (dest > rpath + 1)
> + while ((--dest)[-1] != '/');
> + }
> + }
> + }
> + if (dest > rpath + 1 && dest[-1] == '/')
> + --dest;
> + *dest = '\0';
> +
> + return rpath;
> +
> +}
> --
> 2.0.4
>
ping...