The recent patch for bug 20019 causes an unfortunate effect when a shared library that calls longjmp (but does not use libpthread) is linked into a program that links -lpthread. As a slight variation on the testcase for that bug: $ cat main.c void foo (void); int main () { foo (); return 0; } $ cat foo.c void bar (void *dst, void *src); void foo (void) { char dst[50]; char src[50]; bar (dst, src); } $ cat bar.c #include <setjmp.h> void bar (void *dst, void *src) { jmp_buf j; longjmp(j,1); } $ gcc -O2 -c -o main.o main.c $ gcc -O2 -fPIC -c -o foo.o foo.c $ gcc -O2 -fPIC -c -o bar.o bar.c $ ld.gold -shared -z now -o libbar.so bar.o $ ld.gold -shared -z now -o libfoo.so foo.o libbar.so $ gcc -o foo main.o libfoo.so -Wl,-rpath,. -lpthread $ ./foo ./foo: Relink `./libbar.so' with `libpthread.so.0' for IFUNC symbol `longjmp'
The bug is here: $ ld.gold -shared -z now -o libbar.so bar.o You must link against the relevant glibc DSOs if you use symbols from glibc, otherwise the resulting shared object is undefined. If you don't do that, your DSO will not be future-proof, and you can run into issues like bug 20489 after a glibc upgrade.
(In reply to Florian Weimer from comment #1) > The bug is here: > > $ ld.gold -shared -z now -o libbar.so bar.o > > You must link against the relevant glibc DSOs if you use symbols from glibc, > otherwise the resulting shared object is undefined. Sorry, that was a poor testcase. However, linking -lc at this point doesn't make any difference. Linking -lpthread does, but the point is that libbar doesn't use any symbols from libpthread itself and has no reason to link with that library.
(In reply to Phil Blundell from comment #2) > (In reply to Florian Weimer from comment #1) > > The bug is here: > > > > $ ld.gold -shared -z now -o libbar.so bar.o > > > > You must link against the relevant glibc DSOs if you use symbols from glibc, > > otherwise the resulting shared object is undefined. > > Sorry, that was a poor testcase. However, linking -lc at this point doesn't > make any difference. And linking with gcc (without specifying -lc) also reproduces this issue. > Linking -lpthread does, but the point is that libbar > doesn't use any symbols from libpthread itself and has no reason to link > with that library. This is actually a very nice test case, thanks. The “fix” for bug 20019 is clearly bogus and needs to be reverted.
(In reply to Florian Weimer from comment #3) > > The “fix” for bug 20019 is clearly bogus and needs to be reverted. We want to relocate libc.so before perform relocation against IFUNC function defined in libc.so from other shared objects. We can either revert the fix for PR 20019 or restrict the check to libc.so.
Created attachment 9750 [details] Something like this
(In reply to H.J. Lu from comment #5) > Created attachment 9750 [details] > Something like this I take this back. Does this testcase ever work after commit ec2a88b3c659ad4f5a662ca289edae4f0dc19d88 Author: Roland McGrath <roland@hack.frob.com> Date: Fri Feb 6 10:53:27 2015 -0800 Clean up NPTL longjmp to be compat-only. I got (gdb) r Starting program: /export/home/hjl/bugs/libc/pr21041/main-dynamic warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available. Program received signal SIGSEGV, Segmentation fault. 0x0000000000000000 in ?? () (gdb) bt #0 0x0000000000000000 in ?? () #1 0x00007ffff741928e in bar () #2 0x00007ffff7bda2de in foo () from ./libfoo.so #3 0x00000000004005f9 in main () (gdb) The problem is (gdb) disass longjmp_resolve Dump of assembler code for function longjmp_resolve: 0x00007ffff79cc350 <+0>: mov 0x208c91(%rip),%rax # 0x7ffff7bd4fe8 0x00007ffff79cc357 <+7>: retq which returns the address after relocating libpthread.so with 000000217fe8 005400000006 R_X86_64_GLOB_DAT 0000000000000000 __libc_longjmp@GLIBC_PRIVATE + 0 This is the same problem which my fix for PR 20019 tries to detect.
Here is an example: [hjl@gnu-6 pr21041b]$ cat bar.c extern void xxx (void); void bar (void) { xxx (); } [hjl@gnu-6 pr21041b]$ cat foo.c extern void bar (void); void foo (void) { bar (); } [hjl@gnu-6 pr21041b]$ cat ifunc.c extern void real_xxx (void); static void * xxx_resolver () { return &real_xxx; } extern void xxx (void) __attribute__ ((ifunc ("xxx_resolver"))); [hjl@gnu-6 pr21041b]$ cat xxx.c void xxx (void) { } __typeof (xxx) real_xxx __attribute__ ((alias("xxx"))); [hjl@gnu-6 pr21041b]$ cat main.c void foo (void); int main () { foo (); return 0; } [hjl@gnu-6 pr21041b]$ make gcc -O2 -c -o main.o main.c gcc -O2 -fPIC -c -o foo.o foo.c gcc -O2 -fPIC -c -o bar.o bar.c ld.bfd -shared -z now -o libbar.so bar.o ld.bfd -shared -z now -o libfoo.so foo.o libbar.so gcc -O2 -fPIC -c -o ifunc.o ifunc.c gcc -O2 -c -o xxx.o xxx.c ld.bfd -shared -o libxxx.so xxx.o ld.bfd -shared -o libifunc.so ifunc.o libxxx.so gcc -o x main.o libfoo.so libifunc.so libxxx.so -Wl,-rpath,. gcc -o y main.o libfoo.so libxxx.so -Wl,-rpath,. ./y ./x Makefile:10: recipe for target 'all' failed make: *** [all] Segmentation fault [hjl@gnu-6 pr21041b]$
(In reply to H.J. Lu from comment #6) > I take this back. Does this testcase ever work after > > commit ec2a88b3c659ad4f5a662ca289edae4f0dc19d88 > Author: Roland McGrath <roland@hack.frob.com> > Date: Fri Feb 6 10:53:27 2015 -0800 > > Clean up NPTL longjmp to be compat-only. > I just did a superficial test with glibc 2.24 and I think you may be right that it is broken there too. In a real application the failure is less obvious prior to your patch because it only crashes when longjmp() is actually called (which may happen rarely or never) whereas with your patch installed the binary won't even start. But it looks as though it may not have worked correctly for some time.
(In reply to Phil Blundell from comment #8) > (In reply to H.J. Lu from comment #6) > > I take this back. Does this testcase ever work after > > > > commit ec2a88b3c659ad4f5a662ca289edae4f0dc19d88 > > Author: Roland McGrath <roland@hack.frob.com> > > Date: Fri Feb 6 10:53:27 2015 -0800 > > > > Clean up NPTL longjmp to be compat-only. > > > > I just did a superficial test with glibc 2.24 and I think you may be right > that it is broken there too. In a real application the failure is less > obvious prior to your patch because it only crashes when longjmp() is > actually called (which may happen rarely or never) whereas with your patch > installed the binary won't even start. But it looks as though it may not > have worked correctly for some time. My fix for PR 20019 prevents broken applications from start. It is unsafe for IFUNC resolver to reference external symbols as shown here.
Can we simply remove setjmp/longjmp from libpthread.so?
Created attachment 9762 [details] A script to check versions of IFUNC symbols in libpthread.so and libc.so We can remove the IFUNC symbols in libpthread.so if those symbols in libc.so have the same versions in libc.so and another symbol in libpthread.so has the same version so that removing the IFUNC symbols in libpthread.so won't change version dependencies for libc.so and libpthread.so. Here is the script to check them.
(In reply to H.J. Lu from comment #10) > Can we simply remove setjmp/longjmp from libpthread.so? I tried this before, but it does not work with the current dynamic linker: https://sourceware.org/ml/libc-alpha/2016-05/msg00176.html I never received a clarification if this behavior is required, or a dynamic linker bug.
(In reply to Florian Weimer from comment #12) > (In reply to H.J. Lu from comment #10) > > Can we simply remove setjmp/longjmp from libpthread.so? > > I tried this before, but it does not work with the current dynamic linker: > > https://sourceware.org/ml/libc-alpha/2016-05/msg00176.html > > I never received a clarification if this behavior is required, or a dynamic > linker bug. Do you have a testcase?
Created attachment 9776 [details] Patch to remove compatibility symbols from libpthread (In reply to H.J. Lu from comment #13) > (In reply to Florian Weimer from comment #12) > > (In reply to H.J. Lu from comment #10) > > > Can we simply remove setjmp/longjmp from libpthread.so? > > > > I tried this before, but it does not work with the current dynamic linker: > > > > https://sourceware.org/ml/libc-alpha/2016-05/msg00176.html > > > > I never received a clarification if this behavior is required, or a dynamic > > linker bug. > > Do you have a testcase? Compile glibc with the attached patch and install it. The system will no longer boot because of errors like this one: # /sbin/init /sbin/init: relocation error: /usr/lib/systemd/libsystemd-shared-232.so: symbol write, version GLIBC_2.2.5 not defined in file libpthread.so.0 with link time reference I'm not sure if this link failure is a dynamic linker bug. The symbol versioning specification is ambiguous whether the soname in the version counts for symbol resolution.
(In reply to Florian Weimer from comment #14) > Created attachment 9776 [details] > Patch to remove compatibility symbols from libpthread > > (In reply to H.J. Lu from comment #13) > > (In reply to Florian Weimer from comment #12) > > > (In reply to H.J. Lu from comment #10) > > > > Can we simply remove setjmp/longjmp from libpthread.so? > > > > > > I tried this before, but it does not work with the current dynamic linker: > > > > > > https://sourceware.org/ml/libc-alpha/2016-05/msg00176.html > > > > > > I never received a clarification if this behavior is required, or a dynamic > > > linker bug. > > > > Do you have a testcase? > > Compile glibc with the attached patch and install it. The system will no > longer boot because of errors like this one: > > # /sbin/init > /sbin/init: relocation error: /usr/lib/systemd/libsystemd-shared-232.so: > symbol write, version GLIBC_2.2.5 not defined in file libpthread.so.0 with > link time reference > > I'm not sure if this link failure is a dynamic linker bug. The symbol > versioning specification is ambiguous whether the soname in the version > counts for symbol resolution. I've always wondered about this, and the truth is I consider the binding we have today to be overspecified. I think that any write@@GLIBC_2.2.5 should be able to resolve write@GLIBC_2.2.5, not just he one that came from the library that provided the original link-time reference (encoded in .gnu.version_r). However, Ulrich's original design seems to explicitly call this out: ~~~ This has to be done by walking through all the entries of the loading object's Elfxx_Verneed array and examining whether the loaded object's Elfxx_Verdef array contains an appropriate version definition. Both, the version definition and the version requirement, are identified by strings. In both cases the structures contain beside the string (or string references) also ELF hashing values which can be compared before making the actual string comparison. ~~~ So there is an implicit binding of <symbol> + <version> to <soname> that happens. The only rationale given is: ~~~ This way it is possible to recognize libraries which are too old and don't contain all the symbols or contain incompatible implementations. Without this kind of test one could end up with runtime errors which don't provide helpful information. ~~~ It also means you can never move the versioned symbol to another shared object.
Created attachment 9782 [details] A testcase [hjl@gnu-6 pr21041d]$ make gcc -O2 -c -o main.o main.c gcc -O2 -fPIC -c -o foo.o foo.c gcc -O2 -c -o bar1.o bar1.c gcc -shared -Wl,--soname=libbar.so -o libbar1.so bar1.o -Wl,--version-script=libbar.map gcc -shared -o libfoo.so foo.o libbar1.so gcc -O2 -c -o xxx.o xxx.c gcc -shared -o libxxx.so xxx.o -Wl,--version-script=libbar.map gcc -o x main.o libfoo.so libbar1.so libxxx.so -Wl,-rpath,. gcc -O2 -c -o bar2.o bar2.c gcc -shared -Wl,--soname=libbar.so -o libbar2.so bar2.o -Wl,--version-script=libbar.map ln -sf libbar1.so libbar.so ./x ln -sf libbar2.so libbar.so ./x ./x: relocation error: ./libfoo.so: symbol bar1, version BAR not defined in file libbar.so with link time reference Makefile:12: recipe for target 'all' failed make: *** [all] Error 127 [hjl@gnu-6 pr21041d]$
But [hjl@gnu-6 pr21041d]$ ./x ./x: relocation error: ./libfoo.so: symbol bar1, version BAR not defined in file libbar.so with link time reference [hjl@gnu-6 pr21041d]$ LD_PRELOAD=./libxxx.so ./x [hjl@gnu-6 pr21041d]$
Also [hjl@gnu-6 pr21041d]$ gcc -o y main.o libfoo.so libxxx.so libbar1.so -Wl,-rpath,. [hjl@gnu-6 pr21041d]$ ./y [hjl@gnu-6 pr21041d]$ The difference is ld.so checks libxxx.so before libbar1.so.
dl-lookup.c has skip: /* If this current map is the one mentioned in the verneed entry and we have not found a weak entry, it is a bug. */ if (symidx == STN_UNDEF && version != NULL && version->filename != NULL && __glibc_unlikely (_dl_name_match_p (version->filename, map))) return -1; But there is no requirement for that the symbol definition at run-time must come from the same shared object at link-time.
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, master has been updated via fc5ad7024c620cdfe9b76e94638aac83b99c5bf8 (commit) from 852d63120783fae5bf85a067320dc4ba1ed59f11 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=fc5ad7024c620cdfe9b76e94638aac83b99c5bf8 commit fc5ad7024c620cdfe9b76e94638aac83b99c5bf8 Author: Andreas Schwab <schwab@suse.de> Date: Tue Aug 8 16:21:58 2017 +0200 Don't use IFUNC resolver for longjmp or system in libpthread (bug 21041) Unlike the vfork forwarder and like the fork forwarder as in bug 19861, there won't be a problem when the compiler does not turn this into a tail call. ----------------------------------------------------------------------- Summary of changes: ChangeLog | 6 ++++++ nptl/pt-longjmp.c | 31 ++++++++++--------------------- nptl/pt-system.c | 24 ++++++++---------------- 3 files changed, 24 insertions(+), 37 deletions(-)
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, master has been updated via 5797b410a87f6f6f6d3661d730fac320cbd5f270 (commit) via 40c06a3d0450365e9675fe26f34fc56ce8497325 (commit) from 0e02b5107e17830d19e83cb2208103f79666af31 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5797b410a87f6f6f6d3661d730fac320cbd5f270 commit 5797b410a87f6f6f6d3661d730fac320cbd5f270 Author: Andreas Schwab <schwab@suse.de> Date: Wed Aug 9 10:36:08 2017 +0200 Fix s390 version of pt-longjmp.c https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=40c06a3d0450365e9675fe26f34fc56ce8497325 commit 40c06a3d0450365e9675fe26f34fc56ce8497325 Author: Andreas Schwab <schwab@suse.de> Date: Tue Aug 8 17:44:32 2017 +0200 Add test for bug 21041 ----------------------------------------------------------------------- Summary of changes: ChangeLog | 15 +++++++++++++ nptl/Makefile | 6 +++- .../lcong48.c => nptl/tst-compat-forwarder-mod.c | 10 +++++--- .../tst-compat-forwarder.c | 23 ++++++++----------- sysdeps/unix/sysv/linux/s390/pt-longjmp.c | 4 +- 5 files changed, 37 insertions(+), 21 deletions(-) copy stdlib/lcong48.c => nptl/tst-compat-forwarder-mod.c (77%) copy math/test-fe-snans-always-signal.c => nptl/tst-compat-forwarder.c (67%)
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, release/2.26/master has been updated via 71170eba2af41e08d51cf9d7b1ded5fd4b0b5c9c (commit) via 4db8f362c13c7239311db95bd7f96d4bce0769f3 (commit) via 88758c4ad3f046d050bc2c3ae0f172b6524ca6c2 (commit) from 6850e9c6bad862a1b982f456096c54946c2aaeab (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=71170eba2af41e08d51cf9d7b1ded5fd4b0b5c9c commit 71170eba2af41e08d51cf9d7b1ded5fd4b0b5c9c Author: Andreas Schwab <schwab@suse.de> Date: Tue Aug 8 17:44:32 2017 +0200 Add test for bug 21041 (cherry picked from commit 40c06a3d0450365e9675fe26f34fc56ce8497325) https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4db8f362c13c7239311db95bd7f96d4bce0769f3 commit 4db8f362c13c7239311db95bd7f96d4bce0769f3 Author: Andreas Schwab <schwab@suse.de> Date: Wed Aug 9 10:36:08 2017 +0200 Fix s390 version of pt-longjmp.c (cherry picked from commit 5797b410a87f6f6f6d3661d730fac320cbd5f270) https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=88758c4ad3f046d050bc2c3ae0f172b6524ca6c2 commit 88758c4ad3f046d050bc2c3ae0f172b6524ca6c2 Author: Andreas Schwab <schwab@suse.de> Date: Tue Aug 8 16:21:58 2017 +0200 Don't use IFUNC resolver for longjmp or system in libpthread (bug 21041) Unlike the vfork forwarder and like the fork forwarder as in bug 19861, there won't be a problem when the compiler does not turn this into a tail call. (cherry picked from commit fc5ad7024c620cdfe9b76e94638aac83b99c5bf8) ----------------------------------------------------------------------- Summary of changes: ChangeLog | 21 +++++++++++++ nptl/Makefile | 6 ++- nptl/pt-longjmp.c | 31 ++++++------------- nptl/pt-system.c | 24 +++++---------- .../lcong48.c => nptl/tst-compat-forwarder-mod.c | 10 ++++-- .../tst-compat-forwarder.c | 23 ++++++-------- sysdeps/unix/sysv/linux/s390/pt-longjmp.c | 4 +- 7 files changed, 61 insertions(+), 58 deletions(-) copy stdlib/lcong48.c => nptl/tst-compat-forwarder-mod.c (77%) copy math/test-fe-snans-always-signal.c => nptl/tst-compat-forwarder.c (67%)
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, release/2.25/master has been updated via 0e6f64d9d04a9a26a048d1fb1bf1ebdb6739c5af (commit) via 06e775f4646f823cb858f50258bbe5a6f1e6bbe1 (commit) via 8182ccd9b8f9e6474712ddc904930e150f96da36 (commit) from 46acbd0582ce0c1b661e1b43f8e783daeea6ed9a (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=0e6f64d9d04a9a26a048d1fb1bf1ebdb6739c5af commit 0e6f64d9d04a9a26a048d1fb1bf1ebdb6739c5af Author: Andreas Schwab <schwab@suse.de> Date: Tue Aug 8 17:44:32 2017 +0200 Add test for bug 21041 (cherry picked from commit 40c06a3d0450365e9675fe26f34fc56ce8497325) https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=06e775f4646f823cb858f50258bbe5a6f1e6bbe1 commit 06e775f4646f823cb858f50258bbe5a6f1e6bbe1 Author: Andreas Schwab <schwab@suse.de> Date: Wed Aug 9 10:36:08 2017 +0200 Fix s390 version of pt-longjmp.c (cherry picked from commit 5797b410a87f6f6f6d3661d730fac320cbd5f270) https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8182ccd9b8f9e6474712ddc904930e150f96da36 commit 8182ccd9b8f9e6474712ddc904930e150f96da36 Author: Andreas Schwab <schwab@suse.de> Date: Tue Aug 8 16:21:58 2017 +0200 Don't use IFUNC resolver for longjmp or system in libpthread (bug 21041) Unlike the vfork forwarder and like the fork forwarder as in bug 19861, there won't be a problem when the compiler does not turn this into a tail call. (cherry picked from commit fc5ad7024c620cdfe9b76e94638aac83b99c5bf8) ----------------------------------------------------------------------- Summary of changes: ChangeLog | 21 +++++++++++++ nptl/Makefile | 6 ++- nptl/pt-longjmp.c | 31 ++++++------------- nptl/pt-system.c | 24 +++++---------- .../lcong48.c => nptl/tst-compat-forwarder-mod.c | 10 ++++-- .../tst-compat-forwarder.c | 23 ++++++-------- sysdeps/unix/sysv/linux/s390/pt-longjmp.c | 4 +- 7 files changed, 61 insertions(+), 58 deletions(-) copy stdlib/lcong48.c => nptl/tst-compat-forwarder-mod.c (77%) copy math/test-fe-snans-always-signal.c => nptl/tst-compat-forwarder.c (67%)
fork was turned into a regular function as part of bug 19861. The vfork fix needed a dynamic linker change and is covered in bug 20188. There are no such IFUNC resolvers left, so this bug is fixed (but there are opportunities for future cleanup).