Bug 16467 - ld fails with internal error in elf_x86_64_relocate_section when .symver and ifunc magic is used
Summary: ld fails with internal error in elf_x86_64_relocate_section when .symver and ...
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: 2.24
: P2 normal
Target Milestone: 2.24
Assignee: Alan Modra
URL:
Keywords:
: 16477 16627 (view as bug list)
Depends on:
Blocks:
 
Reported: 2014-01-19 03:38 UTC by Zbigniew Jędrzejewski-Szmek
Modified: 2016-06-10 11:25 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
A patch (517 bytes, patch)
2014-01-19 18:39 UTC, H.J. Lu
Details | Diff
A glibc patch (336 bytes, patch)
2014-01-19 19:17 UTC, H.J. Lu
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Zbigniew Jędrzejewski-Szmek 2014-01-19 03:38:31 UTC
We are merging two libraries into one. I was trying to create a compatibility library to allow existing programs compiled for the old name to continue to work without recompilation (see [1] for background story).

A short reproducer:
--------------------------------------------------
$ tail liba.c libb.c liba.sym
==> liba.c <==
int sd_get_seats(void) {return 0;}

==> libb.c <==
void new_sd_get_seats(void);
__asm__(".symver new_sd_get_seats,sd_get_seats@LIBSYSTEMD_209");
static void (*resolve_sd_get_seats(void)) (void) {
        return new_sd_get_seats;
}
void sd_get_seats(void) __attribute__((ifunc("resolve_sd_get_seats")));

==> liba.sym <==
LIBSYSTEMD_209 {
global:
        sd_get_seats;
};

$ gcc -shared -o liba.so liba.c -Wl,--version-script=liba.sym
$ gcc -shared -fPIC libb.c -o libb.so -L. -la
/usr/bin/ld: BFD (GNU Binutils for Debian) 2.24 internal error, aborting at ../../bfd/elf64-x86-64.c line 3363 in elf_x86_64_relocate_section

/usr/bin/ld: Please report this bug.

collect2: error: ld returned 1 exit status
--------------------------------------------------

Running the same with -Wl,-fuse-ld=gold works fine.

[1] https://www.mail-archive.com/systemd-devel@lists.freedesktop.org/msg16014.html
Comment 1 H.J. Lu 2014-01-19 15:27:54 UTC
The problem is we are merging IFUNC symbol, sd_get_seats, with
the undefined versioned reference, sd_get_seats@LIBSYSTEMD_2.
Comment 2 H.J. Lu 2014-01-19 16:34:18 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #0)
> 
> Running the same with -Wl,-fuse-ld=gold works fine.
> 

Can you double check if gold generates the working output?
I saw

[hjl@gnu-tools-1 pr16467]$ readelf -rW libb.o

Relocation section '.rela.text' at offset 0x590 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000007  0000000900000009 R_X86_64_GOTPCREL      0000000000000000 sd_get_seats@LIBSYSTEMD_209 - 4

Relocation section '.rela.eh_frame' at offset 0x5a8 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000020  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 0
[hjl@gnu-tools-1 pr16467]$ readelf -rW libgold.so

Relocation section '.rela.dyn' at offset 0x228 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000001420  0000000100000006 R_X86_64_GLOB_DAT      sd_get_seats()   sd_get_seats + 0

Relocation section '.rela.plt' at offset 0x240 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000001440  0000000100000007 R_X86_64_JUMP_SLOT     sd_get_seats()   sd_get_seats + 0
[hjl@gnu-tools-1 pr16467]$ 

Where is run-time relocation against sd_get_seats@LIBSYSTEM_209?
Comment 3 H.J. Lu 2014-01-19 18:39:13 UTC
Created attachment 7363 [details]
A patch

Please give this patch a try.
Comment 4 H.J. Lu 2014-01-19 19:17:14 UTC
Created attachment 7364 [details]
A glibc patch

You may also need this glibc patch to avoid matching IFUNC sd_get_seats
with non-IFUNC sd_get_seats@LIBSYSTEMD_209.
Comment 5 H.J. Lu 2014-01-19 19:24:59 UTC
(In reply to H.J. Lu from comment #4)
> Created attachment 7364 [details]
> A glibc patch
> 
> You may also need this glibc patch to avoid matching IFUNC sd_get_seats
> with non-IFUNC sd_get_seats@LIBSYSTEMD_209.

It doesn't work.
Comment 6 Zbigniew Jędrzejewski-Szmek 2014-01-19 23:15:19 UTC
Thanks for the incredibly fast response.

I had a bit of trouble compiling binutils from source (ld would complain about not being compiled to use in a sysroot). In the end I applied your patch (the first one) on top of Fedora's binutils-2.24-9.fc21. Linking still fails in exactly the same way as before.
Comment 7 Zbigniew Jędrzejewski-Szmek 2014-01-19 23:23:18 UTC
> Can you double check if gold generates the working output?
It seemed to be working for me before, in the sense that at least linking worked without any error report, and the binary linking the symbols in the equivalent of libb in the example above was running successfully. I didn't actually test whether the functions can be correctly called. Updated recipe that I paste below does that. I use versioned symbols for libb.so too. Without that, resulting binary segfaults when it is run.
Also, I had to added -fPIC to the compilation of liba.so. Previously I tested with both gcc-4.7 and gcc-4.8, and the old gcc did not complain about missing -fPIC, the newer one does. The guess the newer one is right.

--------------------------------------------------
==> liba.c <==
const char* sd_get_seats(void) {return "bla bla";}

==> liba.sym <==
LIBSYSTEMD_209 {
global:
        sd_get_seats;
};

==> libb.c <==
void new_sd_get_seats(void);
__asm__(".symver new_sd_get_seats,sd_get_seats@LIBSYSTEMD_209");
static void (*resolve_sd_get_seats(void)) (void) {
        return new_sd_get_seats;
}
void sd_get_seats(void) __attribute__((ifunc("resolve_sd_get_seats")));

==> libb.sym <==
LIBSYSTEMD_208 {
global:
        sd_get_seats;
};

==> test1.c <==
#include <stdio.h>
const char* sd_get_seats(void);

int main(int argc, char **argv) {
        printf("%s\n", sd_get_seats());
        return 0;
}
$ gcc -shared -o liba.so liba.c -Wl,--version-script=liba.sym -Wl,-fuse-ld=gold -fPIC        
$ gcc -shared -fPIC libb.c -o libb.so -L. -la -Wl,-fuse-ld=gold -Wl,--version-script=libb.sym
$ gcc -L. -lb -Wl,-fuse-ld=gold test1.c -o test1                                             
$ LD_LIBRARY_PATH=. ./test1
bla bla

$ LD_LIBRARY_PATH=. ldd ./test1
        linux-vdso.so.1 (0x00007fff81963000)
        libb.so => ./libb.so (0x00007faf96c52000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf96872000)
        liba.so => ./liba.so (0x00007faf9686f000)
        /lib64/ld-linux-x86-64.so.2 (0x00007faf96c56000)
Comment 8 H.J. Lu 2014-01-20 00:46:06 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #7)
> > Can you double check if gold generates the working output?
> It seemed to be working for me before, in the sense that at least linking
> worked without any error report, and the binary linking the symbols in the
> equivalent of libb in the example above was running successfully. I didn't
> actually test whether the functions can be correctly called. Updated recipe
> that I paste below does that. I use versioned symbols for libb.so too.
> Without that, resulting binary segfaults when it is run.
> Also, I had to added -fPIC to the compilation of liba.so. Previously I
> tested with both gcc-4.7 and gcc-4.8, and the old gcc did not complain about
> missing -fPIC, the newer one does. The guess the newer one is right.
> 
> --------------------------------------------------
> ==> liba.c <==
> const char* sd_get_seats(void) {return "bla bla";}
> 
> ==> liba.sym <==
> LIBSYSTEMD_209 {
> global:
>         sd_get_seats;
> };
> 
> ==> libb.c <==
> void new_sd_get_seats(void);
> __asm__(".symver new_sd_get_seats,sd_get_seats@LIBSYSTEMD_209");
> static void (*resolve_sd_get_seats(void)) (void) {
>         return new_sd_get_seats;
> }
> void sd_get_seats(void) __attribute__((ifunc("resolve_sd_get_seats")));
> 
> ==> libb.sym <==
> LIBSYSTEMD_208 {
> global:
>         sd_get_seats;
> };
> 
> ==> test1.c <==
> #include <stdio.h>
> const char* sd_get_seats(void);
> 
> int main(int argc, char **argv) {
>         printf("%s\n", sd_get_seats());
>         return 0;
> }
> $ gcc -shared -o liba.so liba.c -Wl,--version-script=liba.sym
> -Wl,-fuse-ld=gold -fPIC        
> $ gcc -shared -fPIC libb.c -o libb.so -L. -la -Wl,-fuse-ld=gold
> -Wl,--version-script=libb.sym
> $ gcc -L. -lb -Wl,-fuse-ld=gold test1.c -o test1                            
> 
> $ LD_LIBRARY_PATH=. ./test1
> bla bla
> 
> $ LD_LIBRARY_PATH=. ldd ./test1
>         linux-vdso.so.1 (0x00007fff81963000)
>         libb.so => ./libb.so (0x00007faf96c52000)
>         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf96872000)
>         liba.so => ./liba.so (0x00007faf9686f000)
>         /lib64/ld-linux-x86-64.so.2 (0x00007faf96c56000)

This is a different testcase since there is an additional
-Wl,--version-script=libb.sym.  My patch works correctly.
Please make sure that you are using the new linker with
my patch against the current binutils master branch:

[hjl@gnu-tools-1 pr16467]$ make
gcc -B./ -fPIC -g   -c -o test1.o test1.c
gcc -B./ -fPIC -g   -c -o libb.o libb.c
gcc -B./ -fPIC -g   -c -o liba.o liba.c
gcc -B./  -shared -o liba.so liba.o -Wl,--version-script=liba.sym
gcc -B./  -shared -o libb.so libb.o liba.so -Wl,--version-script=libb.sym
gcc -B./  -o test1 test1.o libb.so liba.so -Wl,-rpath,. #-Wl,-z,now
./test1
bla bla
[hjl@gnu-tools-1 pr16467]$ gcc -B./  -o test1 test1.o libb.so liba.so -Wl,-rpath,. -Wl,-v
collect2 version 4.8.2 20140115 (Red Hat 4.8.2-11)
./ld --build-id --no-add-needed --eh-frame-hdr --hash-style=gnu -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o test1 /usr/lib/gcc/x86_64-redhat-linux/4.8.2/../../../../lib64/crt1.o /usr/lib/gcc/x86_64-redhat-linux/4.8.2/../../../../lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/4.8.2/crtbegin.o -L. -L/usr/lib/gcc/x86_64-redhat-linux/4.8.2 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.2/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.2/../../.. test1.o libb.so liba.so -rpath . -v -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-redhat-linux/4.8.2/crtend.o /usr/lib/gcc/x86_64-redhat-linux/4.8.2/../../../../lib64/crtn.o
GNU ld (GNU Binutils) 2.24.51.20140118
[hjl@gnu-tools-1 pr16467]$
Comment 9 H.J. Lu 2014-01-20 01:35:22 UTC
A patch is posted at

https://sourceware.org/ml/binutils/2014-01/msg00209.html
Comment 10 Zbigniew Jędrzejewski-Szmek 2014-01-20 03:53:48 UTC
I tried again, without success. I still get an internal error with just the ld patch on top of binutils git. Is the glibc patch still necessary?
Comment 11 H.J. Lu 2014-01-20 12:46:42 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #10)
> I tried again, without success. I still get an internal error with just the
> ld patch on top of binutils git. Is the glibc patch still necessary?

Please do

1. Apply my patch.
2. Build binutils.
3. Add -v to the gcc command used to build libb.so and copy all
command line options passed to collect2.
4. Run the new linker by hand:

...../ld/ld-new "paste command line options passed to collect2"
Comment 12 Zbigniew Jędrzejewski-Szmek 2014-01-21 19:14:38 UTC
So, once again (with patched ld installed as /usr/bin/ld):

(gold, a versioned, b unversioned)
$ gcc -shared -o liba.so liba.c -fPIC -Wl,--version-script=liba.sym -Wl,-fuse-ld=gold
$ gcc -shared -o libb.so libb.c -fPIC -L. -la -Wl,-fuse-ld=gold
$ LD_LIBRARY_PATH=. gcc -L. -lb test1.c -o test1 -Wl,-fuse-ld=gold
$ LD_LIBRARY_PATH=. ./test1
Segmentation fault (core dumped)

(gold, a versioned, b versioned)
$ gcc -shared -o liba.so liba.c -fPIC -Wl,--version-script=liba.sym -Wl,-fuse-ld=gold
$ gcc -shared -o libb.so libb.c -fPIC -L. -la -Wl,--version-script=libb.sym -Wl,-fuse-ld=gold
$ LD_LIBRARY_PATH=. gcc -L. -lb test1.c -o test1 -Wl,-fuse-ld=gold
$ LD_LIBRARY_PATH=. ./test1
bla bla

(bfd, a versioned, b versioned)
$ gcc -shared -o liba.so liba.c -fPIC -Wl,--version-script=liba.sym -Wl,-fuse-ld=bfd
$ gcc -shared -o libb.so libb.c -fPIC -L. -la -Wl,--version-script=libb.sym -Wl,-fuse-ld=bfd
$ LD_LIBRARY_PATH=. gcc -L. -lb test1.c -o test1 -Wl,-fuse-ld=bfd
$ LD_LIBRARY_PATH=. ./test1
Segmentation fault (core dumped)

(bfd, a versioned, b versioned)
$ gcc -shared -o liba.so liba.c -fPIC -Wl,--version-script=liba.sym -Wl,-fuse-ld=bfd
$ gcc -shared -o libb.so libb.c -fPIC -L. -la -Wl,--version-script=libb.sym -Wl,-fuse-ld=bfd
$ LD_LIBRARY_PATH=. gcc -L. -lb test1.c -o test1 -Wl,-fuse-ld=bfd
$ LD_LIBRARY_PATH=. ./test1
bla bla

So the unversioned libb fails to produce a working binary with either linker.
With your patch, versioned libb works with both linkers.


I also have more complicated test-case: systemd-logind.so in the systemd project, from which
this testcase was derived. It is versioned and works with your patch. Thanks!
Comment 13 H.J. Lu 2014-01-21 23:53:31 UTC
Fixed on master by:

commit 4584ec12076e088cf36965b88ef8710ca85491f9
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Jan 21 15:42:43 2014 -0800

    Check incompatible existing default symbol definition
    
    After resolving a versioned reference, foo@VER1, to a default versioned
    definition, foo@@VER1, from a shared object, we also merge it with
    the existing regular default symbol definition, foo.  When foo is IFUNC
    and foo@@VER1 aren't, we will merge 2 incompatible definitions.  This
    patch avoids merging foo@@VER1 definition with foo definition if
    one is IFUNC and the other isn't.
Comment 14 H.J. Lu 2014-01-22 01:32:57 UTC
*** Bug 16477 has been marked as a duplicate of this bug. ***
Comment 15 H.J. Lu 2014-02-24 17:27:40 UTC
*** Bug 16627 has been marked as a duplicate of this bug. ***
Comment 16 Sourceware Commits 2016-06-02 03:17:33 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=5b677558bc6c7b2477bb33c709e6017e68e7ae8c

commit 5b677558bc6c7b2477bb33c709e6017e68e7ae8c
Author: Alan Modra <amodra@gmail.com>
Date:   Mon May 30 09:43:44 2016 +0930

    Revert PR16467 change
    
    This reverts the pr16467 change, which was incorrect due to faulty
    analysis of the pr16467 testcase.  The failure was not due to a
    mismatch in symbol type (ifunc/non-ifunc) but due to a symbol loop
    being set up.
    
    See https://sourceware.org/ml/binutils/2016-06/msg00013.html for some
    rambling on versioned symbols and ELF shared library symbol overriding
    that explain this patch.
    
    bfd/
    	PR ld/20159
    	PR ld/16467
    	* elflink.c (_bfd_elf_merge_symbol): Revert PR16467 change.
    	(_bfd_elf_add_default_symbol): Don't indirect to/from defined
    	symbol given a version by a script different to the version
    	of the symbol being added.
    	(elf_link_add_object_symbols): Use _bfd_elf_strtab_save and
    	_bfd_elf_strtab_restore.  Don't fudge dynstr references.
    	* elf-strtab.c (_bfd_elf_strtab_restore_size): Delete.
    	(struct strtab_save): New.
    	(_bfd_elf_strtab_save, _bfd_elf_strtab_restore): New functions.
    	* elf-bfd.h (_bfd_elf_strtab_restore_size): Delete.
    	(_bfd_elf_strtab_save, _bfd_elf_strtab_restore): Declare.
Comment 17 Sourceware Commits 2016-06-09 10:23:42 UTC
The binutils-2_26-branch branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=bdf48af13cdaccdcbcc0aca9c41ae376038508fd

commit bdf48af13cdaccdcbcc0aca9c41ae376038508fd
Author: Alan Modra <amodra@gmail.com>
Date:   Mon May 30 09:43:44 2016 +0930

    Revert PR16467 change
    
    This reverts the pr16467 change, which was incorrect due to faulty
    analysis of the pr16467 testcase.  The failure was not due to a
    mismatch in symbol type (ifunc/non-ifunc) but due to a symbol loop
    being set up.
    
    See https://sourceware.org/ml/binutils/2016-06/msg00013.html for some
    rambling on versioned symbols and ELF shared library symbol overriding
    that explain this patch.
    
    bfd/
    	PR ld/20159
    	PR ld/16467
    	* elflink.c (_bfd_elf_merge_symbol): Revert PR16467 change.
    	(_bfd_elf_add_default_symbol): Don't indirect to/from defined
    	symbol given a version by a script different to the version
    	of the symbol being added.
    	(elf_link_add_object_symbols): Use _bfd_elf_strtab_save and
    	_bfd_elf_strtab_restore.  Don't fudge dynstr references.
    	* elf-strtab.c (_bfd_elf_strtab_restore_size): Delete.
    	(struct strtab_save): New.
    	(_bfd_elf_strtab_save, _bfd_elf_strtab_restore): New functions.
    	* elf-bfd.h (_bfd_elf_strtab_restore_size): Delete.
    	(_bfd_elf_strtab_save, _bfd_elf_strtab_restore): Declare.
Comment 18 Alan Modra 2016-06-10 11:25:41 UTC
Fixed, again