Bug 26065

Summary: ld/testsuite/ld-elf symbolic tests dl4e and dl4f fail
Product: binutils
Reporter: Fangrui Song <i>
Component: ld
Assignee: Alan Modra <amodra>
Status: RESOLVED FIXED
Severity: normal
CC: adhemerval.zanella, drepper.fsp, fweimer, maskray
Priority: P2
Flags: fweimer: security-
Version: unspecified
Target Milestone: 2.35
Host:
Target: aarch64-*, powerpc64*-*
Build:
Last reconfirmed: 2020-06-02 00:00:00

Description Fangrui Song 2020-05-30 05:44:46 UTC
On aarch64,

% make -C ~/Dev/binutils-gdb/Debug check-ld RUNTESTFLAGS='ld-elf/shared.exp'
...
Running /home/maskray/Dev/binutils-gdb/ld/testsuite/ld-elf/shared.exp ...
FAIL: Run with libdl4e.so
FAIL: Run with libdl4f.so


% cat dl4.c 
#include <stdio.h>

int foo1;
int foo2;

extern void xxx1 (void);
extern void xxx2 (void);

void
bar (int x)
{
  if (foo1 == 1)
    printf ("bar OK1\n");
  else if (foo1 == 0)
    printf ("bar OK2\n");
  if (foo2 == 1)
    printf ("bar OK3\n");
  else if (foo2 == 0)
    printf ("bar OK4\n");
  foo1 = -1;
  foo2 = -1;
  xxx1 ();
  xxx2 ();
}

% cat dl4xxx.c
#include <stdio.h>

void
xxx1 (void)
{
  printf ("DSO1\n");
}

void
xxx2 (void)
{
  printf ("DSO2\n");
}


% cd /home/maskray/Dev/binutils-gdb/Debug/ld/tmpdir/ld

tmpdir/libdl4e.so is linked with:

gcc -B/home/maskray/Dev/binutils-gdb/Debug/ld/tmpdir/ld/ -L/usr/local/aarch64-unknown-linux-gnu/lib64 -L/usr/local/lib64 -L/lib64 -L/usr/lib64 -L/usr/local/aarch64-unknown-linux-gnu/lib -L/usr/local/lib -L/lib -L/usr/lib -o tmpdir/libdl4e.so -L/home/maskray/Dev/binutils-gdb/ld/testsuite/ld-elf -shared -Wl,-Bsymbolic-functions,--dynamic-list-cpp-new tmpdir/dl4.o tmpdir/dl4xxx.o

The .dynsym emitted by ld is correct, and there is no R_AARCH64_GLOB_DAT for foo1.
However, at run time foo1 (in .bss) is somehow changed to 1. I don't understand why this happens:
no dynamic relocation in libdl4e.so refers to foo1.

In ld/testsuite/ld-elf/dl4.c, if I change `int foo1;` to `int foo1 = -1;`, foo1 will reside in `.data`. It is still somehow relocated.

% readelf -Wr tmpdir/libdl4e.so

Relocation section '.rela.dyn' at offset 0x428 contains 9 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000010de0  0000000000000403 R_AARCH64_RELATIVE                        680
0000000000010de8  0000000000000403 R_AARCH64_RELATIVE                        638
0000000000010fc8  0000000000000403 R_AARCH64_RELATIVE                        11024
0000000000010fd8  0000000000000403 R_AARCH64_RELATIVE                        11028
0000000000011018  0000000000000403 R_AARCH64_RELATIVE                        11018
0000000000010fb8  0000000300000401 R_AARCH64_GLOB_DAT     0000000000000000 _ITM_deregisterTMCloneTable + 0
0000000000010fc0  0000000400000401 R_AARCH64_GLOB_DAT     0000000000000000 __cxa_finalize@GLIBC_2.17 + 0
0000000000010fd0  0000000500000401 R_AARCH64_GLOB_DAT     0000000000000000 __gmon_start__ + 0
0000000000010fe0  0000000700000401 R_AARCH64_GLOB_DAT     0000000000000000 _ITM_registerTMCloneTable + 0


GNU ld -Bsymbolic may be wrong as well. If I change -Wl,-Bsymbolic-functions,--dynamic-list-cpp-new to -Bsymbolic, there will be a GLOB_DAT for foo1.
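
The check described above (no dynamic relocation against foo1) can be scripted. What follows is only a minimal sketch and is not part of the binutils testsuite: `relocs_for_symbol` is a hypothetical helper that reads `readelf -Wr` output on standard input and prints the relocation type of each entry that names the given symbol. It assumes the column layout of the listing above, where the relocation type is the third field and the symbol name is the fifth.

```shell
# Hypothetical helper (not from binutils): print the relocation type of each
# dynamic relocation that refers to the given symbol.
# Input: `readelf -Wr` output on stdin. Assumes the type is field 3 and the
# symbol name is field 5, optionally followed by a version such as @GLIBC_2.17.
relocs_for_symbol() {
  awk -v sym="$1" '
    $3 ~ /^R_/ && ($5 == sym || index($5, sym "@") == 1) { print $3 }
  '
}
```

For example, `readelf -Wr tmpdir/libdl4e.so | relocs_for_symbol foo1` should print nothing for the listing above, while `relocs_for_symbol __gmon_start__` should print R_AARCH64_GLOB_DAT.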
Comment 1 Florian Weimer 2020-06-02 10:40:04 UTC
Please attach the required input files to this bug, with instructions how to build them. We do not have access to your binutils source tree.
Comment 2 Fangrui Song 2020-06-03 05:49:55 UTC
> Please attach the required input files to this bug, with instructions how to build them. We do not have access to your binutils source tree.

binutils-gdb master

You can reproduce with `make check-ld RUNTESTFLAGS='ld-elf/shared.exp'` at any commit after bb68f22c8e648032a0d1c1d17353eec599ff5e6a (2020-05-24). Just reproduced on an aarch64 machine at bb7322c67111024f5977deb85abd777ec713b1a9 (2020-06-02)

The tests work on an x86-64 machine.
Comment 3 Florian Weimer 2020-06-03 07:07:50 UTC
I cannot reproduce this on aarch64:

                === ld Summary ===

# of expected passes            245
# of expected failures          2
/root/binutils-gdb/build/ld/ld-new 2.34.50.20200603

The two XFAILs are:

ld/ld.sum:XFAIL: pr22374 function pointer initialization
ld/ld.sum:XFAIL: Run pr19719 fun undefined
ld/ld.log:XFAIL: pr22374 function pointer initialization
ld/ld.log:XFAIL: Run pr19719 fun undefined
Comment 4 Adhemerval Zanella 2020-06-05 13:58:37 UTC
I can't reproduce it either when running the binutils testcase on glibc 2.23 (ubuntu 18, binutils commit 82f06518c463badebdab653a7af4e4427c786742).

I also extracted the testcase and built it to run against glibc master (9b7424215b10ae01d680ef91e10fc10f51227177) as follows:

---
$ cat dl4.c
#include <stdio.h>

int foo1;
int foo2;

extern void xxx1 (void);
extern void xxx2 (void);

void
bar (int x)
{
  if (foo1 == 1)
    printf ("bar OK1\n");
  else if (foo1 == 0)
    printf ("bar OK2\n");
  if (foo2 == 1)
    printf ("bar OK3\n");
  else if (foo2 == 0)
    printf ("bar OK4\n");
  foo1 = -1;
  foo2 = -1;
  xxx1 ();
  xxx2 ();
}
$ cat dl4xxx.c
#include <stdio.h>

void
xxx1 (void)
{
  printf ("DSO1\n");
}

void
xxx2 (void)
{
  printf ("DSO2\n");
}
$ cat dlmain4.c
#include <stdio.h>

extern int foo1;
extern int foo2;
extern void bar (void);

void
xxx1 (void)
{
  printf ("MAIN1\n");
}

void
xxx2 (void)
{
  printf ("MAIN2\n");
}

int
main (void)
{
  foo1 = 1;
  foo2 = 1;
  bar ();
  if (foo1 == -1)
    printf ("OK1\n");
  else if (foo1 == 1)
    printf ("OK2\n");
  if (foo2 == -1)
    printf ("OK3\n");
  else if (foo2 == 1)
    printf ("OK4\n");
  return 0;
}
$ gcc -B/home/azanella/projects/binutils/install/bin -fPIC -shared -Wl,-Bsymbolic-functions,--dynamic-list-cpp-new dl4.c dl4xxx.c -o libdl4e.so
$ gcc -B/home/azanella/projects/binutils/install/bin -L. -Wl,--no-as-needed libdl4e.so dlmain4.c -o dlmain4
---

glibc master did not show any issue either. Do you have more information on how to set up the environment to reproduce this?
Comment 5 Alan Modra 2020-06-09 06:56:51 UTC
After binutils commit bb68f22c8e the tests fail on powerpc64le-linux too.  As far as I can tell, the testcase doesn't really have a correct answer.  If linking against a -Bsymbolic or --dynamic-list library results in two copies of foo1 and foo2, one in the main program and one in the shared library, then you get the "OK2, OK4" results.  If linking against such a library results in only one copy of foo1 and foo2 (just in the library), then you get the "OK1, OK3" results.  What you get depends on the default compiler behaviour for your target.  On powerpc64le-linux we always generate PIC, and that allows dl4main.c to use the shared library's foo1 and foo2.

Even on x86_64:
$ gcc -o tmpdir/dl4main.o -fno-PIE -O2 -c dl4main.c
$ gcc -o tmpdir/dl4e -no-pie -B tmpdir/ld/ tmpdir/dl4main.o tmpdir/libdl4e.so
$ tmpdir/dl4e
bar OK2
bar OK4
DSO1
DSO2
OK2
OK4
$ gcc -o tmpdir/dl4main.o -fPIE -O2 -c dl4main.c
$ gcc -o tmpdir/dl4e -pie -B tmpdir/ld/ tmpdir/dl4main.o tmpdir/libdl4e.so
$ tmpdir/dl4e
bar OK2
bar OK4
DSO1
DSO2
OK2
OK4
$ gcc -o tmpdir/dl4main.o -fPIC -O2 -c dl4main.c
$ gcc -o tmpdir/dl4e -pie -B tmpdir/ld/ tmpdir/dl4main.o tmpdir/libdl4e.so
$ tmpdir/dl4e
bar OK1
bar OK3
DSO1
DSO2
OK1
OK3

I'm leaning towards compiling dl4main.c -fPIC to resolve this bug.
Comment 6 Alan Modra 2020-06-09 07:04:41 UTC
And of course, "two copies of foo1 and foo2" come about due to that horrible old hack of duplicating shared library variables in the executable .dynbss, with initialization via copy relocs.
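
Whether a given executable ended up with such copies can be checked from its dynamic relocations. The snippet below is a hedged sketch, not something taken from the binutils testsuite: `has_copy_reloc` is a hypothetical helper that reads `readelf -Wr` output on standard input and succeeds if any relocation type ends in `_COPY` (for example R_X86_64_COPY or R_AARCH64_COPY).

```shell
# Hypothetical helper: succeed if `readelf -Wr` output on stdin lists at least
# one copy relocation, i.e. a relocation type of the form R_<ARCH>_COPY.
has_copy_reloc() {
  grep -Eq '(^|[[:space:]])R_[A-Z0-9_]+_COPY([[:space:]]|$)'
}
```

A possible use is `readelf -Wr tmpdir/dl4e | has_copy_reloc && echo "copy relocations present"`, run against each of the dl4e builds from the previous comment.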
Comment 7 Sourceware Commits 2020-06-09 08:05:12 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a61e306070a182315216e72c8ba53efa6a247814

commit a61e306070a182315216e72c8ba53efa6a247814
Author: Alan Modra <amodra@gmail.com>
Date:   Tue Jun 9 17:02:12 2020 +0930

    PR26065, ld/testsuite/ld-elf symbolic tests dl4e and dl4f fail
    
            PR 26065
            * testsuite/ld-elf/shared.exp: Compile dl4main.c -fPIC.
            (dl4e, dl4f): Expect dl4a.out.
            * testsuite/ld-elf/dl4e.out: Delete.
Comment 8 Alan Modra 2020-06-09 08:34:10 UTC
Should now be fixed
Comment 9 Fangrui Song 2020-06-09 16:23:32 UTC
Thanks. The tests failed because the result depends on whether GCC is configured with --enable-default-pie and on whether the arch uses the GOT for -fpie, which in turn decides whether a copy relocation is produced.

aarch64 -fno-pic: bar OK2
aarch64 -fpie or -fpic: bar OK1
x86 -fno-pic or -fpie: bar OK2
x86 -fpic: bar OK1
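
The four rows above can be written down as a small lookup, which may be handy when comparing a local run against them. This is only a sketch: `expected_first_line` is a hypothetical function, the arch and mode spellings are copied from the rows above, and any combination not listed there is reported as unknown rather than guessed.

```shell
# Hypothetical lookup encoding the four rows above.
# Usage: expected_first_line <arch> <mode>, e.g. expected_first_line x86 -fpie
expected_first_line() {
  case "$1:$2" in
    aarch64:-fno-pic)            echo "bar OK2" ;;
    aarch64:-fpie|aarch64:-fpic) echo "bar OK1" ;;
    x86:-fno-pic|x86:-fpie)      echo "bar OK2" ;;
    x86:-fpic)                   echo "bar OK1" ;;
    *)                           echo "unknown"; return 1 ;;
  esac
}
```

"bar OK2" corresponds to the case where the executable has its own copy of foo1 (a copy relocation was produced), and "bar OK1" to the case where only the library's foo1 exists.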