Bug 26815

Summary: Unnecessary error on linking STV_PROTECTED library: relocation R_X86_64_PC32 against protected symbol `f' can not be used when making a shared object
Product: binutils Reporter: Thiago Macieira <thiago>
Component: ldAssignee: Not yet assigned to anyone <unassigned>
Status: WAITING ---    
Severity: normal CC: fabian, fweimer, hjl.tools, mliska
Priority: P2    
Version: 2.35   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed: 2020-10-30 00:00:00

Description Thiago Macieira 2020-10-29 20:28:32 UTC
Using ld.bfd:

$ cat test.c
void f() {} 
void *g() { return &f; }
$ gcc -fPIC -o /tmp/a.so -shared -fvisibility=protected test.c
/usr/bin/ld: /tmp/ccbS1isF.o: relocation R_X86_64_PC32 against protected symbol `f' can not be used when making a shared object
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
$ ld --version
GNU ld (GNU Binutils) 2.35.1.20190203
Copyright (C) 2020 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.

But it works with gold:
$ gcc -fuse-ld=gold -fPIC -o /tmp/a.so -shared -fvisibility=protected test.c
[no error]

It similarly works with ld.bfd if you pass a dummy dynamic list:

$ cat /tmp/empty.dynlist 
{
    this_symbol_doesnt_exist;
};
$ gcc -Wl,--dynamic-list,/tmp/empty.dynlist -fPIC -o /tmp/a.so -shared -fvisibility=protected test.c
[no error]

The library looks fine otherwise (using -nostdlib for output brevity):
$ gcc -nostdlib -Wl,--dynamic-list,/tmp/empty.dynlist -fPIC -o /tmp/a.so -shared -fvisibility=protected test.c
$ readelf -Ds /tmp/a.so

Symbol table for image contains 3 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000001001     8 FUNC    GLOBAL PROTECTED    6 g
     2: 0000000000001000     1 FUNC    GLOBAL PROTECTED    6 f
$ readelf -r /tmp/a.so

There are no relocations in this file.
$ objdump -d /tmp/a.so

/tmp/a.so:     file format elf64-x86-64

Disassembly of section .text:

0000000000001000 <f>:
    1000:       c3                      retq   

0000000000001001 <g>:
    1001:       48 8d 05 f8 ff ff ff    lea    -0x8(%rip),%rax        # 1000 <f>
    1008:       c3                      retq
Comment 1 H.J. Lu 2020-10-30 00:48:26 UTC
Protected function pointer should be treated as normal symbol:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19520

You can override it with linker options:

  -Bsymbolic                  Bind global references locally
  -Bsymbolic-functions        Bind global function references locally
Comment 2 Thiago Macieira 2020-10-30 01:05:29 UTC
(In reply to H.J. Lu from comment #1)
> You can override it with linker options:
> 
>   -Bsymbolic                  Bind global references locally
>   -Bsymbolic-functions        Bind global function references locally

Those options shouldn't be needed. The protected visibility already implies them.

Either way, the fact that gold and ld.bfd behave different is a problem. And so is the fact that passing an empty --dynamic-list file causes it to work.
Comment 3 H.J. Lu 2020-10-30 01:22:10 UTC
(In reply to Thiago Macieira from comment #2)
> (In reply to H.J. Lu from comment #1)
> > You can override it with linker options:
> > 
> >   -Bsymbolic                  Bind global references locally
> >   -Bsymbolic-functions        Bind global function references locally
> 
> Those options shouldn't be needed. The protected visibility already implies
> them.
>

GOT relocation is needed for function pointer comparison.

> Either way, the fact that gold and ld.bfd behave different is a problem. And
> so is the fact that passing an empty --dynamic-list file causes it to work.

An empty --dynamic-list file is the same as -Bsymbolic.
Comment 4 Thiago Macieira 2020-10-30 03:18:00 UTC
(In reply to H.J. Lu from comment #3)
> GOT relocation is needed for function pointer comparison.
> 
> > Either way, the fact that gold and ld.bfd behave different is a problem. And
> > so is the fact that passing an empty --dynamic-list file causes it to work.
> 
> An empty --dynamic-list file is the same as -Bsymbolic.

I see.

Then this is an incompatibility between compiler and linker: the compiler sees a protected symbol and emits a relocation assuming it doesn't need indirect access via the GOT to do a pointer comparison, but ld.bfd says it should.

Anyway, this is what I am asking to change: when a symbol is protected, the compiler optimisation should be allowed and pointer comparison still works provided ALL users use the GOT too. That includes the executable.
Comment 5 H.J. Lu 2020-10-30 03:40:27 UTC
(In reply to Thiago Macieira from comment #4)
> (In reply to H.J. Lu from comment #3)
> > GOT relocation is needed for function pointer comparison.
> > 
> > > Either way, the fact that gold and ld.bfd behave different is a problem. And
> > > so is the fact that passing an empty --dynamic-list file causes it to work.
> > 
> > An empty --dynamic-list file is the same as -Bsymbolic.
> 
> I see.
> 
> Then this is an incompatibility between compiler and linker: the compiler
> sees a protected symbol and emits a relocation assuming it doesn't need
> indirect access via the GOT to do a pointer comparison, but ld.bfd says it
> should.
> 
> Anyway, this is what I am asking to change: when a symbol is protected, the
> compiler optimisation should be allowed and pointer comparison still works
> provided ALL users use the GOT too. That includes the executable.

This is an ABI change.  We have 2 choices

1. Extend GNU_PROPERTY_NO_COPY_ON_PROTECTED to cover protected function
pointers.  We can rename it to GNU_PROPERTY_LOCAL_PROTECTED,  Or
2. Add a new property for protected function pointers.

Please raise the issue at

https://gitlab.com/x86-psABIs/Linux-ABI

This will require support in GCC, linker and ld.so.
Comment 6 Thiago Macieira 2020-10-30 03:43:16 UTC
A bit more testing shows that we'd need a compiler update too, to tell the compiler that a variable should be accessed via the GOT regardless (equiv. to Windows __declspec(dllimport)). We can accomplish that by always compiling everything with -fPIC, though.

lib.c:
long variable = 0;
void *addr()
{
    ++variable;
    return &variable;
}

main.c:
#include <stdio.h>
void *addr(void);

extern long variable;
__auto_type ptr2 = &variable;
int main()
{
    variable++;
    printf("addr() = %p, &variable = %p\n", addr(), ptr2);
}

$ gcc -O2 -shared -fPIC  -fvisibility=protected -fuse-ld=gold -o lib.so lib.c

Now the error comes from Gold:

$ gcc -O2 -pie -fPIE -fuse-ld=gold main.c ./lib.so 
/usr/bin/ld.gold: error: /tmp/ccUGJmAG.o: cannot make copy relocation for protected symbol 'variable', defined in ./lib.so
collect2: error: ld returned 1 exit status

ld.bfd can link it:
$ gcc -O2 -pie -fPIE main.c ./lib.so               

When run, this particular example appears to work but the variable was actually copy-relocated:
$ ./a.out 
addr() = 0x555555558050, &variable = 0x555555558050

When linking the library with ld.bfd instead:
# gcc -O2 -shared -fPIC  -fvisibility=protected -Wl,--dynamic-list,empty.dynlist -o lib.so lib.c

Then we get a mismatch:
$ ./a.out                                                                          
addr() = 0x7ffa8cabc028, &variable = 0x556094d36050

All these problems go away when the main application is compiled with -fPIC.
Comment 7 Thiago Macieira 2022-05-23 03:40:05 UTC
Disabling the use of the feature in Qt 6.4 because it's still broken.
Comment 8 Thiago Macieira 2022-05-23 03:43:10 UTC
Oops, these tests aren't using -mno-direct-extern-access at all. When using them:

$ gcc -O2 -shared -fPIC  -mno-direct-extern-access -fvisibility=protected -Wl,--dynamic-list,empty.dynlist -o lib.so lib.c
$ gcc -mno-direct-extern-access -O2 -pie -fPIE main.c ./lib.so
$ ./a.out 
addr() = 0x7f932637a028, &variable = 0x7f932637a028

So this is actually working right.
Comment 9 H.J. Lu 2022-05-24 22:10:43 UTC
Should this be closed by -mno-direct-extern-access change in GCC?
Comment 10 Thiago Macieira 2022-05-24 23:28:21 UTC
(In reply to H.J. Lu from comment #9)
> Should this be closed by -mno-direct-extern-access change in GCC?

Yes.

making that the default and getting rid of copy relocations would be better, though.