Bug 19803 - gc-sections breaks PE DLL variable export
Summary: gc-sections breaks PE DLL variable export
Status: ASSIGNED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: 2.26
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-03-10 07:43 UTC by martin.koegler
Modified: 2016-04-04 12:46 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
Proposed patch (1.10 KB, patch)
2016-03-15 18:18 UTC, Nick Clifton
Details | Diff
Revised patch (1.44 KB, patch)
2016-03-16 15:28 UTC, Nick Clifton
Details | Diff
Patch V1 (766 bytes, patch)
2016-03-17 21:39 UTC, martin.koegler
Details | Diff
Patch VII (1.61 KB, patch)
2016-03-18 11:37 UTC, Nick Clifton
Details | Diff
Cleaned patch (2.79 KB, patch)
2016-03-18 19:13 UTC, martin.koegler
Details | Diff
patch to add _ prefixed exported names (291 bytes, patch)
2016-03-22 12:29 UTC, Nick Clifton
Details | Diff
Revised patch (1.30 KB, patch)
2016-03-24 10:27 UTC, Nick Clifton
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description martin.koegler 2016-03-10 07:43:02 UTC
--gc-sections is still broken for variables:

d1.cpp:
====
extern const long testval[2] = {1, 2};
====

d2.cpp:
===
#include <stdio.h>
extern const long testval[];
int main()
{
printf("%p\n", testval);
printf("%d %d\n", testval[0], testval[1]);
return 0;
}
====

x86_64-w64-mingw32-g++ -o d1a.dll -shared d1.cpp  -Wl,--out-implib -Wl,d1a.dll.a -Wl,--gc-sections
x86_64-w64-mingw32-g++ -o d1b.dll -shared d1.cpp  -Wl,--out-implib -Wl,d1b.dll.a
x86_64-w64-mingw32-g++ -o d2a.exe d2.cpp d1a.dll.a => Crashes
x86_64-w64-mingw32-g++ -o d2A.exe d2.cpp d1a.dll => prints incorrect data
x86_64-w64-mingw32-g++ -o d2b.exe d2.cpp d1b.dll.a => OK
x86_64-w64-mingw32-g++ -o d2B.exe d2.cpp d1b.dll => OK

See https://sourceware.org/ml/binutils/2016-03/msg00112.html for a test case, which shows this bug in libstdc++.
Comment 1 Nick Clifton 2016-03-10 09:19:20 UTC
Hi Martin,

  This looks like it might be related to:

https://sourceware.org/bugzilla/show_bug.cgi?id=19480

  and/or:

https://sourceware.org/bugzilla/show_bug.cgi?id=19531

  Do either of the proposed patches in those PRs fix your problem ?

Cheers
  Nick
Comment 2 martin.koegler 2016-03-12 20:55:45 UTC
The bug is caused by deleting sections and still exporting their symbols with bogus values:

dx.s:
=============
        .text
        .globl  DllMainCRTStartup
DllMainCRTStartup:
        movl    $1, %eax
        ret
        .globl  testval
        .section .rdata,"dr"
testval:
        .long   1
        .long   2
================
x86_64-w64-mingw32-as  -o dx.o  dx.s
x86_64-w64-mingw32-ld -o dx.dll -shared dx.o  --print-gc-sections  --out-implib  dx.dll.a --gc-sections
Comment 3 Nick Clifton 2016-03-15 18:18:47 UTC
Created attachment 9096 [details]
Proposed patch

Hi Martin,

  Please could you try out this patch.  It is a work in progress, but it might do the job...

Cheers
  Nick
Comment 4 martin.koegler 2016-03-15 19:09:14 UTC
Fixes all my testcases.
Comment 5 Nick Clifton 2016-03-16 15:28:23 UTC
Created attachment 9104 [details]
Revised patch

Hi Martin,

  Oops - sorry - that patch had the unintended effect of completely disabling
  linker garbage collection (for COFF/PE targets).  Doh.

  Please try this revised version instead, which this time, I think really will
  work.

Cheers
  Nick
Comment 6 martin.koegler 2016-03-16 21:56:45 UTC
The last patch is incorrect. In my testcase, it still exports the removed variable in the DLL export table:
[Ordinal/Name Pointer] Table
        [   0] testval

So directly linking with the DLL produces a invalid binary.

May I ask, why you only adapt the implib generation? 
If you make sure, that the export table is correct [every exported symbol is not removed by gc-sections], shouldn't the export library generated from it be automatically correct too?

In my option, your first patch didn't break gc-sections. The windows default is to export all (global?) symbols from a DLL, if no specific export list is provided via a DEF file.
Comment 7 Nick Clifton 2016-03-17 10:30:05 UTC
(In reply to martin.koegler from comment #6)

Hi Martin,

> The last patch is incorrect. In my testcase, it still exports the removed 
> variable in the DLL export table:
> [Ordinal/Name Pointer] Table
>         [   0] testval
>
> So directly linking with the DLL produces a invalid binary.

Darn.  I was hoping changing the class to C_HIDDEN and the section to undefined would be sufficient.

> May I ask, why you only adapt the implib generation? 

Because I am not a COFF/PE expert!  Nobody else seemed to want to work on this problem, at there are several bug reports about it now, so I decided that I had
better have a go.  But as you can see, most of my patches have been based upon guesswork.

> If you make sure, that the export table is correct [every exported symbol is
> not removed by gc-sections], shouldn't the export library generated from it
> be automatically correct too?

True - but it looks to me like the data structures for generating the export table are set up far too early - ie before garbage collection - and the code in the bfd library that performs the garbage collection has no knowledge of, or access to, these data structures.

> In my option, your first patch didn't break gc-sections. The windows default
> is to export all (global?) symbols from a DLL, if no specific export list is
> provided via a DEF file.

Ah - OK, in which case most of the first patch is probably not needed.  Only the change to process_def_file_and_drective() should be necessary.

But - if all global symbols are going to be exported, doesn't that mean that garbage collection is going to be effectively useless.  (For a DLL anyway).  How many sections are there that contain data/code and which do not have at least one global symbol associated with them ?

Hmm, maybe garbage collection would be worthwhile if a DEF file is used.  Could you test to see if, with patch 1 applied, and a DEF file is used to only export some of the symbols, that garbage collection does indeed throw away unused data and code in a DLL ?

Cheers
  Nick
Comment 8 martin.koegler 2016-03-17 21:39:05 UTC
Created attachment 9106 [details]
Patch V1
Comment 9 martin.koegler 2016-03-17 21:47:30 UTC
I have reworked the first version a little bit.

The DEF-handling looks OK. SEH prevents the removal of a function, 

Get dx.s from Comment #2.

t1.s:
=============
        .file   "t1.c"
        .section        .text.unlikely,"x"
.LCOLDB0:
        .text
.LHOTB0:
        .p2align 4,,15
        .globl  add
        .def    add;    .scl    2;      .type   32;     .endef
        .seh_proc       add
add:
        .seh_endprologue
        ret
        .seh_endproc
        .section        .text.unlikely,"x"
.LCOLDE0:
        .text
.LHOTE0:
        .ident  "GCC: (GNU) 5.3.0"
===============

dx.def:
=============
EXPORTS
    testval @1 DATA
=============

x86_64-w64-mingw32-as -o t1.o t1.s
x86_64-w64-mingw32-as  -o dx.o  dx.s
x86_64-w64-mingw32-ld -o dx.dll -shared dx.o t1.o --print-gc-sections  --out-implib  dx.dll.a --gc-sections dx.def 

t1 is not removed, until the .seh statements are removed from t1.s.
Comment 10 martin.koegler 2016-03-17 21:53:10 UTC
(In reply to Nick Clifton from comment #7)
> True - but it looks to me like the data structures for generating the export
> table are set up far too early - ie before garbage collection - and the code
> in the bfd library that performs the garbage collection has no knowledge of,
> or access to, these data structures.
> 
> > In my option, your first patch didn't break gc-sections. The windows default
> > is to export all (global?) symbols from a DLL, if no specific export list is
> > provided via a DEF file.
> 
> Ah - OK, in which case most of the first patch is probably not needed.  Only
> the change to process_def_file_and_drective() should be necessary.
> 
> But - if all global symbols are going to be exported, doesn't that mean that
> garbage collection is going to be effectively useless.  (For a DLL anyway). 
> How many sections are there that contain data/code and which do not have at
> least one global symbol associated with them ?
> 
> Hmm, maybe garbage collection would be worthwhile if a DEF file is used. 
> Could you test to see if, with patch 1 applied, and a DEF file is used to
> only export some of the symbols, that garbage collection does indeed throw
> away unused data and code in a DLL ?

We can justify the current behaviour, that gc-sections should shrink the build as far as possible, but must not change the interface (=exported symbols).

On Linux:
$ cat t1.c 
void add()
{}
$ cat t2.c 
void sub()
{}
$ gcc -shared -o t.so t1.c t2.c -Wl,--print-gc-sections -Wl,--gc-sections
$ nm t.so
00000000000006e5 T add
0000000000201030 B __bss_start
0000000000201030 b completed.6337
                 w __cxa_finalize@@GLIBC_2.2.5
0000000000000600 t deregister_tm_clones
0000000000000670 t __do_global_dtors_aux
0000000000200df8 t __do_global_dtors_aux_fini_array_entry
0000000000201028 d __dso_handle
0000000000200e08 d _DYNAMIC
0000000000201030 D _edata
0000000000201038 B _end
00000000000006f4 T _fini
00000000000006b0 t frame_dummy
0000000000200df0 t __frame_dummy_init_array_entry
00000000000007a8 r __FRAME_END__
0000000000201000 d _GLOBAL_OFFSET_TABLE_
                 w __gmon_start__
00000000000005a8 T _init
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
0000000000200e00 d __JCR_END__
0000000000200e00 d __JCR_LIST__
                 w _Jv_RegisterClasses
0000000000000630 t register_tm_clones
00000000000006eb T sub
0000000000201030 d __TMC_END__

So gc-section on Linux (openSuSE 42.1) does not affect the auto-exported symbols.
Comment 11 Nick Clifton 2016-03-18 11:37:05 UTC
Created attachment 9107 [details]
Patch VII

Hi Martin,

  I like the patch, but I have one question: why did you need to use xstrdup to make a copy of the name in the exports array ?

  I have uploaded a tweaked version of your patch, which apart from ditching the xstrdup (which may be a mistake, depending upon your answer), also takes the code to add a symbol to the gc_sym_list and moves it into its open separate function.  The code was being duplicated in four different places, so it seemed like a good idea.

  Please could you give this patch a quick test, and if it is OK I will check it in.

Cheers
  Nick
Comment 12 martin.koegler 2016-03-18 18:42:05 UTC
Looks OK, but I have troubles applying it.

Additionally, there is 41f46ed9fea1a066de95b6a85c56393beef0b8b8 on master, which looks like an earlier version.

I'll try to test it.
Comment 13 martin.koegler 2016-03-18 19:13:47 UTC
Created attachment 9108 [details]
Cleaned patch

I'm currently testings this on a tree without 41f46ed9fea1a066de95b6a85c56393beef0b8b8
Comment 14 martin.koegler 2016-03-21 07:28:08 UTC
You can consider my xstrdup use as a bug [safest choice].

Binutils 2.26 + patch from Comment #13 works for a larger project for x86_64-w32-mingw. 
For i686-w64-mingw32, I have an issue related to libstdc++-6.dll _ZNSt8ios_base4InitC1Ev, which needs further investigations.
Comment 15 martin.koegler 2016-03-22 08:34:15 UTC
The remaining i686-w64-mingw32 problem:

It uses a different symbol name different to the exported symbol name:
Eg. testval is exported, but the symbol in the .o file is called _testval.

So symbols get removed.

dx.s:
===============
        .text
        .globl  DllMainCRTStartup
DllMainCRTStartup:
        movl    $1, %eax
        ret
        .globl  _testval
        .section .rdata,"dr"
_testval:
        .long   1
        .long   2
====================
dx.def:
=====================
EXPORTS
    testval @1 DATA
=====================

i686-w64-mingw32-as  -o dx.o  dx.s
i686-w64-mingw32-ld  -o dx.dll -shared dx.o dx1.o --print-gc-sections  --out-implib  dx.dll.a dx.def --gc-sections
Removing unused section '.text' in file 'dx.o'
Removing unused section '.rdata' in file 'dx.o'
Removing unused section '.text' in file 'dx1.o'
Comment 16 cvs-commit@gcc.gnu.org 2016-03-22 12:26:29 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=4153b6dbb0f38a16fd5b583761aa811212fbb9a5

commit 4153b6dbb0f38a16fd5b583761aa811212fbb9a5
Author: Nick Clifton <nickc@redhat.com>
Date:   Tue Mar 22 12:25:08 2016 +0000

    Improve COFF/PE linker garbage collection by preventing the removal of sections containing exported symbols.
    
    	PR ld/19803
    	* ldlang.c (lang_add_gc_name): New function.  Adds the provided
    	symbol name to the list of gc symbols.
    	(lang_process): Call lang_add_gc_name with entry_symbol_default if
    	entry_symbol.name is NULL.  Use lang_add_gc_name to add the init
    	and fini function names.
    	* pe-dll.c (process_def_file_and_drectve): Add exported names to
    	the gc symbol list.
    	* testsuite/ld-pe/pr19803.s: Do not export _testval symbol.
    	* testsuite/ld-pe/pr19803.d: Tweak expected output.
Comment 17 Nick Clifton 2016-03-22 12:29:01 UTC
Created attachment 9115 [details]
patch to add _ prefixed exported names

Hi Martin,

  OK - I have checked the current patch in.

  Does this little extra patch help with the underscore prefixed symbols ?

Cheers
  Nick
Comment 18 martin.koegler 2016-03-23 08:14:28 UTC
First, 41f46ed9fea1a066de95b6a85c56393beef0b8b8 (Fix possible failure in the AVR linker tests) introduced (propably accidentically) did a change to pe_dll_generate_implib not documented in the changelog. In my option, this change is not necessary, as it just would hides broken symbols in the export library.
A hidden symbol would be an official part of the DLL export table with nothing behind it (= not working symbol).

The prefix name patch would need a xstrdup, as name is freed some lines below.

There is second case missing (symbol matching implemented by pe_fixup_stdcalls).
Look at the entry symbol (_DllMainCRTStartup@12 linked to DllMainCRTStartup). According to the code, the other way around should be possible too.
Comment 19 Nick Clifton 2016-03-24 10:27:36 UTC
Created attachment 9121 [details]
Revised patch

Hi Martin,

> First, 41f46ed9fea1a066de95b6a85c56393beef0b8b8 (Fix possible failure in 
> the AVR linker tests) introduced (propably accidentically)

Yes, that was a snafu.  

> In my option, this change is not necessary, as it just would hides broken 
> symbols in the export library.

Actually I think that it is needed in order to prevent garbage collected symbols from leaking into the export table.

> The prefix name patch would need a xstrdup, as name is freed some lines below.

Doh, yes.

> There is second case missing (symbol matching implemented by 
> pe_fixup_stdcalls).

OK - this revised patch contains an attempt to fix this function as well.  Please could you give it a try and let me know the results.

Cheers
  Nick
Comment 20 martin.koegler 2016-03-25 08:07:30 UTC
(In reply to Nick Clifton from comment #19)
> > In my option, this change is not necessary, as it just would hides broken 
> > symbols in the export library.
> 
> Actually I think that it is needed in order to prevent garbage collected
> symbols from leaking into the export table.

Such symbols would still be exported in the DLL export table and ld can directly against the DLL binary [and therefore such symbols].

The goal of our last patches was to avoid that any exported symbol gets garbage collected - so a bug free ld shouldn't hit any such symbols.

The only benefit could be, that we generate a link error, if we hit a ld gc-bug and link against the import library.

As any such symbol is a bug, we could change it to a "you found a gc-section bug - please report it" error.

> > There is second case missing (symbol matching implemented by 
> > pe_fixup_stdcalls).
> 
> OK - this revised patch contains an attempt to fix this function as well. 
> Please could you give it a try and let me know the results.

It works for my simplified test cases. I found another issue with the entry symbol - lang_end interprets an undefined symbols in other ways too:
      /* We couldn't find the entry symbol.  Try parsing it as a
         number.  */
     /* Can't find the entry symbol, and it's not a number.  Use
         the first address in the text section.  */

Try my sample of Comment #15 - the text section is still removed, although ld uses the first address of the text section.
Comment 21 martin.koegler 2016-04-03 11:12:47 UTC
(In reply to Nick Clifton from comment #19)
> OK - this revised patch contains an attempt to fix this function as well. 
> Please could you give it a try and let me know the results.

Please apply that patch - works for a larger real world testcase (including gcc rebuild).

There are a few corner cases:
* If the startup symbol is not found, gc-section produces different binaries, because it does not protect the startup location (see my last comment). This affects ELF too:

t.c
====
int test()
{
return 0;
}
====
gcc -c t.c
ld -o t t.o /usr/lib/gcc/x86_64-linux-gnu/4.9.2/crtbegin.o /usr/lib/gcc/x86_64-linux-gnu/4.9.2/crtend.o  --gc-sections --print-gc-sections
ld -o t1 t.o /usr/lib/gcc/x86_64-linux-gnu/4.9.2/crtbegin.o /usr/lib/gcc/x86_64-linux-gnu/4.9.2/crtend.o  --print-gc-sections

t and t1 will start at the same location, which contains a different function

* SEH inhibit the removal of functions. This does not break anything.
Comment 22 cvs-commit@gcc.gnu.org 2016-04-04 12:46:06 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a061de07e441718d3658be332fd3172d87c7440b

commit a061de07e441718d3658be332fd3172d87c7440b
Author: Nick Clifton <nickc@redhat.com>
Date:   Mon Apr 4 13:44:57 2016 +0100

    More fixes for COFF/PE lanker garbage collection.
    
    	PR 19803
    	* emultempl/pe.em (change_undef): New function.  Encapsulates
    	duplicated code in pe_fixup_stdcalls and adds the newly defined
    	sym to the gc root list.
    	(pe_fixup_stdcall): Use the new function.
    	* pe-dll.c (process_def_file_and_drectve); Add alias of exported
    	symbol to gc root list.