Bug 27226 - ld.bfd contains huge .rodata section
Status: UNCONFIRMED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld
Version: 2.35
Importance: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2021-01-22 09:25 UTC by Martin Liska
Modified: 2022-09-11 22:16 UTC

Attachments
A POC (6.81 KB, patch)
2021-01-24 21:04 UTC, H.J. Lu

Description Martin Liska 2021-01-22 09:25:19 UTC
Looking at the size of the ld.bfd binary, I see:

$ bloaty /usr/bin/ld.bfd
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  89.5%  7.78Mi  89.5%  7.78Mi    .rodata
   7.0%   623Ki   7.0%   623Ki    .text
   1.2%   103Ki   1.2%   103Ki    .rela.dyn

Where the .rodata section is full of linker scripts:

$ strings /usr/bin/ld.bfd | grep 'this script' | wc -l
927

So there are almost 1000 linker scripts. Would it be possible to save the scripts in a more compressed form?

$ du -hs ld.bfd
8.7M	ld.bfd
$ xz ld.bfd
$ du -hs ld.bfd.xz
316K	ld.bfd.xz

So xz shrinks the binary roughly 28x (8.7M to 316K).
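
The measurement above can be reproduced in miniature. The following is a minimal sketch, not the real ld data: the helper name xz_ratio and the script template are hypothetical stand-ins for many near-identical embedded scripts, used only to illustrate why such text compresses so well.

```python
import lzma

def xz_ratio(data: bytes) -> float:
    """Return the original size divided by the xz-compressed size."""
    return len(data) / len(lzma.compress(data))

# Hypothetical stand-in for roughly 900 near-identical linker scripts.
template = (
    "/* Illustrative script number {n} */\n"
    "SECTIONS\n"
    "{{\n"
    "  .text : {{ *(.text .text.*) }}\n"
    "  .rodata : {{ *(.rodata .rodata.*) }}\n"
    "}}\n"
)
blob = "".join(template.format(n=i) for i in range(900)).encode()

ratio = xz_ratio(blob)
print(f"{len(blob)} bytes uncompressed, ratio {ratio:.1f}x")
```

The exact ratio depends on the input; the point is only that highly repetitive text shrinks by a large factor.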
Comment 1 Martin Liska 2021-01-22 09:32:50 UTC
I've just tried to use eu-elfcompress, but it does not work for some reason:

$ eu-elfcompress -n '.*rodata' -t zlib ld.bfd -o ld.xxx -v
processing: ld.bfd
[17] .rodata ignoring allocated section

@Mark: Can you please take a look?
Comment 2 Mark Wielaard 2021-01-22 09:51:17 UTC
> $ eu-elfcompress -n '.*rodata' -t zlib ld.bfd -o ld.xxx -v
> processing: ld.bfd
> [17] .rodata ignoring allocated section

It is as the message says: .rodata is an allocated section (SHF_ALLOC), so it cannot be compressed as an ELF section. The spec explicitly disallows this because it would make mapping such a section into memory awkward. There is no corresponding program header flag, so it would be unclear how the dynamic linker should map a segment that contains a compressed section, especially since the segment-to-section mapping isn't 1-to-1 and a segment can cover only part of a section.
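
As a rough illustration of the rule described above, here is a minimal sketch of the flag check. The flag values are the standard ELF ones; the helper name may_compress is hypothetical and is not taken from elfutils.

```python
# Section flag values as defined by the ELF specification.
SHF_WRITE = 0x1
SHF_ALLOC = 0x2
SHF_EXECINSTR = 0x4

def may_compress(sh_flags: int) -> bool:
    """Hypothetical helper: a section may be ELF-compressed only if it is
    not allocated, i.e. it does not occupy memory at run time."""
    return not (sh_flags & SHF_ALLOC)

# .rodata is allocated, so it is skipped ("ignoring allocated section").
rodata_ok = may_compress(SHF_ALLOC)
# A non-allocated section, such as a debug section, has no SHF_ALLOC bit.
debug_ok = may_compress(0)
print(rodata_ok, debug_ok)
```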
Comment 3 H.J. Lu 2021-01-24 21:04:14 UTC
Created attachment 13152 [details]
A POC

On Linux/x86-64

   text	   data	    bss	    dec	    hex	filename
2606362	   8376	   6072	2620810	 27fd8a	ld.orig
1901779	   8296	   6072	1916147	 1d3cf3	ld.compressed

Using zstd may save even more space.
Comment 4 Martin Liska 2021-01-25 11:10:29 UTC
(In reply to H.J. Lu from comment #3)

Thank you for the work, H.J. I can see the approach helps a bit, but we would get a better compression ratio if all linker scripts were compressed together.
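
The difference between compressing each script on its own and compressing them as one stream can be sketched as follows. This is an illustration under stated assumptions: the scripts are synthetic, the helper name compressed_sizes is hypothetical, and zlib is used only because it ships with Python, not because it matches the POC.

```python
import zlib

def compressed_sizes(scripts):
    """Return (total size when each script is compressed separately,
    size when all scripts are compressed as one stream)."""
    separately = sum(len(zlib.compress(s)) for s in scripts)
    together = len(zlib.compress(b"".join(scripts)))
    return separately, together

# Synthetic scripts that share most of their text, as linker scripts do.
BODY = (
    b"SECTIONS\n"
    b"{\n"
    b"  .text : { *(.text .text.*) }\n"
    b"  .rodata : { *(.rodata .rodata.*) }\n"
    b"  .data : { *(.data .data.*) }\n"
    b"}\n"
)
scripts = [f"/* hypothetical script {i} */\n".encode() + BODY for i in range(100)]

separately, together = compressed_sizes(scripts)
print(f"separately: {separately} bytes, together: {together} bytes")
```

The trade-off is that a single stream must be decompressed, at least up to the wanted script, before any one script can be read.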

Btw. why are the scripts embedded in ld.bfd? Can't they be distributed as a list of configuration files?
Comment 5 Alan Modra 2021-01-25 14:20:50 UTC
Whether scripts are compiled in or not is supposed to be under the control of COMPILE_IN in ld/genscripts.sh, but there are problems in these files:
	modified:   ld/emulparams/alphavms.sh
	modified:   ld/emulparams/elf64_ia64_vms.sh
	modified:   ld/emulparams/elf64mmix.sh
	modified:   ld/emulparams/elf_iamcu.sh
	modified:   ld/emulparams/elf_k1om.sh
	modified:   ld/emulparams/elf_l1om.sh
	modified:   ld/emulparams/mmo.sh
	modified:   ld/emulparams/pdp11.sh
	modified:   ld/emultempl/beos.em
	modified:   ld/emultempl/pdp11.em
	modified:   ld/emultempl/pe.em
	modified:   ld/emultempl/pep.em
and yes, I have a patch under test.
Comment 6 H.J. Lu 2021-01-25 15:02:48 UTC
(In reply to Martin Liska from comment #4)
> Thank you for the work H.J. I can see the approach helps a bit, but we would
> get a better compression ratio if all linker scripts are compressed at once.


That will require non-trivial changes in ld.

> Btw. why are the scripts embedded in ld.bfd? Can't they be distributed as a
> list of configuration files?

With the scripts embedded, you can just copy ld to another machine.
Comment 7 Sourceware Commits 2021-01-26 10:30:40 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=1c9c9b9b55520b36c15af94ee0803f0922b3ca09

commit 1c9c9b9b55520b36c15af94ee0803f0922b3ca09
Author: Alan Modra <amodra@gmail.com>
Date:   Tue Jan 26 10:48:09 2021 +1030

    PR27226, ld.bfd contains huge .rodata section
    
    This makes it possible to build ld without any compiled-in scripts,
    by setting COMPILE_IN=no in the environment.  pe, beos and pdp11
    targets didn't support scripts from the file system, with pdp11
    nastily editing the ld/ldscripts file so that the built-in script
    didn't match.
    
            PR 27226
            * emulparams/alphavms.sh: Don't set COMPILE_IN.
            * emulparams/elf64_ia64_vms.sh: Likewise.
            * emulparams/elf64mmix.sh: Likewise.
            * emulparams/elf_iamcu.sh: Likewise.
            * emulparams/elf_k1om.sh: Likewise.
            * emulparams/elf_l1om.sh: Likewise.
            * emulparams/mmo.sh: Likewise.
            * emulparams/pdp11.sh: Set DATA_SEG_ADDR.
            * scripttempl/pdp11.sc: Use it.
            * emultempl/pdp11.em: Don't edit .xn script for separate_code,
            instead use .xe script.  Support scripts from file system.
            * emultempl/beos.em: Support scripts from file system.
            * emultempl/pe.em: Likewise.
            * emultempl/pep.em: Likewise.
            * testsuite/ld-bootstrap/bootstrap.exp: Make tmpdir/ldscripts link.
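
The choice between compiled-in scripts and scripts read from the file system can be pictured with the sketch below. It is a hypothetical illustration, not ld's actual lookup code: the function name find_script, the builtin mapping, and the file name elf_x86_64.x are assumptions; only the ldscripts directory name comes from the commit message.

```python
import tempfile
from pathlib import Path
from typing import Mapping, Optional

def find_script(name: str, builtin: Mapping[str, str], base: Path) -> Optional[str]:
    """Hypothetical lookup: use a compiled-in script if one exists,
    otherwise read it from the ldscripts directory under base."""
    if name in builtin:
        return builtin[name]
    path = base / "ldscripts" / name
    if path.is_file():
        return path.read_text()
    return None

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "ldscripts").mkdir()
    (base / "ldscripts" / "elf_x86_64.x").write_text("/* on-disk script */\n")

    # No compiled-in scripts (the COMPILE_IN=no case): fall back to the file.
    from_disk = find_script("elf_x86_64.x", {}, base)
    # A compiled-in copy is used without touching the file system.
    from_builtin = find_script(
        "elf_x86_64.x", {"elf_x86_64.x": "/* built-in script */\n"}, base
    )
    missing = find_script("nosuch.x", {}, base)

print(from_disk, from_builtin, missing)
```

A build without compiled-in scripts is smaller but depends on the ldscripts directory being installed alongside it, which is the portability concern raised in comment 6.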