RFC: support of PGO?

Martin Liška mliska@suse.cz
Thu Oct 8 15:06:37 GMT 2020


Hey.

As you probably know, profile-guided optimization (PGO) can speed up a built
binary quite significantly. I've taken the example from
https://sourceware.org/bugzilla/show_bug.cgi?id=26381
and here are the timings for GAS:

1) normal build (-O2): 2.7 seconds
2) LTO build: 2.57 seconds
3) LTO+PGO build: 2.3 seconds

That's about a 15% speed gain over the normal build, which is not negligible.
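
For reference, the percentages follow directly from the three timings above. A small sketch of the arithmetic (the speedup helper is just an illustrative name, not something that exists in binutils):

```shell
# Hypothetical helper: relative time saved, in percent, given a baseline
# time and a new time (both in seconds).
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

# Timings quoted above for GAS.
echo "LTO vs. -O2:     $(speedup 2.7 2.57)%"
echo "LTO+PGO vs. -O2: $(speedup 2.7 2.3)%"
```

So LTO alone accounts for roughly a third of the gain and the profile for the rest.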

Right now I do the PGO build with:

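# 1) instrumented build; profile data is written under /tmp/binutils
export C{,XX}FLAGS="-fprofile-generate=/tmp/binutils -O2 -flto"
./configure
make
# 2) training run: the test-suite generates the profile
make check
# 3) wipe the build tree and rebuild using the collected profile
rm -rf *
export C{,XX}FLAGS="-fprofile-use=/tmp/binutils -O2 -flto"
./configure
make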
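
Before the second build it may be worth confirming that the training run actually wrote profile data. A minimal sketch, assuming GCC's usual *.gcda files end up under the directory given to -fprofile-generate (the count_gcda helper is a made-up name for illustration):

```shell
# Hypothetical helper: count GCC profile data files (*.gcda) under a directory.
count_gcda() {
    find "$1" -type f -name '*.gcda' | wc -l | tr -d ' '
}

# Usage sketch: report whether the training run left any profile data behind.
profile_dir=/tmp/binutils
if [ -d "$profile_dir" ] && [ "$(count_gcda "$profile_dir")" -gt 0 ]; then
    echo "profile data found in $profile_dir"
else
    echo "no profile data in $profile_dir" >&2
fi
```

If the count is zero, the -fprofile-use build would silently fall back to an unprofiled build, so the check is cheap insurance.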

Questions I have:
- Would it be possible to integrate PGO into the root Makefile?
- Or do you prefer the way I made the PGO?
- I see quite a few failures in the ld test-suite:

gcc -B/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/   -L/usr/local/x86_64-pc-linux-gnu/lib64 -L/usr/local/lib64 -L/lib64 -L/usr/lib64 -L/usr/local/x86_64-pc-linux-gnu/lib -L/usr/local/lib -L/lib -L/usr/lib  -o tmpdir/ld1  tmpdir/ld-partial.o ../libctf/.libs/libctf.a -L./../zlib -lz ../bfd/.libs/libbfd.a ../libiberty/libiberty.a   -L../zlib -lz -ldl
Executing on host: sh -c {gcc -B/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/   -L/usr/local/x86_64-pc-linux-gnu/lib64 -L/usr/local/lib64 -L/lib64 -L/usr/lib64 -L/usr/local/x86_64-pc-linux-gnu/lib -L/usr/local/lib -L/lib -L/usr/lib  -o tmpdir/ld1  tmpdir/ld-partial.o ../libctf/.libs/libctf.a -L./../zlib -lz ../bfd/.libs/libbfd.a ../libiberty/libiberty.a   -L../zlib -lz -ldl 2>&1}  /dev/null ld.tmp (timeout = 300)
spawn [open ...]

/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/collect-ld: /tmp/ld1.g2iFBN.ltrans0.ltrans.o: in function `yy_get_previous_state':
/home/marxin/Programming/binutils/objdir/ld/ldlex.c:3527: undefined reference to `__gcov_time_profiler_counter'
/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/collect-ld: /home/marxin/Programming/binutils/objdir/ld/ldlex.c:3527: undefined reference to `__gcov_time_profiler_counter'
/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/collect-ld: /tmp/ld1.g2iFBN.ltrans0.ltrans.o: in function `yy_load_buffer_state':
/home/marxin/Programming/binutils/objdir/ld/ldlex.c:3713: undefined reference to `__gcov_time_profiler_counter'
/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/collect-ld: /home/marxin/Programming/binutils/objdir/ld/ldlex.c:3713: undefined reference to `__gcov_time_profiler_counter'
/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/collect-ld: /tmp/ld1.g2iFBN.ltrans0.ltrans.o: in function `yy_stack_print':

which means that the test-suite links against the PGO-instrumented objects (libiberty and others)
without passing -fprofile-generate to the link command.
Can one pass LDFLAGS to 'make check'?

Note that even the test-suite provides a reasonable profile for PGO, one that positively
influences performance. I also expect a similar speed improvement for other binutils tools
(most notably ld and ld.gold).

Thoughts?
Thanks,
Martin

