RFC: support of PGO?
Martin Liška
mliska@suse.cz
Thu Oct 8 15:06:37 GMT 2020
Hey.
As you probably know, profile-guided optimization can improve a built
binary quite significantly. I've taken the example from
https://sourceware.org/bugzilla/show_bug.cgi?id=26381
and here are the timings for GAS:
1) normal build (-O2): 2.7 seconds
2) LTO build: 2.57 seconds
3) LTO+PGO build: 2.3 seconds
That's a speed gain of about 15%, which is not negligible.
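For reference, the percentages can be reproduced from the timings above with a small shell helper. This is only a sketch; the name speedup_pct is made up for illustration and is not part of any tool:

```shell
# Relative reduction in run time, in percent, against a baseline.
# Usage: speedup_pct BASELINE_SECONDS NEW_SECONDS
speedup_pct() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

speedup_pct 2.7 2.57   # LTO vs. plain -O2
speedup_pct 2.7 2.3    # LTO+PGO vs. plain -O2
```

With the numbers above this gives roughly 4.8% for LTO alone and 14.8% for LTO+PGO.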
Right now I do the PGO build with:
export CFLAGS="-fprofile-generate=/tmp/binutils -O2 -flto"
export CXXFLAGS="$CFLAGS"
./configure
make
make check
rm -rf *
export CFLAGS="-fprofile-use=/tmp/binutils -O2 -flto"
export CXXFLAGS="$CFLAGS"
./configure
make
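Before wiping the build directory and starting the second stage, it may be worth checking that 'make check' actually left profile data behind. A minimal sketch, assuming GCC's usual *.gcda files end up under the directory given to -fprofile-generate; count_gcda is a hypothetical helper name, not an existing tool:

```shell
# Print the number of GCC profile data files (*.gcda) found under a directory.
# Prints 0 if the directory does not exist.
count_gcda() {
    if [ ! -d "$1" ]; then
        echo 0
        return 0
    fi
    # tr strips the padding some wc implementations add around the count.
    find "$1" -type f -name '*.gcda' | wc -l | tr -d ' '
}

count_gcda /tmp/binutils
```

A count of 0 would suggest that the training run wrote nothing, in which case the -fprofile-use build has no profile to work with.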
Questions I have:
- Would it be possible to integrate PGO into the root Makefile?
- Or do you prefer the way I did the PGO build?
- I see quite a few failures in the ld test-suite:
gcc -B/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/ -L/usr/local/x86_64-pc-linux-gnu/lib64 -L/usr/local/lib64 -L/lib64 -L/usr/lib64 -L/usr/local/x86_64-pc-linux-gnu/lib -L/usr/local/lib -L/lib -L/usr/lib -o tmpdir/ld1 tmpdir/ld-partial.o ../libctf/.libs/libctf.a -L./../zlib -lz ../bfd/.libs/libbfd.a ../libiberty/libiberty.a -L../zlib -lz -ldl
Executing on host: sh -c {gcc -B/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/ -L/usr/local/x86_64-pc-linux-gnu/lib64 -L/usr/local/lib64 -L/lib64 -L/usr/lib64 -L/usr/local/x86_64-pc-linux-gnu/lib -L/usr/local/lib -L/lib -L/usr/lib -o tmpdir/ld1 tmpdir/ld-partial.o ../libctf/.libs/libctf.a -L./../zlib -lz ../bfd/.libs/libbfd.a ../libiberty/libiberty.a -L../zlib -lz -ldl 2>&1} /dev/null ld.tmp (timeout = 300)
spawn [open ...]
/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/collect-ld: /tmp/ld1.g2iFBN.ltrans0.ltrans.o: in function `yy_get_previous_state':
/home/marxin/Programming/binutils/objdir/ld/ldlex.c:3527: undefined reference to `__gcov_time_profiler_counter'
/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/collect-ld: /home/marxin/Programming/binutils/objdir/ld/ldlex.c:3527: undefined reference to `__gcov_time_profiler_counter'
/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/collect-ld: /tmp/ld1.g2iFBN.ltrans0.ltrans.o: in function `yy_load_buffer_state':
/home/marxin/Programming/binutils/objdir/ld/ldlex.c:3713: undefined reference to `__gcov_time_profiler_counter'
/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/collect-ld: /home/marxin/Programming/binutils/objdir/ld/ldlex.c:3713: undefined reference to `__gcov_time_profiler_counter'
/home/marxin/Programming/binutils/objdir/ld/tmpdir/ld/collect-ld: /tmp/ld1.g2iFBN.ltrans0.ltrans.o: in function `yy_stack_print':
which means that linking against the libiberty built with PGO is missing -fprofile-generate in the linker arguments.
Can one pass LDFLAGS to 'make check'?
Note that even the test-suite provides a reasonable profile for PGO, one that positively influences
performance. I also expect a similar speed improvement for other binutils tools
(most notably ld and ld.gold).
Thoughts?
Thanks,
Martin