This is the mail archive of the
archer@sourceware.org
mailing list for the Archer project.
Inter-CU DWARF size optimizations and gcc -flto
- From: Jan Kratochvil <jan dot kratochvil at redhat dot com>
- To: archer at sourceware dot org
- Cc: Jakub Jelinek <jakub at redhat dot com>
- Date: Wed, 1 Feb 2012 14:23:09 +0100
- Subject: Inter-CU DWARF size optimizations and gcc -flto
Hi,
I am sorry if this is already clear to everyone; I admit I only played with
it yesterday.
With
gcc -flto -flto-partition=none
gcc outputs only a single CU (Compilation Unit). With the default (omitting
-flto-partition) there are multiple CUs, but still few compared to the number
of .o files.
-flto is AFAIK the future for all compilations. It is well known that -flto
debug info is somewhat broken now, but that needs to be fixed anyway.
As the DWARF size has been discussed for the 5+ years I have been in Tools,
this is a long-term project, and waiting for (helping, heh) a working -flto is
an acceptable solution.
This has some implications:
(a) A DWARF post-processing optimization tool no longer makes sense with -flto.
(a1) Intra-CU optimizations in GCC make sense as it is the final output.
(b) .gdb_index will have limited scope, only to select which objfiles to expand,
no longer to select which CUs to expand.
(c) The partial CU expansion Tom Tromey talks about is a must in such a case,
although the smaller LTO debug info takes only 63% of the GDB memory
required by the non-LTO (many-CUs) debug info.
(GDB memory requirement is not directly proportional to the DWARF size.)
With -flto-partition=none, linking GDB took about 900MB. I am not sure how
Honza Hubicka's memory requirements for LTO (2.7GB for Mozilla) were related
to -flto-partition. Still, a few GB of cheap memory for the few hosts in the
build farm (Koji) for Mozilla + LibreOffice should not be such a concern IMO.
FYI for gdb with Rawhide -O2-style CFLAGS (-gdwarf-4 -fno-debug-types-section):
-fno-debug-types-section:
                       | non-LTO   | LTO
stripped binary size   | 5023064   | 4985864
separate .debug size   | 19190280  | 12484312 = 65%
GDB RSS -readnow       | 160136 KB | 106252 KB
GDB RSS without .debug | 14964 KB  | 14972 KB
GDB RSS difference     | 145172 KB | 91280 KB = 63%
I had an idea that this 65% (a 35% reduction) could be the magic ratio
achievable by the hypothetically optimal "Roland's" DWARF optimizer. But struct
range_bounds alone is still defined there (including all its fields) 49x, so
this is still far from optimal/"Roland's one".
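
A duplicate count like the 49x above can be checked with a small script. The
following is only a minimal sketch: it assumes a textual DIE dump shaped
roughly like the output of readelf --debug-dump=info, and the sample dump,
parse_dies and count_struct_definitions are hypothetical names made up for
illustration, not the tooling used for the numbers in this mail.

```python
import re

# Start of a DIE, e.g. " <1><2d>: Abbrev Number: 2 (DW_TAG_structure_type)".
# A null entry ("Abbrev Number: 0") has no tag, hence the optional group.
DIE_START = re.compile(
    r"^\s*<\d+><[0-9a-f]+>: Abbrev Number: \d+(?: \((DW_TAG_\w+)\))?")
# Attribute line, e.g. "    <2e>   DW_AT_name        : (indirect ...): foo".
ATTRIBUTE = re.compile(r"^\s*<[0-9a-f]+>\s+(DW_AT_\w+)\s*:\s*(.*)$")


def parse_dies(dump_text):
    """Split an (assumed readelf-like) dump into DIEs with their attributes."""
    dies = []
    current = None
    for line in dump_text.splitlines():
        start = DIE_START.match(line)
        if start:
            current = {"tag": start.group(1), "attrs": {}}
            dies.append(current)
            continue
        attr = ATTRIBUTE.match(line)
        if attr and current is not None:
            current["attrs"][attr.group(1)] = attr.group(2).strip()
    return dies


def count_struct_definitions(dump_text, struct_name):
    """Count full definitions (not declarations) of the named struct."""
    count = 0
    for die in parse_dies(dump_text):
        attrs = die["attrs"]
        if die["tag"] != "DW_TAG_structure_type":
            continue
        if "DW_AT_declaration" in attrs:
            continue  # forward declaration only, carries no fields
        # Drop a possible "(indirect string, offset: 0x..): " prefix.
        name = attrs.get("DW_AT_name", "").rsplit(": ", 1)[-1]
        if name == struct_name:
            count += 1
    return count


# Hypothetical two-CU dump, invented for illustration only.
SAMPLE_DUMP = """\
 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <c>   DW_AT_name        : (indirect string, offset: 0x0): a.c
 <1><2d>: Abbrev Number: 2 (DW_TAG_structure_type)
    <2e>   DW_AT_name        : (indirect string, offset: 0x10): range_bounds
    <32>   DW_AT_byte_size   : 32
 <2><36>: Abbrev Number: 3 (DW_TAG_member)
    <37>   DW_AT_name        : low
 <2><40>: Abbrev Number: 0
 <0><50>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <51>   DW_AT_name        : (indirect string, offset: 0x20): b.c
 <1><60>: Abbrev Number: 2 (DW_TAG_structure_type)
    <61>   DW_AT_name        : (indirect string, offset: 0x10): range_bounds
    <65>   DW_AT_byte_size   : 32
 <1><70>: Abbrev Number: 4 (DW_TAG_structure_type)
    <71>   DW_AT_name        : (indirect string, offset: 0x30): type
    <75>   DW_AT_declaration : 1
"""

if __name__ == "__main__":
    print(count_struct_definitions(SAMPLE_DUMP, "range_bounds"))
```

With a single LTO CU an ideal optimizer would bring such a count down to 1, so
the count per type is a rough indicator of how much duplication remains.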
Additionally with -fdebug-types-section:
                       | non-LTO (as above) | non-LTO .debug_types | LTO .debug_types
stripped binary size   | 5023064            | 5023064              | 4985864
separate .debug size   | 19190280           | 12789960 = 67%       | 12170080 = 63%
GDB RSS -readnow       | 160136 KB          | 77524 KB             | 227876 KB
GDB RSS without .debug | 14964 KB           | 14968 KB             | 14964 KB
GDB RSS difference     | 145172 KB          | 62556 KB = 43%       | 212912 KB = 147%
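
For anyone wanting to re-check the derived rows, the "GDB RSS difference" and
percentage figures in both tables can be recomputed from the raw measurements.
This is only a sketch of the arithmetic; the dictionary layout and function
names are made up for illustration, and percentages are assumed to be relative
to the plain non-LTO column and rounded to whole numbers.

```python
# Raw figures copied from the two tables in this mail.
# .debug sizes are as listed (presumably bytes), RSS values are in KB.
MEASUREMENTS = {
    "non-LTO": {
        "debug_size": 19190280, "rss_readnow": 160136, "rss_no_debug": 14964},
    "LTO": {
        "debug_size": 12484312, "rss_readnow": 106252, "rss_no_debug": 14972},
    "non-LTO .debug_types": {
        "debug_size": 12789960, "rss_readnow": 77524, "rss_no_debug": 14968},
    "LTO .debug_types": {
        "debug_size": 12170080, "rss_readnow": 227876, "rss_no_debug": 14964},
}
BASELINE = "non-LTO"  # assumption: every percentage is relative to this column


def rss_difference(row):
    """GDB memory attributable to debug info: RSS with minus RSS without."""
    return row["rss_readnow"] - row["rss_no_debug"]


def percent_of(value, baseline_value):
    """Express value as a whole-number percentage of baseline_value."""
    return round(100 * value / baseline_value)


def summarize(measurements=MEASUREMENTS, baseline=BASELINE):
    """Return the derived rows (difference and percentages) per configuration."""
    base = measurements[baseline]
    summary = {}
    for name, row in measurements.items():
        diff = rss_difference(row)
        summary[name] = {
            "rss_difference_kb": diff,
            "debug_size_pct": percent_of(row["debug_size"], base["debug_size"]),
            "rss_difference_pct": percent_of(diff, rss_difference(base)),
        }
    return summary


if __name__ == "__main__":
    for name, derived in summarize().items():
        print(f"{name:22} diff={derived['rss_difference_kb']} KB "
              f"size={derived['debug_size_pct']}% "
              f"rss={derived['rss_difference_pct']}%")
```

Keeping the raw numbers separate from the derived ones makes it easy to add
a further configuration later without recomputing anything by hand.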
This has IMO some implications:
(z) gcc/dwarf2out.c is a viable place to implement "Roland's" DWARF
optimizer.
Regards,
Jan