This is the mail archive of the archer@sourceware.org mailing list for the Archer project.



Inter-CU DWARF size optimizations and gcc -flto


Hi,

Apologies if this is already clear to everyone; I admit I only started
playing with it yesterday.

With
	gcc -flto -flto-partition=none

gcc outputs only a single CU (Compilation Unit).  With the default (i.e.
-flto-partition omitted) there are multiple CUs, but still only a few compared
to the number of .o files.

-flto is AFAIK the future for all compilations.  It is well known that -flto
debug info is somewhat broken right now, but that needs to be fixed anyway.

DWARF size has been under discussion for the 5+ years I have been in Tools, so
this is a long-term project, and waiting for (helping with, heh) a working
-flto is an acceptable solution.

This has some implications:

(a) A DWARF post-processing optimization tool no longer makes sense with -flto.

    (a1) Intra-CU optimizations in GCC make sense as it is the final output.

(b) .gdb_index will have limited scope, only to select which objfiles to expand,
    no longer to select which CUs to expand.

(c) The partial CU expansion Tom Tromey talks about is a must in such a case,
    even though the smaller LTO debug info needs only 63% of the GDB memory
    that the non-LTO (many-CUs) debug info needs.
    (GDB memory requirements are not directly proportional to the DWARF size.)

With -flto-partition=none, linking GDB took about 900MB of memory.  I am not
sure how the LTO memory requirements reported by Honza Hubicka (2.7GB for
Mozilla) relate to -flto-partition.  Still, a few GB of cheap memory for the
few build farm (Koji) hosts building Mozilla + LibreOffice should not be much
of a concern IMO.

FYI for gdb with Rawhide -O2-style CFLAGS (-gdwarf-4 -fno-debug-types-section):

-fno-debug-types-section:
                             |  non-LTO  |    LTO
stripped binary size (bytes) |   5023064 |   4985864
separate .debug size (bytes) |  19190280 |  12484312 = 65%
GDB RSS -readnow             | 160136 KB | 106252 KB
GDB RSS without .debug       |  14964 KB |  14972 KB
GDB RSS difference           | 145172 KB |  91280 KB = 63%
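
For anyone wanting to re-check the percentages, here is a minimal Python sketch
that recomputes them from the figures in the table.  The dictionary and
function names are only illustrative; nothing here is part of GDB or GCC.

```python
# Figures copied from the table above (sizes in bytes, RSS in KB).
NON_LTO = {"debug_size": 19190280, "rss_readnow": 160136, "rss_no_debug": 14964}
LTO = {"debug_size": 12484312, "rss_readnow": 106252, "rss_no_debug": 14972}


def rss_difference(row):
    """RSS attributable to debug info: -readnow RSS minus RSS without .debug."""
    return row["rss_readnow"] - row["rss_no_debug"]


def percent_of(value, baseline):
    """Express value as a rounded percentage of baseline."""
    return round(100 * value / baseline)


size_pct = percent_of(LTO["debug_size"], NON_LTO["debug_size"])
rss_pct = percent_of(rss_difference(LTO), rss_difference(NON_LTO))
print(f".debug size: LTO is {size_pct}% of non-LTO")
print(f"GDB RSS difference: LTO is {rss_pct}% of non-LTO")
```

The "GDB RSS difference" row is simply the -readnow row minus the
"without .debug" row, which is what rss_difference() computes.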

My idea was that this 65% (a 35% reduction) could be the magic ratio
achievable by the hypothetically optimal "Roland's" DWARF optimizer.  But
struct range_bounds, at least, is still defined there (including all its
fields) 49 times, so this is still far from optimal/"Roland's one".
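
A rough way to count such duplicates is to scan a textual DWARF dump.  The
Python sketch below is only an illustration: it assumes the line layout that
`readelf --debug-dump=info` usually prints, the helper name
count_struct_definitions is made up, and SAMPLE is a hand-written excerpt, not
real GDB debug info.  It is not how the 49x figure above was obtained.

```python
import re
from collections import Counter

# Assumed shape of `readelf --debug-dump=info` text: a DIE header line such as
#   <1><2d>: Abbrev Number: 2 (DW_TAG_structure_type)
# followed by attribute lines such as
#   <2e>   DW_AT_name        : (indirect string, offset: 0x10): range_bounds
DIE_HEADER = re.compile(r"<\d+><[0-9a-f]+>: Abbrev Number: \d+(?: \((DW_TAG_\w+)\))?")
NAME_ATTR = re.compile(r"DW_AT_name\s*:\s*(?:\(.*?\):\s*)?(\S+)")


def count_struct_definitions(dump_text):
    """Count non-declaration DW_TAG_structure_type DIEs per name."""
    counts = Counter()
    tag, name, is_declaration = None, None, False

    def flush():
        # Record the DIE collected so far if it is a named struct definition.
        if tag == "DW_TAG_structure_type" and name and not is_declaration:
            counts[name] += 1

    for line in dump_text.splitlines():
        header = DIE_HEADER.search(line)
        if header:
            flush()
            tag, name, is_declaration = header.group(1), None, False
        elif "DW_AT_declaration" in line:
            is_declaration = True
        elif name is None:
            match = NAME_ATTR.search(line)
            if match:
                name = match.group(1)
    flush()
    return counts


# Hand-written excerpt: two full definitions and one declaration-only DIE.
SAMPLE = """\
 <1><2d>: Abbrev Number: 2 (DW_TAG_structure_type)
    <2e>   DW_AT_name        : (indirect string, offset: 0x10): range_bounds
    <32>   DW_AT_byte_size   : 24
 <2><40>: Abbrev Number: 3 (DW_TAG_member)
    <41>   DW_AT_name        : low
 <2><4a>: Abbrev Number: 0
 <1><90>: Abbrev Number: 2 (DW_TAG_structure_type)
    <91>   DW_AT_name        : (indirect string, offset: 0x10): range_bounds
    <95>   DW_AT_byte_size   : 24
 <1><a0>: Abbrev Number: 4 (DW_TAG_structure_type)
    <a1>   DW_AT_name        : (indirect string, offset: 0x30): objfile
    <a5>   DW_AT_declaration : 1
"""
print(count_struct_definitions(SAMPLE))
```

On a real binary the dump text would come from running readelf on the separate
.debug file.  The sketch keeps only the first whitespace-free word of each
name, which is enough for struct tags but not for names containing spaces.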

Additionally with -fdebug-types-section:
                             |  non-LTO   |  non-LTO         | LTO
                             | (as above) |  .debug_types    | .debug_types
stripped binary size (bytes) |    5023064 |   5023064        |   4985864
separate .debug size (bytes) |   19190280 |  12789960 = 67%  |  12170080 = 63%
GDB RSS -readnow             |  160136 KB |  77524 KB        | 227876 KB
GDB RSS without .debug       |   14964 KB |  14968 KB        |  14964 KB
GDB RSS difference           |  145172 KB |  62556 KB = 43%  | 212912 KB = 147%
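
To make the point that GDB memory does not follow DWARF size more concrete,
the following Python sketch divides each "GDB RSS difference" by the matching
separate .debug size.  The numbers are copied from the tables; the helper name
is illustrative, and 1 KB is assumed to mean 1024 bytes.

```python
# (separate .debug size in bytes, GDB RSS difference in KB), from the tables.
CONFIGS = {
    "non-LTO": (19190280, 145172),
    "LTO": (12484312, 91280),
    "non-LTO .debug_types": (12789960, 62556),
    "LTO .debug_types": (12170080, 212912),
}


def rss_bytes_per_debug_byte(debug_size, rss_diff_kb):
    """GDB memory spent per byte of debug info (assumes 1 KB = 1024 bytes)."""
    return rss_diff_kb * 1024 / debug_size


ratios = {name: rss_bytes_per_debug_byte(*row) for name, row in CONFIGS.items()}
for name, ratio in ratios.items():
    print(f"{name:22s} {ratio:5.1f} bytes of RSS per .debug byte")
```

The LTO .debug_types build has the smallest .debug file of the four yet by far
the highest memory cost per byte, while non-LTO .debug_types has the lowest.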

IMO this has a further implication:

(z) gcc/dwarf2out.c is a viable place to implement "Roland's" DWARF
    optimizer.


Regards,
Jan

