Range lists, zero-length functions, linker gc

Fangrui Song maskray@google.com
Sun May 31 20:47:38 GMT 2020


On 2020-05-31, Mark Wielaard wrote:
>Hi,
>
>On Sun, May 31, 2020 at 11:55:06AM -0700, Fangrui Song via Elfutils-devel wrote:
>> what linkers should do regarding relocations referencing dropped
>> functions (due to section group rules, --gc-sections, /DISCARD/,
>> etc) in .debug_*
>>
>> As an example:
>>
>>   __attribute__((section(".text.x"))) void f1() { }
>>   __attribute__((section(".text.x"))) void f2() { }
>>   int main() { }
>>
>> Some .debug_* sections are relocated by R_X86_64_64 referencing
>> undefined symbols (the STT_SECTION symbols are collected):
>>
>>   0x00000043:   DW_TAG_subprogram [2]
>>                   ###### relocated by .text.x + 10
>>                   DW_AT_low_pc [DW_FORM_addr]     (0x0000000000000010 ".text.x")
>>                   DW_AT_high_pc [DW_FORM_data4]   (0x00000006)
>>                   DW_AT_frame_base [DW_FORM_exprloc]      (DW_OP_reg6 RBP)
>>                   DW_AT_linkage_name [DW_FORM_strp]       ( .debug_str[0x0000002c] = "_Z2f2v")
>>                   DW_AT_name [DW_FORM_strp]       ( .debug_str[0x00000033] = "f2")
>>
>>
>> With ld --gc-sections:
>>
>> * DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 +
>>   addend. This can cause overlapping address ranges with normal text
>>   sections. {{overlap}}
>> * [beginning address offset, ending address offset) in .debug_ranges
>>   are resolved to 1 (ignoring addend).  See bfd/reloc.c (behavior
>>   introduced in
>>   https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3
>>   )
>>
>>   [0, 0) cannot be used because it terminates the list entry.
>>   [-1, -1) cannot be used because -1 represents a base address
>>   selection entry which will affect subsequent address offset
>>   pairs.
>> * .debug_loc address offset pairs have similar problem to .debug_ranges
>> * In DWARF v5, the abnormal values can be in a separate section .debug_addr
>>
>> ---
>>
>> I am eager to know what you think
>> of the ideas from binutils/gdb/elfutils's perspective.
>
>I think this is a producer problem. If a (code) section can be totally
>dropped then the associated (.debug) sections should have been
>generated together with that (code) section in a COMDAT group. That
>way when the linker drops that section, all the associated sections in
>that COMDAT group will get dropped with it. If you don't do that, then
>the DWARF is malformed and there is not much a consumer can do about
>it.
>
>Said otherwise, I don't think it is correct for the linker (with
>--gc-sections) to drop any sections that have references to it
>(through relocation symbols) from other (.debug) sections.

I would love it if we could solve the problem using ELF features, but
putting DW_TAG_subprogram in the same section group is not an
unqualified win
(https://lists.llvm.org/pipermail/llvm-dev/2020-May/141926.html).
The cost per extra section is sizeof(Elf64_Shdr) = 64 bytes, an
Elf_Word for the entry in .group, plus a string in .strtab unless you
use the string ".debug_info" (reusing the string requires
https://sourceware.org/bugzilla/show_bug.cgi?id=25380).
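
As a rough sketch of that per-section cost: the struct below only mirrors
the Elf64_Shdr field layout so the example builds without <elf.h>, and the
helper name and the ".debug_info.f2" section name are hypothetical, chosen
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the field layout of Elf64_Shdr (assumption: no padding). */
struct shdr64 {
    uint32_t sh_name, sh_type;
    uint64_t sh_flags, sh_addr, sh_offset, sh_size;
    uint32_t sh_link, sh_info;
    uint64_t sh_addralign, sh_entsize;
};
static_assert(sizeof(struct shdr64) == 64, "Elf64_Shdr is 64 bytes");

/* Hypothetical helper: bytes of ELF metadata added by one more section
 * placed in a group. Pass name == NULL when the section name string is
 * shared with an existing one (e.g. every piece is named ".debug_info"). */
size_t extra_section_overhead(const char *name)
{
    size_t bytes = sizeof(struct shdr64);   /* section header */
    bytes += sizeof(uint32_t);              /* Elf_Word entry in .group */
    if (name != NULL)
        bytes += strlen(name) + 1;          /* NUL-terminated name string */
    return bytes;
}
```

With a shared name this gives 68 bytes per fragment; a unique name such
as ".debug_info.f2" adds its length plus the terminating NUL on top. This
counts only ELF bookkeeping, not any extra DWARF unit headers.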

According to Peter Smith in the thread
https://groups.google.com/forum/#!msg/generic-abi/A-1rbP8hFCA/EDA7Sf3KBwAJ ,
Arm Compiler 5 splits up DWARF v3 debugging information and puts these sections
into comdat groups:

"This approach did produce significantly more debug information than gcc
  did. For small microcontroller projects this wasn't a problem. For
  larger feature phone problems we had to put a lot of work into keeping
  the linker's memory usage down as many of our customers at the time were
  using 32-bit Windows machines with a default maximum virtual memory of 2Gb."

See Ben, Ali and others' comments in the thread. Fragmented .debug_* may
not be practical.
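
To make the .debug_ranges constraints quoted above concrete, here is a
minimal sketch of how a consumer might classify DWARF v4 range list
entries on a 64-bit target. The enum and function names are made up for
this example; they are not taken from binutils, gdb, or elfutils.

```c
#include <stddef.h>
#include <stdint.h>

enum range_kind {
    RANGE_END_OF_LIST,   /* (0, 0) terminates the list */
    RANGE_BASE_SELECT,   /* begin == all ones: second value is a new base */
    RANGE_EMPTY,         /* begin == end: covers no addresses */
    RANGE_NORMAL
};

/* Classify one (begin, end) offset pair from .debug_ranges. */
enum range_kind classify_range_entry(uint64_t begin, uint64_t end)
{
    if (begin == 0 && end == 0)
        return RANGE_END_OF_LIST;
    if (begin == UINT64_MAX)
        return RANGE_BASE_SELECT;
    if (begin == end)
        return RANGE_EMPTY;
    return RANGE_NORMAL;
}

/* Count the non-empty ranges seen before the list terminator.
 * pairs holds npairs consecutive (begin, end) pairs. */
size_t count_live_ranges(const uint64_t *pairs, size_t npairs)
{
    size_t live = 0;
    for (size_t i = 0; i < npairs; i++) {
        enum range_kind k =
            classify_range_entry(pairs[2 * i], pairs[2 * i + 1]);
        if (k == RANGE_END_OF_LIST)
            break;
        if (k == RANGE_NORMAL)
            live++;
    }
    return live;
}
```

If the pair for a dropped function is rewritten to (1, 1), it is
classified as empty and the ranges after it are still counted. If it
were rewritten to (0, 0), the walk would stop there and later ranges
in the same list would be lost.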

