Bug 31955 - gas: Extend .loc directive to emit a label
Status: NEW
Alias: None
Product: binutils
Classification: Unclassified
Component: gas
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-07-04 19:29 UTC by Fangrui Song
Modified: 2024-07-20 00:11 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Description Fangrui Song 2024-07-04 19:29:24 UTC
I have noticed that engineers at Meta Platforms have proposed extending the .loc directive: https://discourse.llvm.org/t/rfc-extending-llvm-mc-loc-directive-with-labeling-support/79608

For your convenience, the gas documentation is at https://sourceware.org/binutils/docs/as/Loc.html

> The .loc directive will add a row to the .debug_line line number matrix corresponding to the immediately following assembly instruction.

Here is my summary of their proposal:

Clang will add a new debug mode to emit a DW_AT_LLVM_stmt_sequence attribute to each DW_TAG_subprogram DIE, referencing the start of a line number program sequence associated with the subprogram.

```
main:
  .loc 0 1 13  debug_line_label .Lmain_line_entries
  ...

.section .debug_info,"",@progbits
  ...
  .byte	14                                           # Abbrev [14] DW_TAG_subprogram
  ...
  .long	.Lmain_line_entries - .lline_table_start0    # DW_AT_LLVM_stmt_sequence

.section        .debug_line,"",@progbits # generated
# Conceptually, the .Lmain_line_entries label is emitted at start of a line number program sequence associated with `main`
```

Advantages:

*Faster symbolization*

Traditional address symbolization involves locating the DW_TAG_compile_unit DIE and parsing the line number program from the DW_AT_stmt_list offset.
This process requires skipping unrelated DW_TAG_subprogram DIEs. The DW_AT_LLVM_stmt_sequence attribute directly points to the relevant line number program sequence, eliminating unnecessary steps.
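
The difference between the two lookup paths can be sketched with a simplified model. This is not how any real symbolizer is implemented: the `Sequence` class, the offsets, and the address/line rows are hypothetical, and real line number programs are encoded state machines rather than Python lists.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Sequence:
    offset: int  # hypothetical byte offset of the sequence within .debug_line
    rows: list   # sorted (address, line) pairs


def lookup_in_sequence(seq, addr):
    """Return the line of the last row whose address is <= addr, or None."""
    i = bisect_right([a for a, _ in seq.rows], addr) - 1
    return seq.rows[i][1] if i >= 0 else None


def symbolize_via_stmt_list(sequences, addr):
    """Traditional path: walk the CU's sequences in order until one covers addr."""
    visited = 0
    for seq in sequences:
        visited += 1
        if seq.rows[0][0] <= addr <= seq.rows[-1][0]:
            return lookup_in_sequence(seq, addr), visited
    return None, visited


def symbolize_via_stmt_sequence(by_offset, stmt_sequence, addr):
    """Proposed path: go straight to the sequence the subprogram DIE references."""
    return lookup_in_sequence(by_offset[stmt_sequence], addr), 1


# Hypothetical line table with one sequence per function.
sequences = [
    Sequence(0x40, [(0x1000, 1), (0x1008, 2), (0x1010, 3)]),     # foo
    Sequence(0x60, [(0x2000, 10), (0x2004, 11), (0x2010, 12)]),  # bar
    Sequence(0x80, [(0x3000, 20), (0x3008, 21), (0x3010, 22)]),  # main
]
by_offset = {s.offset: s for s in sequences}
```

In this model, both paths return the same line for an address inside `main`, but the traditional path visits every preceding sequence first, while the direct path visits only one.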

*Improved ICF disambiguation*

Identical Code Folding (ICF) can make two line number program sequences (associated with folded subprograms) indistinguishable.
While DW_AT_LLVM_stmt_sequence doesn't resolve this, it identifies the associated function. If the caller is known, this additional information could help disambiguate the correct sequence.
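
A rough sketch of that idea follows. It assumes both folded functions keep their own DIE and their own sequence covering the same address range. The function names, offsets, and line numbers are made up for illustration.

```python
# Hypothetical data: foo and bar were folded by ICF to the same address range.
# Each DW_TAG_subprogram keeps its own DW_AT_LLVM_stmt_sequence offset.
subprograms = {
    "foo": {"low_pc": 0x1000, "stmt_sequence": 0x40},
    "bar": {"low_pc": 0x1000, "stmt_sequence": 0x60},
}
sequences = {
    0x40: [(0x1000, 5), (0x1004, 6)],    # rows built from foo's source lines
    0x60: [(0x1000, 42), (0x1004, 43)],  # rows built from bar's source lines
}


def line_in(rows, addr):
    """Return the line of the last row whose address is <= addr, or None."""
    best = None
    for row_addr, line in rows:
        if row_addr <= addr:
            best = line
    return best


def candidate_lines(addr):
    """With the address alone, every folded function yields a plausible line."""
    return {
        name: line_in(sequences[sp["stmt_sequence"]], addr)
        for name, sp in subprograms.items()
    }


def line_for_known_callee(name, addr):
    """If the callee is known (e.g. from the caller), pick that function's sequence."""
    return line_in(sequences[subprograms[name]["stmt_sequence"]], addr)
```

Here `candidate_lines(0x1004)` stays ambiguous between the two functions, while `line_for_known_callee("bar", 0x1004)` selects a single row. How the callee is determined is outside the scope of this sketch.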

---

ELF/COFF -fno-function-sections and Mach-O .subsections_via_symbols allow consecutive functions to share the same line number program sequence.
To utilize DW_AT_LLVM_stmt_sequence better, the sequences should be split to resemble ELF/COFF -ffunction-sections.
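
As a conceptual illustration only, splitting a shared sequence at function boundaries might look like the sketch below. The `(address, line, function)` row format is an assumption made for the example. In practice the assembler decides where a sequence ends.

```python
def split_by_function(rows):
    """Split one shared sequence into one sequence per function.

    rows: (address, line, function) tuples sorted by address (hypothetical format).
    Returns a list of (function, [(address, line), ...]) in address order.
    """
    sequences = []
    for addr, line, func in rows:
        # Start a new sequence whenever the owning function changes.
        if not sequences or sequences[-1][0] != func:
            sequences.append((func, []))
        sequences[-1][1].append((addr, line))
    return sequences


# One sequence shared by two consecutive functions, as with -fno-function-sections.
shared = [
    (0x1000, 1, "foo"),
    (0x1008, 2, "foo"),
    (0x1010, 10, "bar"),
    (0x1018, 11, "bar"),
]
per_function = split_by_function(shared)
```

After the split, each function has its own sequence start, which is the location a per-subprogram DW_AT_LLVM_stmt_sequence value would need to reference.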

Mach-O has no -ffunction-sections/-fno-function-sections distinction and normally needs very few relocations for .debug_line. The new mode will introduce more relocations, similar to ELF/COFF -ffunction-sections.
(On ELF, the relocation overhead can be addressed by adopting CREL https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf )
Comment 1 Fangrui Song 2024-07-08 17:23:55 UTC
Perhaps call this .loc_label <label>, since .cfi_label <label> (2015) already exists.