[PATCH v2 2/5] Restrict use of minsym names when printing addresses in disassembled code

Fri Jul 19 17:07:00 GMT 2019

Below, more of a brain dump than an objection.  FWIW.

On 7/4/19 5:55 AM, Kevin Buettner wrote:
> 
> (gdb) x/5i foo_cold
>    0x401128 <foo_cold>:	push   %rbp
>    0x401129 <foo+35>:	mov    %rsp,%rbp
>    0x40112c <foo+38>:	callq  0x401134 <baz>
>    0x401131 <foo+43>:	nop
>    0x401132 <foo+44>:	pop    %rbp

I admit that it took me a while to reply because I'm still finding
it a bit hard to convince myself that that is the ideal output.
E.g., the first two instructions are obviously part of the same
prologue, I find it a bit odd that the disassembly shows
different function names for that instruction pair.

Maybe this won't matter in practice since IIUC your testcase is a
bit contrived, and real cold functions are just fragments of
functions jmp/ret'ed to/from, without a prologue.

BTW, I noticed (with your series applied), this divergence:

 (gdb) x /5i foo_cold
    0x40048e <foo_cold>: push   %rbp
    0x40048f <foo-18>:   mov    %rsp,%rbp
    0x400492 <foo-15>:   callq  0x400487 <baz>
    0x400497 <foo-10>:   nop
    0x400498 <foo-9>:    pop    %rbp

vs:

 (gdb) info symbol 0x40048f
 foo_cold + 1 in section .text
 (gdb) info symbol 0x400492
 foo_cold + 4 in section .text
 (gdb) info symbol 
 Argument required (address).
 (gdb) info symbol 0x400497
 foo_cold + 9 in section .text

That's of course because "info symbol" only looks at minimal symbols.

On the disassemble side, I think I'd be less confused with:

(gdb) disassemble foo
Dump of assembler code for function foo:
Address range 0x4004a1 to 0x4004bc:
   0x00000000004004a1 <foo+0>:     push   %rbp
   0x00000000004004a2 <foo+1>:     mov    %rsp,%rbp
   0x00000000004004a5 <foo+4>:     callq  0x40049a <bar>
   0x00000000004004aa <foo+9>:     mov    0x200b70(%rip),%eax        # 0x601020 <e>
   0x00000000004004b0 <foo+15>:    test   %eax,%eax
   0x00000000004004b2 <foo+17>:    je     0x4004b9 <foo+24>
   0x00000000004004b4 <foo+19>:    callq  0x40048e <foo_cold>
   0x00000000004004b9 <foo+24>:    nop
   0x00000000004004ba <foo+25>:    pop    %rbp
   0x00000000004004bb <foo+26>:    retq   
Address range 0x40048e to 0x40049a:
   0x000000000040048e <foo.cold+0>:    push   %rbp
   0x000000000040048f <foo.cold+1>:    mov    %rsp,%rbp
   0x0000000000400492 <foo.cold+4>:    callq  0x400487 <baz>
   0x0000000000400497 <foo.cold+9>:    nop
   0x0000000000400498 <foo.cold+10>:   pop    %rbp
   0x0000000000400499 <foo.cold+11>:   retq   
End of assembler dump.

Instead of:

(gdb) disassemble foo
Dump of assembler code for function foo:
Address range 0x4004a1 to 0x4004bc:
   0x00000000004004a1 <+0>:     push   %rbp
   0x00000000004004a2 <+1>:     mov    %rsp,%rbp
   0x00000000004004a5 <+4>:     callq  0x40049a <bar>
   0x00000000004004aa <+9>:     mov    0x200b70(%rip),%eax        # 0x601020 <e>
   0x00000000004004b0 <+15>:    test   %eax,%eax
   0x00000000004004b2 <+17>:    je     0x4004b9 <foo+24>
   0x00000000004004b4 <+19>:    callq  0x40048e <foo_cold>
   0x00000000004004b9 <+24>:    nop
   0x00000000004004ba <+25>:    pop    %rbp
   0x00000000004004bb <+26>:    retq   
Address range 0x40048e to 0x40049a:
   0x000000000040048e <-19>:    push   %rbp
   0x000000000040048f <-18>:    mov    %rsp,%rbp
   0x0000000000400492 <-15>:    callq  0x400487 <baz>
   0x0000000000400497 <-10>:    nop
   0x0000000000400498 <-9>:     pop    %rbp
   0x0000000000400499 <-8>:     retq   
End of assembler dump.

In your examples / testcase, the cold function is adjacent/near
the hot / main entry point of foo.  But I think that
on a real cold function, the offsets between the cold and
hot entry points can potentially be much larger (e.g., place in
different elf sections), as the point is exactly to
move cold code away from the hot path / cache.

That that would mean that we're likely to see output like:

(gdb) disassemble foo
Dump of assembler code for function foo:
Address range 0x6004a1 to 0x6004bc:
   0x00000000006004a1 <+0>:     push   %rbp
   0x00000000006004a2 <+1>:     mov    %rsp,%rbp
   0x00000000006004a5 <+4>:     callq  0x40049a <bar>
   0x00000000006004aa <+9>:     mov    0x200b70(%rip),%eax        # 0x601020 <e>
   0x00000000006004b0 <+15>:    test   %eax,%eax
   0x00000000006004b2 <+17>:    je     0x6004b9 <foo+24>
   0x00000000006004b4 <+19>:    callq  0x40048e <foo_cold>
   0x00000000006004b9 <+24>:    nop
   0x00000000006004ba <+25>:    pop    %rbp
   0x00000000006004bb <+26>:    retq   
Address range 0x40048e to 0x40049a:
   0x000000000040048e <-2097171>:    push   %rbp
   0x000000000040048f <-2097170>:    mov    %rsp,%rbp
   0x0000000000400492 <-2097167>:    callq  0x400487 <baz>
   0x0000000000400497 <-2097162>:    nop
   0x0000000000400498 <-2097161>:    pop    %rbp
   0x0000000000400499 <-2097160>:    retq   
End of assembler dump.

... and those negative offsets kind of look a bit odd.

But maybe it's just that I'm not thinking it right.  

If I think of the offset in terms of offset from the
"foo"'s entry point, which is what it really is, then I can
convince myself that I can explain why that's the right output,
pedantically.

So I guess I'll grow into it.

And with that out of the way, the series looks good to me as is.

Thanks,
Pedro Alves