[PATCH 3/4] Allow display of negative offsets in print_address_symbolic()

Kevin Buettner kevinb@redhat.com
Wed Jul 3 23:09:00 GMT 2019


On Fri, 21 Jun 2019 15:45:13 +0100
Pedro Alves <palves@redhat.com> wrote:

> On 6/8/19 8:54 PM, Kevin Buettner wrote:
> > When examining addresses associated with blocks with non-contiguous
> > address ranges, it's not uncommon to see large positive offsets which,
> > for some address width, actually represent a smaller negative offset.
> > Here's an example taken from the test case:
> > 
> >     (gdb) x/i foo_cold
> >        0x40110d <foo+4294967277>:	push   %rbp
> > 
> > This commit causes cases like the above to be displayed like this (below)
> > instead:
> > 
> >     (gdb) x/i foo_cold
> >        0x40110d <foo-19>:	push   %rbp
> > 
> > gdb/ChangeLog:
> > 
> > 	* printcmd.c (print_address_symbolic): Print negative offsets.
> > 	(build_address_symbolic): Force signed arithmetic when computing
> > 	offset.  
> 
> Seems reasonable to me, if we assume that the symbol name to put
> within <> is "foo".
> 
> This change makes makes me doubt that, though.  We're looking at
> the lower level, disassembly code.  I think I'd want to see
> 
>   0x40110d <foo_cold+0>:
> 
> there?
> 
> E.g., I might want to follow up with
> disassemble foo_cold.
> 
> But the present state of things, I wouldn't be able to see the
> foo_cold symbol, where it starts?
> 
> Maybe a larger disassemble output including several cold sections
> in view would help determine the best output.

I've been conducting some experiments with this patch...

diff --git a/gdb/printcmd.c b/gdb/printcmd.c
index 886e4464df..e6599493ff 100644
--- a/gdb/printcmd.c
+++ b/gdb/printcmd.c
@@ -629,9 +629,15 @@ build_address_symbolic (struct gdbarch *gdbarch,
 
   if (msymbol.minsym != NULL)
     {
+#if 1
       if (symbol == NULL
           || (BLOCK_CONTIGUOUS_P (SYMBOL_BLOCK_VALUE (symbol))
 	      && BMSYMBOL_VALUE_ADDRESS (msymbol) > name_location))
+#else
+      if (symbol == NULL
+          || !BLOCK_CONTIGUOUS_P (SYMBOL_BLOCK_VALUE (symbol))
+	  || BMSYMBOL_VALUE_ADDRESS (msymbol) > name_location)
+#endif
 	{
 	  /* If this is a function (i.e. a code address), strip out any
 	     non-address bits.  For instance, display a pointer to the

...applied on top of the other patches in this set.

As shown, GDB will prefer the symtab symbol over the minsym for
display of symbols in disassembled code.  When I change the #if 1
to #if 0, GDB will always prefer the minsym for functions with
non-contiguous ranges.  (Whatever it is that we do, I want the
"lo-cold" and "hi-cold" cases to behave the same.)

Then, using the "lo-cold" executable for the dw2-ranges-func test,
I've been doing the following:

./gdb testsuite/outputs/gdb.dwarf2/dw2-ranges-func/dw2-ranges-func-lo-cold
b 70
b baz
run
set var e=1
c
bt
up
x/5i $pc
disassemble foo
x/i foo_cold
disassemble foo_cold

I don't see anything interesting until we get to the "bt" command,  For "bt", we
see (as expected) the same output for both cases:

(gdb) bt
#0  0x000000000040110a in baz ()
    at worktree-ranges/gdb/testsuite/gdb.dwarf2/dw2-ranges-func-lo-cold.c:48
#1  0x0000000000401116 in foo ()
    at worktree-ranges/gdb/testsuite/gdb.dwarf2/dw2-ranges-func-lo-cold.c:54
#2  0x0000000000401138 in foo ()
    at worktree-ranges/gdb/testsuite/gdb.dwarf2/dw2-ranges-func-lo-cold.c:70
#3  0x0000000000401144 in main ()
    at worktree-ranges/gdb/testsuite/gdb.dwarf2/dw2-ranges-func-lo-cold.c:78

(I've shortened the paths for readability.)

The thing to note here is that the call of foo at frame #1 is actually a
call to foo_cold.  Showing foo_cold in the backtrace is the behavior that
Eli found objectionable.

Likewise, "up" shows the same behavior for both cases:

(gdb) up
#1  0x0000000000401116 in foo ()
    at worktree-ranges/gdb/testsuite/gdb.dwarf2/dw2-ranges-func-lo-cold.c:54
54	  baz ();					/* foo_cold baz call */

"x/5i" shows some differences in output.  I'll show the "prefer symtab
sym" version first, followed by the "prefer minsym" version second:

(gdb) x/5i $pc
=> 0x401116 <foo-10>:	nop
   0x401117 <foo-9>:	pop    %rbp
   0x401118 <foo-8>:	retq   
   0x401119 <bar>:	push   %rbp
   0x40111a <bar+1>:	mov    %rsp,%rbp

   --- versus ---

(gdb) x/5i $pc
=> 0x401116 <foo_cold+9>:	nop
   0x401117 <foo_cold+10>:	pop    %rbp
   0x401118 <foo_cold+11>:	retq   
   0x401119 <bar>:	push   %rbp
   0x40111a <bar+1>:	mov    %rsp,%rbp

The thing to observe in the above output is that offsets from foo are
used in the first case, where as offsets from foo_cold are shown for
the "prefer minsym" version.

Both versions show similar output for the "disassemble foo" command.
Here is the output for the "prefer minsym" version:

(gdb) disassemble foo
Dump of assembler code for function foo:
Address range 0x401120 to 0x40113b:
   0x0000000000401120 <+0>:	push   %rbp
   0x0000000000401121 <+1>:	mov    %rsp,%rbp
   0x0000000000401124 <+4>:	callq  0x401119 <bar>
   0x0000000000401129 <+9>:	mov    0x2ef1(%rip),%eax        # 0x404020 <e>
   0x000000000040112f <+15>:	test   %eax,%eax
   0x0000000000401131 <+17>:	je     0x401138 <foo+24>
   0x0000000000401133 <+19>:	callq  0x40110d <foo_cold>
   0x0000000000401138 <+24>:	nop
   0x0000000000401139 <+25>:	pop    %rbp
   0x000000000040113a <+26>:	retq   
Address range 0x40110d to 0x401119:
   0x000000000040110d <+0>:	push   %rbp
   0x000000000040110e <+1>:	mov    %rsp,%rbp
   0x0000000000401111 <+4>:	callq  0x401106 <baz>
=> 0x0000000000401116 <+9>:	nop
   0x0000000000401117 <+10>:	pop    %rbp
   0x0000000000401118 <+11>:	retq   
End of assembler dump.

The only line where there's a difference is:

   0x0000000000401133 <+19>:	callq  0x40110d <foo-19>

   --- versus ---

   0x0000000000401133 <+19>:	callq  0x40110d <foo_cold>

I think I prefer the negative offset in this case.

"x/i foo_cold" produces different outputs...

(gdb) x/i foo_cold
   0x40110d <foo-19>:	push   %rbp

   --- versus ---

(gdb) x/i foo_cold
   0x40110d <foo_cold>:	push   %rbp

The version that prefers the symtab sym shows foo versus foo_cold for
the version that prefers the minsym sym.

"disassemble foo_cold" shows the same output as "disassemble foo"
above.  I won't show it here since it's the same as what's shown
earlier.  I was sort of surprised that it showed the entire function
(both) ranges, but after thinking about it, this made sense since you
see the entire function when you disassemble some address that's in
the middle of the function.

My thoughts...

When I say "x/i foo_cold", I do think I'd prefer to see <foo_cold> instead
of <foo-19>.

However, when I do "x/5i $pc" after doing "up" from the baz frame, I think
I somewhat prefer seeing foo with negative offsets.

What would you think about this behavior?

(gdb) x/5i foo_cold
   0x40110d <foo_cold>:	push   %rbp
   0x40110e <foo-18>:	mov    %rsp,%rbp
   0x401111 <foo-15>:	callq  0x401106 <baz>
=> 0x401116 <foo-10>:	nop
   0x401117 <foo-9>:	pop    %rbp

I.e. prefer the minsym for offset 0, but use the function symbol for
the non-zero offsets.

Another possibility:

(gdb) x/5i foo_cold
   0x40110d <foo-19> <foo_cold>: push   %rbp
   0x40110e <foo-18>:	mov    %rsp,%rbp
   0x401111 <foo-15>:	callq  0x401106 <baz>
=> 0x401116 <foo-10>:	nop
   0x401117 <foo-9>:	pop    %rbp

I.e, show both the function symbol (plus/minus offset) AND the minsym,
but only show the minsym for the zero offset.

I haven't tried implementing either of these approaches yet, but
I can take a look at it if we have some concensus over what the output
should look like.

Kevin



More information about the Gdb-patches mailing list