Bug 28371 - gdb: watchpoint for local variable doesn't work on RISC-V
Summary: gdb: watchpoint for local variable doesn't work on RISC-V
Status: RESOLVED WONTFIX
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb (show other bugs)
Version: unknown
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-09-22 07:00 UTC by shivam98.tkg@gmail.com
Modified: 2021-10-26 10:18 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed: 2021-10-01 00:00:00
Project(s) to access:
ssh public key:


Attachments
test program (548 bytes, text/x-csrc)
2021-09-22 07:00 UTC, shivam98.tkg@gmail.com
Details
RISCV clang assembly (2.31 KB, text/plain)
2021-09-30 09:56 UTC, shivam98.tkg@gmail.com
Details
RISCV GCC Assembly (1.27 KB, text/plain)
2021-09-30 09:57 UTC, shivam98.tkg@gmail.com
Details
clang patch to generate cfi directive in epilogue for riscv backend (1.06 KB, patch)
2021-09-30 09:59 UTC, shivam98.tkg@gmail.com
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description shivam98.tkg@gmail.com 2021-09-22 07:00:12 UTC
Created attachment 13674 [details]
test program

Please consider the following program :-

  1 /* This testcase is part of GDB, the GNU debugger.
  2 
  3    Copyright 2010-2021 Free Software Foundation, Inc.
  4 
  5    This program is free software; you can redistribute it and/or modify
  6    it under the terms of the GNU General Public License as published by
  7    the Free Software Foundation; either version 3 of the License, or
  8    (at your option) any later version.
  9 
 10    This program is distributed in the hope that it will be useful,
 11    but WITHOUT ANY WARRANTY; without even the implied warranty of
 12    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 13    GNU General Public License for more details.
 14 
 15    You should have received a copy of the GNU General Public License
 16    along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
 17 
 18 int global = 0;
 19 int global2 = 0;
 20 
 21 int func(int *foo)
 22 {
 23   (*foo)++;
 24   global++;
 25   global2++;
 26   return 0;
 27 }
 28 
 29 void func2(int *foo)
 30 {
 31   global2++;
 32 }
 33 
 34 int main()
 35 {
 36   int q = 0;
 37 
 38   func2 (&q);
 39   global2++;
 40 
 41   while (1)
 42     {
 43       func(&q);
 44     }
 45 
 46   return 0;
 47 }

Steps to produce the issue :- 

clang -g -static watch-cond.c -o watch

(clang is cross compile by following these instruction :- https://github.com/sifive/riscv-llvm#how-can-i-build-llvmclang-and-use-it-to-cross-compile-for-a-riscv-target)

qemu-riscv64 -g 1234 watch-cond
(start the server)

riscv64-unknown-elf-gdb watch-cond
(start client)

(gdb) target remote :1234
Remote debugging using :1234
0x00000000000100f8 in _start ()
(gdb) set can-use-hw-watchpoints 0
(gdb) b 36
Breakpoint 1 at 0x101e0: file watch-cond.c, line 36.
(gdb) n
Single stepping until exit from function _start,
which has no line number information.

Breakpoint 1, main () at watch-cond.c:36
36	  int q = 0;
(gdb) watch q
Watchpoint 2: q
(gdb) c
Continuing.

Watchpoint 2 deleted because the program has left the block in
which its expression is valid.
0x00000000000101cc in func2 (foo=0x0) at watch-cond.c:32
32	}
(gdb) disass main
Dump of assembler code for function main:
   0x00000000000101a0 <+0>:	addi	sp,sp,-32
   0x00000000000101a2 <+2>:	sd	ra,24(sp)
   0x00000000000101a4 <+4>:	sd	s0,16(sp)
   0x00000000000101a6 <+6>:	addi	s0,sp,32
   0x00000000000101a8 <+8>:	li	a0,0
   0x00000000000101aa <+10>:	sw	a0,-20(s0)
   0x00000000000101ae <+14>:	sw	a0,-24(s0)
   0x00000000000101b2 <+18>:	j	0x101b6 <main+22>
   0x00000000000101b6 <+22>:	lw	a1,-24(s0)
   0x00000000000101ba <+26>:	li	a0,4
   0x00000000000101bc <+28>:	blt	a0,a1,0x101d0 <main+48>
   0x00000000000101c0 <+32>:	j	0x101c4 <main+36>
   0x00000000000101c4 <+36>:	addi	a0,s0,-24
   0x00000000000101c8 <+40>:	jal	ra,0x10180 <func>
   0x00000000000101cc <+44>:	j	0x101b6 <main+22>
   0x00000000000101d0 <+48>:	li	a0,0
   0x00000000000101d2 <+50>:	ld	s0,16(sp)
   0x00000000000101d4 <+52>:	ld	ra,24(sp)
   0x00000000000101d6 <+54>:	addi	sp,sp,32
   0x00000000000101d8 <+56>:	ret
End of assembler dump.


(gdb)  p &q
$1 = (int *) 0x40008001d8

(gdb) info frame
Stack level 0, frame at 0x40008001f0:
 pc = 0x101e0 in main (watch-cond.c:36); saved pc = 0x10134
 source language c.
 Arglist at 0x40008001f0, args: 
 Locals at 0x40008001f0, Previous frame's sp is 0x40008001f0
 Saved registers:
  ra at 0x40008001e8, fp at 0x40008001e0, pc at 0x40008001e8Could not fetch register "ustatus"; remote failure reply 'E14'


objdump --disassemble watch-cond

00000000000101d2 <main>:
   101d2: 01 11        	addi	sp, sp, -32
   101d4: 06 ec        	sd	ra, 24(sp)
   101d6: 22 e8        	sd	s0, 16(sp)
   101d8: 00 10        	addi	s0, sp, 32
   101da: 01 45        	mv	a0, zero
   101dc: 23 26 a4 fe  	sw	a0, -20(s0)
   101e0: 23 24 a4 fe  	sw	a0, -24(s0)
   101e4: 13 05 84 fe  	addi	a0, s0, -24
   101e8: ef f0 df fc  	jal	0x101b4 <func2>
   101ec: 03 a5 41 f6  	lw	a0, -156(gp)
   101f0: 05 25        	addiw	a0, a0, 1
   101f2: 23 a2 a1 f6  	sw	a0, -156(gp)
   101f6: 6f 00 40 00  	j	0x101fa <main+0x28>
   101fa: 13 05 84 fe  	addi	a0, s0, -24
   101fe: ef f0 3f f8  	jal	0x10180 <func>
   10202: 6f f0 9f ff  	j	0x101fa <main+0x28>


I tested same the program debugging, compiled with gcc compiler, working fine.
Comment 1 shivam98.tkg@gmail.com 2021-09-23 04:03:17 UTC
When I compile the same testcase with gcc crosscompile for RISCV. Local variable watchpoint works so seems this is clang issue. 

After thinking for some time it seems scoping of local variable is not correct in 
clang/llvm riscv backend.
 
I would close issue later today if it fine/ or there will be no comment on this.
Comment 2 Jim Wilson 2021-09-24 22:51:36 UTC
I can reproduce natively on an Unmatched with the SiFive OpenEmbedded distro.  Looking at the gcc and clang output.  I see that the epilogue for func2 in the gcc output is

	ld	s0,24(sp)
	.cfi_restore 8
	.cfi_def_cfa 2, 32
	addi	sp,sp,32
	.cfi_def_cfa_offset 0
	jr	ra
	.cfi_endproc

And the epilogue for func2 in the clang output is

	ld	s0, 16(sp)                      # 8-byte Folded Reload
	ld	ra, 24(sp)                      # 8-byte Folded Reload
	addi	sp, sp, 32
	ret
.Ltmp3:
.Lfunc_end1:
	.size	func2, .Lfunc_end1-func2
	.cfi_endproc

So the clang output has no unwind info for epilogues.  I believe that is a known problem with clang.

Somehow this lack of epilogue unwind info is confusing gdb, and it thinks that the variable q has gone out of scope when it sees the frame pointer popped from the stack without corresponding unwind info.  This makes some sense, since gdb has to parse the frame to find the local q in the parent function, and if the unwind info is wrong it won't be able to do that.

I don't know if this can be fixed in gdb.  It can certainly be fixed in clang by emitting unwind info for function epilogues.

Also note that if you put a breakpoint in the func2 epilogue and type the "up" command, this works with the gcc output, taking you to main, but fails in the clang output, taking you to __libc_start_call_main which is a glibc function.  Epilogue unwind info is needed for this to work right also.

This whole mess is probably complicated by the fact that we don't have hardware watchpoints (except with openocd), so we have to single step through code and check for a change to q after every instruction.  If we had hardware watchpoints, the watchpoint for q would work correctly.  Without hardware watchpoints, we need function epilogue unwind info to be correct.
Comment 3 shivam98.tkg@gmail.com 2021-09-26 18:17:08 UTC
Many thanks for replying. I really forget the I should see the assembly generated by -S flag.

Yeah, so correct It seems the problem was, RISCV target didn't implement the support for adding CFI (canonical frame information) instructions in the epilogue. Specifically, say the following instruction/directive is missing in func and func2 functions - `.cfi_def_cfa %rsp, 8 `which is present in clang x86 assembly.  The support to inserts CFI instructions that set appropriate cfa offset(.cfi_def_cfa_offset) and cfa register(.cfi_def_cfa_register) should be added in emitEpilogue() in RISCVFrameLowering- .https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp#L535 . I will see how other backends are doing this and what should be done in the RISCV case and try to fix it. 

Thanks again. It saves a lot of effort for me.
Comment 4 shivam98.tkg@gmail.com 2021-09-26 18:39:50 UTC
sorry I miswrote  `.cfi_def_cfa_register %rbp` is missing in clang riscv assembly compare to clang x86 assembly not `.cfi_def_cfa %rsp, 8`.

And compare to gcc riscv assembly `.cfi_def_cfa_offset 0` is missing in clang riscv assembly.
Comment 5 Jim Wilson 2021-09-29 21:28:51 UTC
This was also filed as an LLVM bug
https://bugs.llvm.org/show_bug.cgi?id=51864

Can be fixed either by adding hardware watchpoint support to RISC-V GDB or by adding function epilog CFI info to the RISC-V LLVM port.  The latter one is easier.
Comment 6 Jessica Clarke 2021-09-29 21:42:31 UTC
Adding correct CFI is not as easy as you think. We used to emit such info for RISC-V but removed it as it made things worse and broke stack unwinding; see https://reviews.llvm.org/D69723.
Comment 7 Jessica Clarke 2021-09-29 21:42:58 UTC
(But, of course, patches welcome to implement it completely...)
Comment 8 shivam98.tkg@gmail.com 2021-09-30 09:56:37 UTC
Created attachment 13685 [details]
RISCV clang assembly
Comment 9 shivam98.tkg@gmail.com 2021-09-30 09:57:48 UTC
Created attachment 13686 [details]
RISCV GCC Assembly
Comment 10 shivam98.tkg@gmail.com 2021-09-30 09:59:10 UTC
Created attachment 13687 [details]
clang patch to generate cfi directive in epilogue for riscv backend
Comment 11 shivam98.tkg@gmail.com 2021-09-30 10:02:10 UTC
OK, applying the diff that reverts that D69723 patch didn't work. clang does emit function epilog CFI info. but debugging didn't work as expected.

clang new riscv assembly for function func2 :-

        .type   func2,@function
 59 func2:                                  # @func2
 60 .Lfunc_begin1:
 61         .loc    1 30 0                          # watch-cond.c:30:0
 62         .cfi_startproc
 63 # %bb.0:
 64         addi    sp, sp, -32
 65         .cfi_def_cfa_offset 32
 66         sd      ra, 24(sp)                      # 8-byte Folded Spill
 67         sd      s0, 16(sp)                      # 8-byte Folded Spill
 68         .cfi_offset ra, -8
 69         .cfi_offset s0, -16
 70         addi    s0, sp, 32
 71         .cfi_def_cfa s0, 0
 72         sd      a0, -24(s0)
 73 .Ltmp2:
 74         .loc    1 31 10 prologue_end            # watch-cond.c:31:10
 75         lui     a1, %hi(global2)
 76         lw      a0, %lo(global2)(a1)
 77         addiw   a0, a0, 1
 78         sw      a0, %lo(global2)(a1)
 79         .loc    1 32 1                          # watch-cond.c:32:1
 80         ld      s0, 16(sp)                      # 8-byte Folded Reload
 81         .cfi_def_cfa sp, -32
 82         ld      ra, 24(sp)                      # 8-byte Folded Reload
 83         .cfi_restore ra
 84         .cfi_restore s0
 85         addi    sp, sp, 32
 86         .cfi_def_cfa_offset 0
 87         ret
 88 .Ltmp3:
 89 .Lfunc_end1:
 90         .size   func2, .Lfunc_end1-func2
 91         .cfi_endproc
 92                                         # -- End function
 93         .globl  main                            # -- Begin function main
 94         .p2align        1
 95         .type   main,@function


Any Idea @Jim Wilson what else gdb expecting? Sorry for troubling you and the forum.

I attached the clang-riscv, clang-x86, and gcc-riscv assembly and patch diff file that I applied on llvm-project to generate the new assembly.
Comment 12 Andreas Schwab 2021-09-30 10:35:56 UTC
Try stepping through the epilogue with stepi and examine the output of info frame to see whether it correctly represents the state of the assembly at each insn, and whether backtrace works correctly from each insn.
Comment 13 Jim Wilson 2021-10-01 19:48:41 UTC
This line
    .cfi_def_cfa sp, -32
is wrong.  It should instead be
    .cfi_def_cfa sp, 32
This is emitted after loading the frame pointer from the stack.  Before that load, the frame is fp (s0) relative at an offset of 0, because fp is sp+32.  After the frame pointer is destroyed, we need an sp based frame, which is sp and an offset of 32.

Compare with the gcc output which has
	ld	s0,24(sp)
	.cfi_restore 8
	.cfi_def_cfa 2, 32
and note that sp is x2 and s0/fp is x8.

And I would second Andreas's recommendation to test this by single stepping through the function epilogue and checking the frame.  You should be able to go up and print the value of q.

Also the ".cfi_restore s0" in the clang output doesn't come immediately after the load, but that looks harmless, as both the in register and the on stack values are the same until the function exits, so it shouldn't matter exactly where this line is as long as it comes before the "addi sp,sp,32" since the stack could be clobbered by a signal after that point.
Comment 14 shivam98.tkg@gmail.com 2021-10-11 10:11:28 UTC
My debugging issue is now solved by reverting that patch with minor modification( though I have not committed any change to llvm-project), thanks everyone. 

but I think these cfi directives increase the size of the binary. And it will be better to add support for hardware watchpoints in gdb however it may be difficult in comparison.

If noone from gdb community comes then I will try this.
Comment 15 shivam98.tkg@gmail.com 2021-10-26 10:18:45 UTC
I close it here now and might open a new issue to track hardware watchpoint support in gdb.