[PATCH] gdb: assert that we don't try to get a frame's id while it is computed

Simon Marchi simon.marchi@efficios.com
Sat Aug 22 15:19:35 GMT 2020


On 2020-08-22 6:05 a.m., Andrew Burgess wrote:
> * Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> [2020-08-20 16:11:48 -0400]:
> 
>> I'm dealing these days with a class of bugs that involve trying to get a
>> certain frame's id while we are in the process of computing it.  In other
>> words, compute_frame_id being called for a frame, eventually calling
>> get_frame_id for that same frame.  I don't think this is ever supposed to
>> happen, as that creates a cyclic dependency.
>>
>> Usually, these problems cause some failure down the line.  I'm proposing with
>> this patch to catch them as early as possible, as soon as the situation
>> described above happens.  I think that helps because the failed assertion will
>> be closer to the root of the problem.
>>
>> To do so, the patch changes the frame_info::this_id::p flag from a boolean (is
>> the frame id computed or not) to a tri-state:
>>
>> - the frame id is not computed
>> - the frame id is being computed
>> - the frame id is computed
>>
>> Then, we can properly assert that get_frame_id doesn't get called for a frame
>> whose id is being computed.
>>
>> To illustrate how that can help, let's imagine we apply the following change to
>> frame_unwind_got_optimized:
>>
>>     --- a/gdb/frame-unwind.c
>>     +++ b/gdb/frame-unwind.c
>>     @@ -260,8 +260,7 @@ frame_unwind_got_optimized (struct frame_info *frame, int regnum)
>>        mark_value_bytes_optimized_out (val, 0, TYPE_LENGTH (type));
>>        VALUE_LVAL (val) = lval_register;
>>        VALUE_REGNUM (val) = regnum;
>>     -  VALUE_NEXT_FRAME_ID (val)
>>     -    = get_frame_id (get_next_frame_sentinel_okay (frame));
>>     +  VALUE_NEXT_FRAME_ID (val) = get_frame_id (frame);
>>        return val;
>>      }
>>
>> ... and run the following command, which leads to a failed assertion (you need
>> to run the corresponding test to generate the binary first):
>>
>>     $ ./gdb -q -nx testsuite/outputs/gdb.dwarf2/dw2-undefined-ret-addr/dw2-undefined-ret-addr -ex "b stop_frame" -ex r
>>
>> Without this patch applied, we catch the issue indirectly, when the top-level
>> get_frame_id tries to stash the frame:
>>
>>     /home/smarchi/src/binutils-gdb/gdb/frame.c:593: internal-error: frame_id get_frame_id(frame_info*): Assertion `stashed' failed.
>>
>>     ...
>>     #9  0x0000000001af1c3a in internal_error (file=0x1cea060 "/home/smarchi/src/binutils-gdb/gdb/frame.c", line=593, fmt=0x1ce9f80 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:55
>>     #10 0x0000000000e9b413 in get_frame_id (fi=0x6210005105e0) at /home/smarchi/src/binutils-gdb/gdb/frame.c:593
>>     #11 0x0000000000e99e35 in scoped_restore_selected_frame::scoped_restore_selected_frame (this=0x7fff1d8b9760) at /home/smarchi/src/binutils-gdb/gdb/frame.c:308
>>     #12 0x000000000149a261 in print_frame_args (fp_opts=..., func=0x6210000dd7d0, frame=0x6210005105e0, num=-1, stream=0x60300008a580) at /home/smarchi/src/binutils-gdb/gdb/stack.c:750
>>     #13 0x000000000149d938 in print_frame (fp_opts=..., frame=0x6210005105e0, print_level=0, print_what=SRC_AND_LOC, print_args=1, sal=...) at /home/smarchi/src/binutils-gdb/gdb/stack.c:1394
>>     #14 0x000000000149c0c8 in print_frame_info (fp_opts=..., frame=0x6210005105e0, print_level=0, print_what=SRC_AND_LOC, print_args=1, set_current_sal=1) at /home/smarchi/src/binutils-gdb/gdb/stack.c:1119
>>     #15 0x0000000001498100 in print_stack_frame (frame=0x6210005105e0, print_level=0, print_what=SRC_AND_LOC, set_current_sal=1) at /home/smarchi/src/binutils-gdb/gdb/stack.c:366
>>     #16 0x00000000010234b7 in print_stop_location (ws=0x7fff1d8ba1f0) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:8366
>>     #17 0x000000000102362d in print_stop_event (uiout=0x607000018660, displays=true) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:8382
>>     ...
>>
>> It freaks out because the frame is already in the stash: it was added by an
>> inner call to get_frame_id, called indirectly by compute_frame_id.  Debugging
>> this failure is difficult because we have to backtrack to where this happened.
>>
>> With the patch applied, we catch the issue earlier, here:
>>
>>     /home/smarchi/src/binutils-gdb/gdb/frame.c:601: internal-error: frame_id get_frame_id(frame_info*): Assertion `fi->this_id.p != frame_id_status::COMPUTING' failed
>>
>>     ...
>>     #9  0x0000000001af22bc in internal_error (file=0x1cea6e0 "/home/smarchi/src/binutils-gdb/gdb/frame.c", line=601, fmt=0x1cea600 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:55
>>     #10 0x0000000000e9b7e3 in get_frame_id (fi=0x62100050dde0) at /home/smarchi/src/binutils-gdb/gdb/frame.c:601
>>     #11 0x0000000000e989b3 in frame_unwind_got_optimized (frame=0x62100050dde0, regnum=16) at /home/smarchi/src/binutils-gdb/gdb/frame-unwind.c:264
>>     #12 0x0000000000cbe386 in dwarf2_frame_prev_register (this_frame=0x62100050dde0, this_cache=0x62100050ddf8, regnum=16) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/frame.c:1267
>>     #13 0x0000000000e9f569 in frame_unwind_register_value (next_frame=0x62100050dde0, regnum=16) at /home/smarchi/src/binutils-gdb/gdb/frame.c:1266
>>     #14 0x0000000000e9eaab in frame_register_unwind (next_frame=0x62100050dde0, regnum=16, optimizedp=0x7ffca814ade0, unavailablep=0x7ffca814adf0, lvalp=0x7ffca814ae10, addrp=0x7ffca814ae20, realnump=0x7ffca814ae00, bufferp=0x7ffca814aec0 "") at /home/smarchi/src/binutils-gdb/gdb/frame.c:1169
>>     #15 0x0000000000e9f233 in frame_unwind_register (next_frame=0x62100050dde0, regnum=16, buf=0x7ffca814aec0 "") at /home/smarchi/src/binutils-gdb/gdb/frame.c:1225
>>     #16 0x0000000000f84262 in i386_unwind_pc (gdbarch=0x6210000eed10, next_frame=0x62100050dde0) at /home/smarchi/src/binutils-gdb/gdb/i386-tdep.c:1969
>>     #17 0x0000000000ec95dd in gdbarch_unwind_pc (gdbarch=0x6210000eed10, next_frame=0x62100050dde0) at /home/smarchi/src/binutils-gdb/gdb/gdbarch.c:3062
>>     #18 0x0000000000cb5e9d in dwarf2_tailcall_sniffer_first (this_frame=0x62100050dde0, tailcall_cachep=0x62100050dee0, entry_cfa_sp_offsetp=0x7ffca814b160) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/frame-tailcall.c:387
>>     #19 0x0000000000cbdd38 in dwarf2_frame_cache (this_frame=0x62100050dde0, this_cache=0x62100050ddf8) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/frame.c:1198
>>     #20 0x0000000000cbe026 in dwarf2_frame_this_id (this_frame=0x62100050dde0, this_cache=0x62100050ddf8, this_id=0x62100050de40) at /home/smarchi/src/binutils-gdb/gdb/dwarf2/frame.c:1226
>>     #21 0x0000000000e9b447 in compute_frame_id (fi=0x62100050dde0) at /home/smarchi/src/binutils-gdb/gdb/frame.c:580
>>     #22 0x0000000000e9b89e in get_frame_id (fi=0x62100050dde0) at /home/smarchi/src/binutils-gdb/gdb/frame.c:613
>>     #23 0x0000000000e99e35 in scoped_restore_selected_frame::scoped_restore_selected_frame (this=0x7ffca814b610) at /home/smarchi/src/binutils-gdb/gdb/frame.c:315
>>     #24 0x000000000149a8e3 in print_frame_args (fp_opts=..., func=0x6210000dd7d0, frame=0x62100050dde0, num=-1, stream=0x60300008a520) at /home/smarchi/src/binutils-gdb/gdb/stack.c:750
>>     #25 0x000000000149dfba in print_frame (fp_opts=..., frame=0x62100050dde0, print_level=0, print_what=SRC_AND_LOC, print_args=1, sal=...) at /home/smarchi/src/binutils-gdb/gdb/stack.c:1394
>>     #26 0x000000000149c74a in print_frame_info (fp_opts=..., frame=0x62100050dde0, print_level=0, print_what=SRC_AND_LOC, print_args=1, set_current_sal=1) at /home/smarchi/src/binutils-gdb/gdb/stack.c:1119
>>     #27 0x0000000001498782 in print_stack_frame (frame=0x62100050dde0, print_level=0, print_what=SRC_AND_LOC, set_current_sal=1) at /home/smarchi/src/binutils-gdb/gdb/stack.c:366
>>     #28 0x0000000001023b39 in print_stop_location (ws=0x7ffca814c0a0) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:8366
>>     #29 0x0000000001023caf in print_stop_event (uiout=0x607000018660, displays=true) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:8382
>>     ...
>>
>> Now, we can clearly see that get_frame_id for frame `fi=0x62100050dde0` gets
>> called while compute_frame_id is active for that frame.  The backtrace is more
>> helpful to identify the root of the problem.
>>
>> gdb/ChangeLog:
>>
>> 	* frame.c (enum class frame_id_status): New.
>> 	(struct frame_info) <this_id::p>: Change type to frame_id_status.
>> 	(fprintf_frame): Update.
>> 	(compute_frame_id): Set frame id status to "computing" on entry.
>> 	Set it back to "not_computed" on failure and to "computed" on
>> 	success.
>> 	(get_frame_id): Assert the frame id is not being computed.
>> 	(create_sentinel_frame): Use frame_id_status::COMPUTED.
>> 	(create_new_frame): Likewise.
>> 	(frame_cleanup_after_sniffer): Update assert.
> 
> LGTM.
> 
> Thanks,
> Andrew

Thanks both, I pushed the patch.

Simon


More information about the Gdb-patches mailing list