[PATCH v2 11/13] gdb: make user-created frames reinflatable
Simon Marchi
simon.marchi@polymtl.ca
Mon Jan 30 16:20:19 GMT 2023
On 1/30/23 03:49, Luis Machado wrote:
> On 1/25/23 03:45, Simon Marchi wrote:
>>
>>
>> On 1/24/23 03:22, Luis Machado wrote:
>>> On 1/24/23 03:55, Simon Marchi wrote:
>>>>
>>>>
>>>> On 1/23/23 09:34, Luis Machado wrote:
>>>>> On 1/23/23 12:57, Tom de Vries wrote:
>>>>>> On 12/14/22 04:34, Simon Marchi via Gdb-patches wrote:
>>>>>>> gdb/testsuite/gdb.base/frame-view.exp | 47 ++++++++++++--
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> on aarch64-linux I get:
>>>>>> ...
>>>>>> FAIL: gdb.base/frame-view.exp: with_pretty_printer=true: frame (GDB internal error)
>>>>>> FAIL: gdb.base/frame-view.exp: with_pretty_printer=true: frame again (GDB internal error)
>>>>>
>>>>>
>>>>> I can reproduce this as well. It also happens on arm-linux Ubuntu 22.04/20.04, with a similar kind of backtrace.
>>>>
>>>> I can reproduce too, I have a potential fix. I'll try to send it
>>>> tomorrow.
>>>
>>> Let me know if I can help with testing it on one of the targets.
>>
>> Ok, so I have a patch that fixes the test, but I'm not sure it's really
>> good.
>>
>> First, here's my analysis of what I think is happening.
>>
>> - When we create the user frame (the "select-frame view" command), we
>> create a sentinel frame just for our user-created frame, in
>> create_new_frame. This sentinel frame has the same id as the regular
>> sentinel frame.
>> - When printing the frame, after doing the "select-frame view"
>> command, the argument's pretty printer is invoked, which does an
>> inferior function call (this is the point of the test). This clears
>> the frame cache, including the "real" sentinel frame, which sets the
>> sentinel_frame global to nullptr.
>> - Later in the frame-printing process (when printing the second
>> argument), the auto-reinflation mechanism (which would have been the
>> manual reinflate call before, doesn't matter here) re-creates the
>> user frame by calling create_new_frame again, creating its own
>> special sentinel frame again. However, note that the "real" sentinel
>> frame, the sentinel_frame global, is still nullptr. If the selected
>> frame had been a regular frame, we would have called
>> get_current_frame at some point during the reinflation, which would
>> have re-created the "real" sentinel frame. But it's not the case
>> here.
>> - Deep down the stack, something wants to fill in the unwind stop
>> reason for frame 0, which requires trying to unwind frame 1. This
>> leads us to trying to unwind the PC of frame 1:
>>
>> #0 gdbarch_unwind_pc (gdbarch=0xffff8d010080, next_frame=...) at /home/simark/src/binutils-gdb/gdb/gdbarch.c:2955
>> #1 0x000000000134569c in dwarf2_tailcall_sniffer_first (this_frame=..., tailcall_cachep=0xffff773fcae0, entry_cfa_sp_offsetp=0xfffff7f7d450)
>> at /home/simark/src/binutils-gdb/gdb/dwarf2/frame-tailcall.c:390
>> #2 0x0000000001355d84 in dwarf2_frame_cache (this_frame=..., this_cache=0xffff773fc928) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1089
>> #3 0x00000000013562b0 in dwarf2_frame_unwind_stop_reason (this_frame=..., this_cache=0xffff773fc928) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1101
>> #4 0x0000000001990f64 in get_prev_frame_always_1 (this_frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:2281
>> #5 0x0000000001993034 in get_prev_frame_always (this_frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:2376
>> #6 0x000000000199b814 in get_frame_unwind_stop_reason (frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:3051
>> #7 0x0000000001359cd8 in dwarf2_frame_cfa (this_frame=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1356
>> #8 0x000000000132122c in dwarf_expr_context::execute_stack_op (this=0xfffff7f80170, op_ptr=0xffff8d8883ee "\217\002", op_end=0xffff8d8883ee "\217\002")
>> at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:2110
>> #9 0x0000000001317b30 in dwarf_expr_context::eval (this=0xfffff7f80170, addr=0xffff8d8883ed "\234\217\002", len=1) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239
>> #10 0x000000000131d68c in dwarf_expr_context::execute_stack_op (this=0xfffff7f80170, op_ptr=0xffff8d88840e "", op_end=0xffff8d88840e "") at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1811
>> #11 0x0000000001317b30 in dwarf_expr_context::eval (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239
>> #12 0x0000000001314c3c in dwarf_expr_context::evaluate (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2, as_lval=true, per_cu=0xffff90b03700, frame=..., addr_info=0x0,
>> type=0xffff8f6c8400, subobj_type=0xffff8f6c8400, subobj_offset=0) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1078
>> #13 0x000000000149f9e0 in dwarf2_evaluate_loc_desc_full (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, per_cu=0xffff90b03700, per_objfile=0xffff9070b980,
>> subobj_type=0xffff8f6c8400, subobj_byte_offset=0, as_lval=true) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1513
>> #14 0x00000000014a0100 in dwarf2_evaluate_loc_desc (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, per_cu=0xffff90b03700, per_objfile=0xffff9070b980, as_lval=true)
>> at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1557
>> #15 0x00000000014aa584 in locexpr_read_variable (symbol=0xffff8f6cd770, frame=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:3052
>> - AArch64 defines a special "prev register" function,
>> aarch64_dwarf2_prev_register, to handle unwinding the PC. This
>> function does
>>
>> frame_unwind_register_unsigned (this_frame, AARCH64_LR_REGNUM);
>>
>> (I must admit that I don't really understand this part, it seems to
>> me like you would want to unwind LR's value from this_frame->next...
>> but anyway.)
>
> It is somewhat historical, and it is something I'm planning to address eventually. Right now, though, we hardwire the PC to
> obtaining LR.
Ok, I'd be interesting in knowing more about the situation, for my own
general knowledge. I might be understanding the GDB unwinding wrong of
the ARM-specific unwinding wrong.
>> - frame_unwind_register_unsigned ultimately creates a lazy register
>> value, saving the frame id of this_frame->next. this_frame is the
>> user-created frame, to this_frame->next is the special sentinel frame
>> we created for it.
>> - When time comes to un-lazify the value, value_fetch_lazy_register
>> calls frame_find_by_id, to find the frame with the id we saved.
>> - frame_find_by_id sees it's the sentinel frame id, so returns the
>> sentinel_frame global, which is, if you remember, nullptr.
>> - We hit the `gdb_assert (next_frame != NULL)` assertion.
>>
>> I have a hard time pinpointing a single thing that is wrong. The fact
>> that the user-created frames have their own sentinel frame, which share
>> the same id as the real sentinel frame, that sounds like it can't work
>
> From your (very detailed) explanation, it does sound a bit iffy to abuse the frame id mechanism and use
> the same id as the existing real frame.
Yeah, my latest attempt at fixing it revolves around giving the
user-created frame's sentinels their own specific ID.
Simon
More information about the Gdb-patches
mailing list