[PATCH 09/12] gdb: move displaced stepping logic to gdbarch, allow starting concurrent displaced steps
Simon Marchi
simark@simark.ca
Mon Nov 30 20:26:55 GMT 2020
On 2020-11-26 9:24 a.m., Pedro Alves wrote:
> The function still throws errors anyway, e.g. a bit above:
>
> int status = target_read_memory (m_addr, m_saved_copy.data (), len);
> if (status != 0)
> throw_error (MEMORY_ERROR,
> _("Error accessing memory address %s (%s) for "
> "displaced-stepping scratch space."),
> paddress (arch, m_addr), safe_strerror (status));
>
> That's then caught by displaced_step_prepare, which results in
> the "disabling displaced stepping" warning, and GDB disabling
> displaced stepping for the inferior.
>
> Also, NOT_SUPPORTED_ERROR is handled the same way. See:
>
> commit 16b4184277c4ad5b4a20278060fd3f6259d1ed49
> Author: Pedro Alves <palves@redhat.com>
> AuthorDate: Tue Mar 15 16:33:04 2016 +0000
>
> Fix PR gdb/19676: Disable displaced stepping if /proc not mounted
>
> Any other error is just propagated out, likely all the way to
> the prompt:
>
> static displaced_step_prepare_status
> displaced_step_prepare (thread_info *thread)
> {
> ...
> catch (const gdb_exception_error &ex)
> {
> struct displaced_step_inferior_state *displaced_state;
>
> if (ex.error != MEMORY_ERROR
> && ex.error != NOT_SUPPORTED_ERROR)
> throw;
>
>> Would it make sense to remove the _ERROR status code and report all
>> errors using exceptions then?
>
> I don't think so, because there are 3 kinds of "errors":
>
> #1 - MEMORY and NOT_SUPPORTED errors result in disabling displaced stepping
> for the inferior permanently.
>
> #2 - other errors/exceptions propagate outwards.
>
> #3 - "prepare" returning "Can't do displace step for this instruction" for
> some reason. Like the backend not supporting displaced stepping
> that particular instruction, or the breakpoint address falling
> within the scratch pad. I.e., these cases:
>
> ...
> if (breakpoint_in_range_p (aspace, m_addr, len))
> {
> /* There's a breakpoint set in the scratch pad location range
> (which is usually around the entry point). We'd either
> install it before resuming, which would overwrite/corrupt the
> scratch pad, or if it was already inserted, this displaced
> step would overwrite it. The latter is OK in the sense that
> we already assume that no thread is going to execute the code
> in the scratch pad range (after initial startup) anyway, but
> the former is unacceptable. Simply punt and fallback to
> stepping over this breakpoint in-line. */
> displaced_debug_printf ("breakpoint set in scratch pad. "
> "Stepping over breakpoint in-line instead.");
>
> return DISPLACED_STEP_PREPARE_STATUS_ERROR;
> }
> ...
> m_copy_insn_closure = gdbarch_displaced_step_copy_insn (arch,
> m_original_pc,
> m_addr,
> regcache);
> if (m_copy_insn_closure == nullptr)
> {
> /* The architecture doesn't know how or want to displaced step
> this instruction or instruction sequence. Fallback to
> stepping over the breakpoint in-line. */
> return DISPLACED_STEP_PREPARE_STATUS_ERROR;
> }
>
> The "failed to write pc" try/catch case seems debatable, but I'd
> lean on it being case #2.
>
> I think that case #3 calls for renaming DISPLACED_STEP_PREPARE_STATUS_ERROR
> to something else to avoid confusion. Note that in master, -1 wasn't
> documented in terms of error:
>
> ...
> Returns 1 if preparing was successful -- this thread is going to be
> stepped now; 0 if displaced stepping this thread got queued; or -1
> if this instruction can't be displaced stepped. */
>
> static int
> displaced_step_prepare_throw (thread_info *tp)
Ok, thanks for this explanation. I didn't understand well the different
error modes (can't displaced-step this particular instruction vs some
other unexpected error) and how they were handled.
I'll rename DISPLACED_STEP_PREPARE_STATUS_ERROR to
DISPLACED_STEP_PREPARE_STATUS_CANT. It fits well with the comments in
the code where it is used, that all say variants of "can't displaced
step this instruction, fall back on in-line stepping".
>>>> - return false;
>>>> + /* If there are threads left in the THREADS_TO_STEP list, but we have
>>>> + detected that we can't start anything more, put back these threads
>>>> + in the global list. */
>>>
>>> Do we also need to do this if an exception happens to escape the function?
>>> We might end up pretty bonkers anyhow if that happens, though...
>>
>> Yes, I realized that when playing with this code again yesterday. A
>> scope_exit would help for this I think.
>>
>> But I'm also wondering if we should enqueue back the thread TP that was
>> dequeued, that caused an error. This one has already been dequeued from
>> THREADS_TO_STEP. For example, if thread_still_needs_step_over throws
>> (it reads the PC, so that could happen), should we put back the thread
>> in the global chain? If we do, and the same error happens over and
>> over, the thread will never leave the step over chain and be the first
>> in line, preventing any other displaced step to happen.
>>
>> If, on the other hand, the error was just an intermittent one and we
>> don't put it back in the queue, the next time we'll resume the thread
>> we'll realize it needs a step over and enqueue it again. So it sounds
>> less risky to me to just not enqueue back the thread on error. I'm
>> really not sure, I find it difficult to reason about all the possible
>> cases.
>
> Yeah, not enqueuing sounds OK. And indeed, error handling in all of
> infrun is very hard to reason about, and likely to end up with things
> messed up, e.g., with a thread marked as running when it is really
> stopped.
Ok, did that.
>> Also, re-reading that code, I noticed that we only return "true" when an
>> inline step over is started, or if a displaced step on an all-stop
>> target was started. That's the case even before the patch. I suppose
>> that's on purpose, because the caller wants to know whether it can
>> resume more stuff or not?
>
> Yes. If it returns true, then the caller shouldn't resume anything
> and go straight to waiting. ISTR having some trouble coming up with
> the right conditions at the callers. Maybe it could be simplified.
> I no longer remember why start_step_over returns false here:
>
> /* Don't start a new step-over if we already have an in-line
> step-over operation ongoing. */
> if (step_over_info_valid_p ())
> return false;
>
> Off hand, it seems like that should return true. But it may
> be that the intention was to only return true if anything
> that was stopped before was started. I'd need to experiment
> to know better now. :-/
Ok, I'll leave it as-is for now and treat is as a separete issue.
>>>> @@ -5293,25 +5226,23 @@ handle_inferior_event (struct execution_control_state *ecs)
>>>> {
>>>> struct regcache *regcache = get_thread_regcache (ecs->event_thread);
>>>> struct gdbarch *gdbarch = regcache->arch ();
>>>> + inferior *parent_inf = find_inferior_ptid (ecs->target, ecs->ptid);
>>>> +
>>>> + if (ecs->ws.kind == TARGET_WAITKIND_FORKED)
>>>> + {
>>>> + /* Restore in the child process any displaced stepping buffers that
>>>> + were in use at the time of the fork. */
>>>> + gdbarch_displaced_step_restore_all_in_ptid
>>>> + (gdbarch, parent_inf, ecs->ws.value.related_pid);
>>>> + }
>>>
>>> Why was this moved out of the displaced_step_in_progress_thread check
>>> below?
>>
>> If:
>>
>> 1. thread 1 is doing a displaced step
>> 2. thread 2 does a fork, not letting time for thread 1 to complete
>>
>> event_thread is thread 2, and it is not doing a displaced step, so we
>> don't enter the
>>
>> if (displaced_step_in_progress_thread (ecs->event_thread))
>>
>> But we still want to restore the bytes in the child used as the
>> displaced stepping buffer for thread 1. Does that make sense? Is it
>> possible that the current code is not correct in that regard?
>
> Indeed, looks like it. Good catch. How about fixing that
> in a separate patch, then, with its own commit log?
Will do. I didn't do a separate patch at first because I initially
thought: "now that we can do multiple displaced steps in parallel, it
can be that the other displaced steps does a fork!". I didn't think
right away of the case where it's a non-displaced-stepping thread who
does a fork.
Simon
More information about the Gdb-patches
mailing list