[EXT] Re: Semihosting in GDB 11.1 | Proposed patch for interrupting in sync mode

Thu Mar 10 11:30:51 GMT 2022

On 2022-03-10 10:02, Adrian Oltean wrote:

> As far as 'H' packet handling is concerned, note that the GDB server I'm referring is a proprietary implementation
> that is used for JTAG debugging. Threads are numbered from 1. However, in the old implementation "Hg0" was always
> switching to thread 1 ("first thread") internally, even though some other core/thread was suspended. The original
> implementation assumed "Hg0" must *always* switch to "first thread" (thread 1 in my case) but this doesn't seem
> to be enforced in the RSP specs. I updated the code and assimilate "arbitrary" (from spec) as "if there's no suspended
> thread/core internally selected, find a valid one".

So going back to your original problem description, you said:

~~~~
2. File I/O (semihosting) when a multicore (ARMv8) target is involved seems
to exhibit an issue. My multicore debugging model assumes that each core is a
separate thread. If I have 'printf' calls executed on separate cores, I noticed
that the messages are duplicated on secondary cores (other than core 0). The
remote log snippet that highlights the root cause is pasted below. Long story
short, the 'H' packet that is sent to the server and instructs it to use
thread 0 for some further operations, actually causes troubles because breakpoint
handling and run control gets broken afterwards. Note that the 'Fwrite' is the
result of a 'printf' on a secondary core, thus instructing the server to use
thread 0 breaks lots of things inside the GDB server. The 'H' packet seems to
be sent as a result of the 'continue' action and it translates into a call to
'switch_to_no_thread'. If I ignore the 'H' packet, semihosting breakpoints and run
control are correctly handled, so the second 'Fwrite' RSP sequence would not
exist (this is the expected behavior). Is the described behavior a known issue
inside GDB? Or do I have to change something in the server to correct the
described behavior?

[remote] Sending packet: $vCont;c#a8
[remote] Received Ack
[remote] wait: enter
  [remote] Packet received: Fwrite,1,80026300,12
  [remote] Sending packet: $Hg0#df
  [remote] Received Ack
  [remote] Packet received: OK
  [remote] Sending packet: $m80026300,12#8f
  [remote] Received Ack
  [remote] Packet received: 48656c6c6f2066726f6d20436f726523370a
  [remote] Sending packet: $F12#a9
  [remote] Received Ack
  [remote] Packet received: Fwrite,1,80026300,12
  [remote] Sending packet: $m80026300,12#8f
  [remote] Received Ack
  [remote] Packet received: 48656c6c6f2066726f6d20436f726523370a
  [remote] Sending packet: $F12#a9
  [remote] Received Ack
  [remote] Packet received: T05thread:p1.8;core:7;
[remote] wait: exit
~~~~

The printf calls end up calling write under the hood, of course, and those
syscalls are intercepted and translated to File I/O calls into GDB.  Those are the
"Fwrite,1,80026300,12" packets.  Now, GDB processes those and writes to its
own stdout.  What is written to its stdout is the 0x12 (18) bytes found at 0x80026300 in
the inferior memory, as per the Fwrite packet's arguments, which are in hex.
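
To make the packet fields concrete, here is a minimal sketch of decoding
the request and the memory read reply from the log above.  The names
FwriteRequest, parse_fwrite and decode_hex are made up for illustration; this
is not how remote-fileio.c is written, just the same arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for the fields of an "Fwrite,fd,addr,len" request.
struct FwriteRequest {
  int fd;
  std::uint64_t addr;
  std::uint64_t len;
};

// Parse e.g. "Fwrite,1,80026300,12".  All numeric fields are hexadecimal.
// Returns false if the packet is not a well-formed Fwrite request.
bool parse_fwrite(const std::string &packet, FwriteRequest &out) {
  std::istringstream in(packet);
  std::string field;
  std::vector<std::string> fields;
  while (std::getline(in, field, ','))
    fields.push_back(field);
  if (fields.size() != 4 || fields[0] != "Fwrite")
    return false;
  try {
    out.fd = static_cast<int>(std::stoul(fields[1], nullptr, 16));
    out.addr = std::stoull(fields[2], nullptr, 16);
    out.len = std::stoull(fields[3], nullptr, 16);
  } catch (const std::exception &) {
    return false;  // a field was not a hex number
  }
  return true;
}

// Decode the hex-encoded payload of an 'm' packet reply into raw bytes.
std::string decode_hex(const std::string &hex) {
  std::string bytes;
  for (std::size_t i = 0; i + 1 < hex.size(); i += 2)
    bytes.push_back(
        static_cast<char>(std::stoi(hex.substr(i, 2), nullptr, 16)));
  return bytes;
}
```

Fed the packets from your log, this gives fd 1 (stdout), address 0x80026300
and a length of 0x12, and the memory reply decodes to an 18-byte string.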

Depending on the kind of target object being accessed, GDB makes sure
to select the proper remote thread, or at least any thread of the right remote process.
This is important for when the remote side understands the multi-process extensions,
for example, lest GDB reads something of the wrong process.

So in remote.c, you'll see calls to set_general_process, like this:

  /* Make sure the remote is pointing at the right process.  */
  set_general_process ();

... which translates to "HgpPID.0" when multi-process extensions are supported by
the server, and "Hg0" when they're not.

For memory reads in particular, GDB instead makes sure that GDB's selected thread
is selected on the server side as well, not just any thread of the gdb-selected process.
This is done here:

 enum target_xfer_status
 remote_target::xfer_partial (enum target_object object,
 			     const char *annex, gdb_byte *readbuf,
 			     const gdb_byte *writebuf, ULONGEST offset, ULONGEST len,
 			     ULONGEST *xfered_len)
 {
 ...
   set_general_thread (inferior_ptid);

So what I think is happening is that inferior_ptid is null_ptid when you get here.
We always switch to null_ptid when waiting for events out of the target, exactly
to better uncover spots that would be relying on some stale non-null inferior_ptid
by accident.  This would be such a case.  The inferior is running, and we're waiting
for events, so whatever was the gdb-selected thread (per inferior_ptid) has no bearing
on what thread is reporting some event.

If the server supports multi-process extensions, then that set_general_thread call above
with inferior_ptid == null_ptid will end up sending "Hgp0.0", meaning, select any thread of any process,
which means the subsequent memory read (m packet) will potentially read memory off of
the wrong process.  This is obviously bad.
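
As a rough illustration of how the thread id ends up in the packet, here is
a sketch of the encoding.  ThreadId and make_hg_packet are hypothetical names,
with pid 0 / tid 0 standing in for null_ptid; GDB's real code in remote.c is
structured differently.  The "p" prefix is the multi-process thread-id syntax,
the same one you see in "thread:p1.8" in the stop reply.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for GDB's ptid_t.  {0, 0} plays the role of null_ptid.
struct ThreadId {
  unsigned pid;
  unsigned long tid;
};

// Build the body of an Hg packet (without the "$...#xx" framing).
// With multi-process extensions the id is "pPID.TID", otherwise just "TID".
// Both numbers are written in hex.
std::string make_hg_packet(ThreadId id, bool multi_process) {
  char buf[64];
  if (multi_process)
    std::snprintf(buf, sizeof buf, "Hgp%x.%lx", id.pid, id.tid);
  else
    std::snprintf(buf, sizeof buf, "Hg%lx", id.tid);
  return buf;
}
```

With a null id, this produces "Hg0" without multi-process extensions, which
matches the packet in your log.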

HOWEVER.  Note that the File I/O packets, like "Fwrite,1,80026300,12" don't say
at all which thread is doing the File I/O access!  IOW, this packet is not useable
in multi-process scenarios as is...  It would need to be extended to also pass down
an extra thread id field, like "Fwrite,1,80026300,12,pPID.TID".  FSF GDBserver doesn't
support File I/O, so these File I/O packets didn't get attention when the multi-process
extensions were devised.

So alright, the current "Hg0" packet you see in your logs is a bit spurious, but it should
be harmless, and you should be able to select any valid thread, as per the Hg0 packet's
description, as all threads share the same address space, in GDB's view.
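
Something along these lines would be one way for a stub to treat "Hg0" as
"any valid thread".  This is only a sketch: StubThread and resolve_hg are
hypothetical names, the "prefer the currently selected thread if it is
stopped" policy is an assumption based on your description, and the special
"-1" (all threads) id is not handled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-thread (per-core) record kept by the stub.
struct StubThread {
  int id;        // RSP thread id, numbered from 1
  bool stopped;  // true if the core is currently suspended
};

// Resolve the thread id from an Hg packet.  A non-zero id is taken as is.
// Id 0 means "any thread": keep the current selection if it is stopped,
// else fall back to the first stopped thread.  Returns 0 if none is stopped.
int resolve_hg(int requested, int current,
               const std::vector<StubThread> &threads) {
  if (requested != 0)
    return requested;
  for (const StubThread &t : threads)
    if (t.id == current && t.stopped)
      return current;
  for (const StubThread &t : threads)
    if (t.stopped)
      return t.id;
  return 0;
}
```

The point is only that "Hg0" never forces a switch to thread 1.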

  [remote] Packet received: Fwrite,1,80026300,12
  [remote] Sending packet: $Hg0#df
  [remote] Received Ack
  [remote] Packet received: OK
  [remote] Sending packet: $m80026300,12#8f
  [remote] Received Ack
  [remote] Packet received: 48656c6c6f2066726f6d20436f726523370a
  [remote] Sending packet: $F12#a9

What I think you should do is, when you get the reply to Fwrite -- the "$F12#a9"
packet -- your stub should make sure that it resumes the same thread that initiated
the Fwrite, not whatever Hg selected.  That $F12 reply is linked with the last Fwrite,
since there can only be one File I/O request ongoing at a given time.
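
In stub terms, that could look roughly like the sketch below.  I don't know
how your server is structured, so StubState and its member names are
hypothetical; the idea is just to keep the File I/O requester separate from
the Hg selection.

```cpp
#include <cassert>
#include <optional>

// Hypothetical bookkeeping inside the stub.
struct StubState {
  int general_thread = 0;            // last thread selected via Hg (0 = any)
  std::optional<int> fileio_thread;  // thread whose File I/O request is pending

  // Called when a thread hits a semihosting call and the stub sends
  // the F request (e.g. "Fwrite,...") to GDB.
  void on_fileio_request(int thread) { fileio_thread = thread; }

  // Called for an Hg packet.  Only affects subsequent m/g/etc. packets.
  void on_hg(int thread) { general_thread = thread; }

  // Called for the "F<retcode>" reply.  Returns the thread to resume, which
  // is the requester and not general_thread, or nullopt if no request is
  // pending.
  std::optional<int> on_fileio_reply() {
    std::optional<int> thread = fileio_thread;
    fileio_thread.reset();  // only one request can be outstanding
    return thread;
  }
};
```

With the sequence from your log (Fwrite from thread 8, then Hg0, then the
F reply), on_fileio_reply would still hand back thread 8.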

Pedro Alves