RFC gdbserver and tdesc and powerpc (stuck on a gdbserver assert)

Wed Feb 2 01:27:12 GMT 2022

On 2022-02-01 16:48, will schmidt via Gdb-patches wrote:
> 
> Hi, 
>    I've been working on a target-description rework for powerpc, this is
> a continuation of work that Rogerio has posted rfc patches for sometime last year.
> 
> I've run into a stumbling block with the init_target_desc code in gdbserver,
> and am not sure how to best proceed.
> 
>   gdbserver/tdesc.cc:  init_target_desc(...)  iterates through the provided
> features and populates the reg_defs structure.  The code currently has an assert
> with a comment:
> 
>         /* Register number will increase (possibly with gaps) or be zero.  */
>         gdb_assert (regnum == 0 || regnum >= tdesc->reg_defs.size ());
> 
> This trips on powerpc (with the WIP tdesc patch set), potentially in several
> locations, since our features contain registers that are intermixed across the
> ranges, so we end up with regnos that numerically belong earlier in the
> tdesc->reg_defs structure, but they belong in the features where they are.
> In particular; 
> The Powerpc "core" features includes regnums 0-31 (gprs r0..r31),
> a gap, then 64-69 (PC,MSR,CR,LR,CTR,XER).
> The subsequent "fpu" feature fills in that gap as it includes regnums
> 32-63 (f0..f31), and 70 (fpscr).

>From a quick look at the code, my understanding is:

 - The order in which the registers appear in the target description
   defines the order of the registers in the regcache buffer.  See how
   `offset` gets incremented for each register in that loop you were
   looking at?

 - When handling the 'g' packet (read registers), there is currently the
   assumption that the order of the registers in the reg_defs vector
   (that is, sorted by register number, as reg_defs is indexed by
   register number) matches the order of the registers in the regcache
   buffer.  Look in registers_to_string, how we grab a pointer to the
   beginning of the regcache buffer and only advance it as we iterate on
   reg_defs.

So, if you were to fill reg_defs in a non-sorted order:

 - regnum 0, name=A, size=4
 - regnum 2, name=C, size=4
 - regnum 1, name=B, size=4

such that you ended up with this reg_defs:

 [0] name=A, offset=0
 [1] name=B, offset=8
 [2] name=C, offset=4

Then registers_to_string would get it wrong, as it would mix up B and C.

I think what you are trying to do is possible, it's just a matter of
either respecting the existing assumptions the code is based on, or
adjust the code to not assume these things anymore.  For this specific
problem, I see two possible solutions:

 - When filling reg_defs, do a first pass to create all elements.  Then
   do a second pass, iterating on reg_defs, to fill in the offset
   fields.

 - Remove the assumption that the order of the registers in the regcache
   is the same as in reg_defs.  In registers_to_string, that would mean
   not advancing a pointer in the regcache buffer, but using each
   register's offset to grab the data at the right place in the
   regcache buffer.  registers_from_string would need to be adapted too,
   which seems harder, as it's currently implemented as one big
   hex2bin, which assumes the regcache has registers sorted by number.

I would lean towards the first solution, it seems easier.

> There may or may not be an issue with the subsequent altivec and vsx register sets,
> since we have some overlapping ranges there.   

Can a single tdesc actually contain two registers with the same number?

> I could split apart the features into smaller bits, but this would scramble the
> documented powerpc features descriptions (as seen in gdb.texinfo).   
> I've tried just disabling the assert, but I'm not certain that is sufficient, I currently
> also see some partial transfer errors between gdb and gdbserver that i've not sorted out.

Right after that assert, we have:

	/* Register number will increase (possibly with gaps) or be zero.  */
	gdb_assert (regnum == 0 || regnum >= tdesc->reg_defs.size ());

	if (regnum != 0)
	  tdesc->reg_defs.resize (regnum, gdb::reg (offset));

So if you handle registers 0, then 100, then 1, the resize will make it
so the reg_defs vector ends up with a size of 1 (and subsequently 2 when
we emplace_back).  Whereas what you want is a vector of size 101 with
all undefined registers except at indices 0, 1 and 100.

So to implement the idea described above, you'd need to change that code
to make sure you only grow the vector.

Simon