[PATCHv5 09/11] gdbserver: update target description creation for x86/linux

Tue May 7 14:24:37 GMT 2024

"Willgerodt, Felix" <felix.willgerodt@intel.com> writes:

>> -----Original Message-----
>> From: Andrew Burgess <aburgess@redhat.com>
>> Sent: Freitag, 26. April 2024 17:02
>> To: gdb-patches@sourceware.org
>> Cc: Andrew Burgess <aburgess@redhat.com>; Willgerodt, Felix
>> <felix.willgerodt@intel.com>; John Baldwin <jhb@FreeBSD.org>
>> Subject: [PATCHv5 09/11] gdbserver: update target description creation for
>> x86/linux
>> 
>> This commit is part of a series which aims to share more of the target
>> description creation between GDB and gdbserver for x86/Linux.
>> 
>> After some refactoring earlier in this series the shared
>> x86_linux_tdesc_for_tid function was added into nat/x86-linux-tdesc.c.
>> However, this function still relies on amd64_linux_read_description
>> and i386_linux_read_description which are implemented separately for
>> both gdbserver and GDB.  Given that at their core, all these functions
>> do is:
>> 
>>   1. take an xcr0 value as input,
>>   2. mask out some feature bits,
>>   3. look for a cached pre-generated target description and return it
>>      if found,
>>   4. if no cached target description is found then call either
>>      amd64_create_target_description or
>>      i386_create_target_description to create a new target
>>      description, which is then added to the cache.  Return the newly
>>      created target description.
>> 
>> The inner functions amd64_create_target_description and
>> i386_create_target_description are already shared between GDB and
>> gdbserver (in the gdb/arch/ directory), so the only thing that
>> the *_read_description functions really do is add the caching layer,
>> and it feels like this really could be shared.
>> 
>> However, we have a small problem.
>> 
>> On the GDB side we create target descriptions using a different set of
>> cpu features than on the gdbserver side!  This means that for the
>> exact same target, we might get a different target description when
>> using native GDB vs using gdbserver.  This surely feels like a
>> mistake, I would expect to get the same target description on each.
>> 
>> The table below shows the number of possible different target
>> descriptions that we can create on the GDB side vs on the gdbserver
>> side for each target type:
>> 
>>         | GDB | gdbserver
>>   ------|-----|----------
>>   i386  | 64  | 7
>>   amd64 | 32  | 7
>>   x32   | 16  | 7
>
> I would have somehow expected to have the highest number for 64-bit
> CPUs. As it should include 32-bit and x32.
> Do you know why that isn't the case and 32-bit has double?
> Or is my problem that this is reflecting GDB code, which shares stuff
> between amd64 and i386?

The table reflects the different tdesc split by category, not the total
number that a user might get on a particular system.

So on an _actual_ i386 CPU, then sure, the user will (on the GDB side)
only have a choice of 64 different tdesc.

On an _actual_ amd64 CPU the user could run an i386 compiled binary, in
which case GDB will return one of those same 64 tdesc.  Or the user
might run an amd64 compiled binary, in which case GDB will return one of
the 32 amd64 tdesc.

So yes, from a user's point of view on amd64 there are 96 different
tdesc, but GDB splits these into 3 different categories (i386, amd64,
and x32), which is what this table represents.

> Maybe this is also solved with the comments I made in code below.
>
>
>> So in theory, all I want to do is move the GDB version
>> of *_read_description into the arch/ directory and have gdbserver use
>> that, then both GDB and gdbserver would be able to create any of the
>> possible target descriptions.
>> 
>> Unfortunately it's a little more complex than that due to the in
>> process agent (IPA).
>> 
>> When the IPA is in use, gdbserver sends a target description index to
>> the IPA, and the IPA uses this to find the correct target description
>> to use.
>> 
>> ** START OF AN ASIDE **
>> 
>> Back in the day I suspect this approach made perfect sense.  However
>> since this commit:
>> 
>>   commit a8806230241d201f808d856eaae4d44088117b0c
>>   Date:   Thu Dec 7 17:07:01 2017 +0000
>> 
>>       Initialize target description early in IPA
>> 
>> I think passing the index is now more trouble than its worth.
>> 
>> We used to pass the index, and then use that index to lookup which
>> target description to instantiate and use.  However, the above commit
>> fixed an issue where we can't call malloc() within (certain parts of)
>> the IPA (apparently), so instead we now pre-compute _every_ possible
>> target description within the IPA.  The index is now only used to
>> lookup which of the (many) pre-computed target descriptions to use.
>> 
>> It would (I think) have been easier all around if the IPA just
>> self-inspected, figured out its own xcr0 value, and used that to
>> create the one target description that is required.  So long as the
>> xcr0 to target description code is shared (at compile time) with
>> gdbserver, then we can be sure that the IPA will derive the same
>> target description as gdbserver, and we would avoid all this index
>> passing business, which has made this commit so very, very painful.
>> 
>> ** END OF AN ASIDE **
>> 
>> Currently then for x86/linux, gdbserver sends a number between 0 and 7
>> to the IPA, and the IPA uses this to create a target description.
>> 
>> However, I am proposing that gdbserver should now create one of (up
>> to) 64 different target descriptions for i386, so this 0 to 7 index
>> isn't going to be good enough any more (amd64 and x32 have slightly
>> fewer possible target descriptions, but still more than 8, so the
>> problem is the same).
>> 
>> For a while I wondered if I was going to have to try and find some
>> backward compatible solution for this mess.  But after seeing how
>> lightly the IPA is actually documented, I wonder if it is not the case
>> that there is a tight coupling between a version of gdbserver and a
>> version of the IPA?  At least I'm hoping so.
>> 
>> In this commit I have thrown out the old IPA target description index
>> numbering scheme, and switched to a completely new numbering scheme.
>> Instead of the index that is passed being arbitrary, the index is
>> instead calculated from the set of cpu features that are present on
>> the target.  Within the IPA we can then reverse this logic to recreate
>> the xcr0 value based on the index, and from the xcr0 value we can
>> choose the correct target description.
>> 
>> With the gdbserver to IPA numbering scheme issue resolved I have then
>> update the gdbserver versions of amd64_linux_read_description and
>> i386_linux_read_description so that they create target descriptions
>> using the same set of cpu features as GDB itself.
>> 
>> After this gdbserver should now always come up with the same target
>> description as GDB does on any x86/Linux target.
>> 
>> This commit does not introduce any new code sharing between GDB and
>> gdbserver as previous commits in this series have done.  Instead this
>> commit is all about bringing GDB and gdbserver into alignment
>> functionally so that the next commit(s) can merge the GDB and
>> gdbserver versions of these functions.
>> 
>> Approved-By: John Baldwin <jhb@FreeBSD.org>
>> ---
>>  gdbserver/linux-amd64-ipa.cc |  43 +----
>>  gdbserver/linux-i386-ipa.cc  |  23 +--
>>  gdbserver/linux-x86-low.cc   |  15 +-
>>  gdbserver/linux-x86-tdesc.cc | 316 +++++++++++++++++++++++++----------
>>  gdbserver/linux-x86-tdesc.h  |  49 +++---
>>  5 files changed, 278 insertions(+), 168 deletions(-)
>> 
>> diff --git a/gdbserver/linux-amd64-ipa.cc b/gdbserver/linux-amd64-ipa.cc
>> index df5e6aca081..0c80812cc6f 100644
>> --- a/gdbserver/linux-amd64-ipa.cc
>> +++ b/gdbserver/linux-amd64-ipa.cc
>> @@ -164,47 +164,21 @@ supply_static_tracepoint_registers (struct regcache
>> *regcache,
>> 
>>  #endif /* HAVE_UST */
>> 
>> -#if !defined __ILP32__
>> -/* Map the tdesc index to xcr0 mask.  */
>> -static uint64_t idx2mask[X86_TDESC_LAST] = {
>> -  X86_XSTATE_X87_MASK,
>> -  X86_XSTATE_SSE_MASK,
>> -  X86_XSTATE_AVX_MASK,
>> -  X86_XSTATE_MPX_MASK,
>> -  X86_XSTATE_AVX_MPX_MASK,
>> -  X86_XSTATE_AVX_AVX512_MASK,
>> -  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
>> -};
>> -#endif
>> -
>>  /* Return target_desc to use for IPA, given the tdesc index passed by
>>     gdbserver.  */
>> 
>>  const struct target_desc *
>>  get_ipa_tdesc (int idx)
>>  {
>> -  if (idx >= X86_TDESC_LAST)
>> -    {
>> -      internal_error ("unknown ipa tdesc index: %d", idx);
>> -    }
>> +  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
>> 
>>  #if defined __ILP32__
>> -  switch (idx)
>> -    {
>> -    case X86_TDESC_SSE:
>> -      return amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
>> -    case X86_TDESC_AVX:
>> -      return amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
>> -    case X86_TDESC_AVX_AVX512:
>> -      return amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK,
>> true);
>> -    default:
>> -      break;
>> -    }
>> +  bool is_x32 = true;
>>  #else
>> -  return amd64_linux_read_description (idx2mask[idx], false);
>> +  bool is_x32 = false;
>>  #endif
>> 
>> -  internal_error ("unknown ipa tdesc index: %d", idx);
>> +  return amd64_linux_read_description (xcr0, is_x32);
>>  }
>> 
>>  /* Allocate buffer for the jump pads.  The branch instruction has a
>> @@ -272,11 +246,10 @@ void
>>  initialize_low_tracepoint (void)
>>  {
>>  #if defined __ILP32__
>> -  amd64_linux_read_description (X86_XSTATE_SSE_MASK, true);
>> -  amd64_linux_read_description (X86_XSTATE_AVX_MASK, true);
>> -  amd64_linux_read_description (X86_XSTATE_AVX_AVX512_MASK, true);
>> +  for (auto i = 0; i < x86_linux_x32_tdesc_count (); i++)
>> +    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), true);
>>  #else
>> -  for (auto i = 0; i < X86_TDESC_LAST; i++)
>> -    amd64_linux_read_description (idx2mask[i], false);
>> +  for (auto i = 0; i < x86_linux_amd64_tdesc_count (); i++)
>> +    amd64_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i), false);
>
> I know this was already here and I know there are different opinions on this.
> But to me using auto here (and in similar locations in this patch) is just wrong.
> So I would vote for making this a proper type.
> But this is a bit of my personal "pet peeve" ;)
>
>
>>  #endif
>>  }
>> diff --git a/gdbserver/linux-i386-ipa.cc b/gdbserver/linux-i386-ipa.cc
>> index aa346fc9bc3..c1c3152fb04 100644
>> --- a/gdbserver/linux-i386-ipa.cc
>> +++ b/gdbserver/linux-i386-ipa.cc
>> @@ -245,28 +245,15 @@ initialize_fast_tracepoint_trampoline_buffer (void)
>>      }
>>  }
>> 
>> -/* Map the tdesc index to xcr0 mask.  */
>> -static uint64_t idx2mask[X86_TDESC_LAST] = {
>> -  X86_XSTATE_X87_MASK,
>> -  X86_XSTATE_SSE_MASK,
>> -  X86_XSTATE_AVX_MASK,
>> -  X86_XSTATE_MPX_MASK,
>> -  X86_XSTATE_AVX_MPX_MASK,
>> -  X86_XSTATE_AVX_AVX512_MASK,
>> -  X86_XSTATE_AVX_MPX_AVX512_PKU_MASK,
>> -};
>> -
>>  /* Return target_desc to use for IPA, given the tdesc index passed by
>>     gdbserver.  */
>> 
>>  const struct target_desc *
>>  get_ipa_tdesc (int idx)
>>  {
>> -  if (idx >= X86_TDESC_LAST)
>> -    {
>> -      internal_error ("unknown ipa tdesc index: %d", idx);
>> -    }
>> -  return i386_linux_read_description (idx2mask[idx]);
>> +  uint64_t xcr0 = x86_linux_tdesc_idx_to_xcr0 (idx);
>> +
>> +  return i386_linux_read_description (xcr0);
>>  }
>> 
>>  /* Allocate buffer for the jump pads.  On i386, we can reach an arbitrary
>> @@ -288,6 +275,6 @@ void
>>  initialize_low_tracepoint (void)
>>  {
>>    initialize_fast_tracepoint_trampoline_buffer ();
>> -  for (auto i = 0; i < X86_TDESC_LAST; i++)
>> -    i386_linux_read_description (idx2mask[i]);
>> +  for (auto i = 0; i < x86_linux_i386_tdesc_count (); i++)
>> +    i386_linux_read_description (x86_linux_tdesc_idx_to_xcr0 (i));
>>  }
>> diff --git a/gdbserver/linux-x86-low.cc b/gdbserver/linux-x86-low.cc
>> index 68d2f13537c..6e23a53118b 100644
>> --- a/gdbserver/linux-x86-low.cc
>> +++ b/gdbserver/linux-x86-low.cc
>> @@ -2882,14 +2882,17 @@ x86_target::get_ipa_tdesc_idx ()
>>    struct regcache *regcache = get_thread_regcache (current_thread, 0);
>>    const struct target_desc *tdesc = regcache->tdesc;
>> 
>> +  if (!use_xml)
>> +    {
>> +      if (tdesc == tdesc_i386_linux_no_xml.get ()
>>  #ifdef __x86_64__
>> -  return amd64_get_ipa_tdesc_idx (tdesc);
>> -#endif
>> -
>> -  if (tdesc == tdesc_i386_linux_no_xml.get ())
>> -    return X86_TDESC_SSE;
>> +	  || tdesc == tdesc_amd64_linux_no_xml.get ()
>> +#endif /* __x86_64__ */
>> +	  )
>> +	return x86_linux_xcr0_to_tdesc_idx (X86_XSTATE_SSE_MASK);
>
> What happens if neither of the two is true? Do we need to assert this?
> Or do we need to just return the X86_XSTATE_SSE_MASK index
> without checking, as this can't ever happen?

You're right.  I've changed the 'if' into 'gdb_assert', and made the
return of an index based on X86_XSTATE_SSE_MASK unconditional (within
the '!use_xml' block.

>
>
>> +    }
>> 
>> -  return i386_get_ipa_tdesc_idx (tdesc);
>> +  return x86_linux_xcr0_to_tdesc_idx (xcr0_storage);
>>  }
>> 
>>  /* The linux target ops object.  */
>> diff --git a/gdbserver/linux-x86-tdesc.cc b/gdbserver/linux-x86-tdesc.cc
>> index af3735aa895..5e12526bf17 100644
>> --- a/gdbserver/linux-x86-tdesc.cc
>> +++ b/gdbserver/linux-x86-tdesc.cc
>> @@ -28,96 +28,278 @@
>>  #include "x86-tdesc.h"
>>  #include "arch/i386-linux-tdesc.h"
>> 
>> -/* Return the right x86_linux_tdesc index for a given XCR0.  Return
>> -   X86_TDESC_LAST if can't find a match.  */
>> +/* A structure used to describe a single cpu feature that might, or might
>> +   not, be checked for when creating a target description for one of i386,
>> +   amd64, or x32.  */
>> 
>> -static enum x86_linux_tdesc
>> -xcr0_to_tdesc_idx (uint64_t xcr0, bool is_x32)
>> +struct x86_tdesc_feature {
>> +  /* The cpu feature mask.  This is a mask against an xcr0 value.  */
>> +  uint64_t feature;
>
> Can we elaborate the comment? Why do we need to mask anything?
> Or maybe we can refer to the table below.
>
>
>> +  /* Is this feature checked when creating an i386 target description.  */
>> +  bool is_i386;
>> +
>> +  /* Is this feature checked when creating an amd64 target description.  */
>> +  bool is_amd64;
>> +
>> +  /* Is this feature checked when creating an x32 target description.  */
>> +  bool is_x32;
>> +};
>> +
>> +/* A constant table that describes all of the cpu features that are
>> +   checked when building a target description for i386, amd64, or x32.  */
>> +
>
> I am missing a bit of elaboration on why we can't only rely on XCR0. And
> the table doesn't describe all CPU features that are checked. It just describes
> all XSTATE features. There is also segments and orig_rax and in the near future
> a new register for CET, which are independent of XSTATE.

I can rename the data structures to make it clearer that this all
relates to xstate/xcr0.  I was aware about the segments flag that is
passed into the tdesc creation, but even with prompting I can't see
anything related to orig_rax -- how is that fed into the tdesc creation
process?

>
>
>> +static constexpr x86_tdesc_feature x86_linux_all_tdesc_features[] = {
>> +  /* Feature,           i386,	amd64,	x32.  */
>> +  { X86_XSTATE_PKRU,	true,	true, 	true },
>> +  { X86_XSTATE_AVX512,	true,	true, 	true },
>> +  { X86_XSTATE_AVX,	true,	true, 	true },
>> +  { X86_XSTATE_MPX,	true,	true, 	false },
>> +  { X86_XSTATE_SSE,	true,	false, 	false },
>> +  { X86_XSTATE_X87,	true,	false, 	false }
>> +};
>
> I have never looked at IPA code much. But this table looked a bit weird to me.
> I get that MPX is not supported for x32. Though MPX is already deprecated and will
> be removed once GDB 15 is branched (at least that is the plan).

You must excuse my lack of up to date knowledge of different x86
features, but is there not hardware in the wild with the MPX feature?
Wont removing support for this mean folk will be unable to debug on that
hardware?

> But why does amd64 not have x87 and SSE? I do see e.g. st0 and xmm0 on amd64.
>
> After digging a bit:
> In arch/amd64.c, we call create_feature_i386_64bit_sse() without any check for
> SSE in XCR0. And I see that we have st0 in the "core" registers with EAX etc.
> But in arch/i386.c it is different, and we add SSE based on X86_XSTATE_SSE.
> And while st0 is also in the "core" registers with EAX etc., we only add them based
> on X86_XSTATE_X87. To me this looks wrong. Why would CPUs without x87
> not add EAX? And even if it isn't for some reason, do we really want to support
> such old CPUs? In my view we could just align the two. 
>
> If we would do that, and deprecate MPX, the table looks unnecessary. Or did I miss
> something else?

That looks like a reasonable conclusion.

Honestly, pretty much everything in this patch is based on the idea of
merging gdbserver and GDB code, while keeping functionality as similar
as possible.

>
> This is a complicated series. Like others I am wondering how many users IPA
> really has and if it is worth maintaining. But I have no data. Is it in any distros?

I have no idea.  But I really don't want to tie this series (which fixes
actual bugs caused by tdesc mismatch between GDB and gdbserver) to the
removal of IPA.

As far as I can tell the IPA is tested as part of the gdb.trace/
sub-directory, so it's not like we're removing some untested broken
corner of GDB.  We'd be removing a somewhat tested, at least minimally
working, feature.

I'll take another pass over this patch, maybe I can convince myself that
we can just check for all features for all machine types, in which case
the table driven approach becomes unnecessary.

Also, maybe I can split the series in two.  Like so many of my series
it just seems to keep growing.  While addressing your earlier comments
I've already spotted another bug, which is going to push the patch count
to at least 12 *sigh*.

Lets see what I can come up with for v6!

Thanks,
Andrew

>
>
> Thanks,
> Felix
> Intel Deutschland GmbH
> Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
> Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
> Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
> Chairperson of the Supervisory Board: Nicole Lau
> Registered Office: Munich
> Commercial Register: Amtsgericht Muenchen HRB 186928