[PATCH v3 1/3] arc: Add ARCv2 XML target along with refactoring

Simon Marchi simark@simark.ca
Wed Jul 15 02:52:12 GMT 2020


Hi Shahab,

Again, I'm a bit clueless for the ARC-specific stuff, I'll assume it's all right.

I just put a few minor comments, but it looks like the feature-based target description
thing is well done.

> @@ -91,63 +93,201 @@ int arc_debug;
>  
>  static struct cmd_list_element *maintenance_print_arc_list = NULL;
>  
> -/* XML target description features.  */
> -
> -static const char core_v2_feature_name[] = "org.gnu.gdb.arc.core.v2";
> -static const char
> -  core_reduced_v2_feature_name[] = "org.gnu.gdb.arc.core-reduced.v2";
> -static const char
> -  core_arcompact_feature_name[] = "org.gnu.gdb.arc.core.arcompact";
> -static const char aux_minimal_feature_name[] = "org.gnu.gdb.arc.aux-minimal";
> -
> -/* XML target description known registers.  */
> -
> -static const char *const core_v2_register_names[] = {
> -  "r0", "r1", "r2", "r3",
> -  "r4", "r5", "r6", "r7",
> -  "r8", "r9", "r10", "r11",
> -  "r12", "r13", "r14", "r15",
> -  "r16", "r17", "r18", "r19",
> -  "r20", "r21", "r22", "r23",
> -  "r24", "r25", "gp", "fp",
> -  "sp", "ilink", "r30", "blink",
> -  "r32", "r33", "r34", "r35",
> -  "r36", "r37", "r38", "r39",
> -  "r40", "r41", "r42", "r43",
> -  "r44", "r45", "r46", "r47",
> -  "r48", "r49", "r50", "r51",
> -  "r52", "r53", "r54", "r55",
> -  "r56", "r57", "accl", "acch",
> -  "lp_count", "reserved", "limm", "pcl",
> +/* A set of registers that we expect to find in a tdesc_feature.  These
> +   are used in ARC_TDESC_INIT when processing the target description.  */
> +
> +struct arc_register_feature
> +{
> +  /* Information for a single register.  */
> +  struct register_info
> +  {
> +    /* The GDB register number for this register.  */
> +    int regnum;
> +
> +    /* List of names for this register.  The first name in this list is the
> +       preferred name, the name GDB will use when describing this register.  */
> +    std::vector<const char *> names;
> +
> +    /* When true, this register must be present in this feature set.  */
> +    bool required_p;
> +  };
> +
> +  /* The name for this feature.  This is the name used to find this feature
> +     within the target description.  */
> +  const char *name;
> +
> +  /* List of all the registers that we expect to encounter in this register
> +     set.  */
> +  std::vector<struct register_info> registers;
>  };
>  
> -static const char *const aux_minimal_register_names[] = {
> -  "pc", "status32",
> +static const char *ARC_CORE_FEATURE_NAME="org.gnu.gdb.arc.core";
> +static const char *ARC_AUX_FEATURE_NAME="org.gnu.gdb.arc.aux";

Spaces around `=`.

> @@ -1717,192 +1857,227 @@ static const struct frame_base arc_normal_base = {
>    arc_frame_base_address
>  };
>  
> -/* Initialize target description for the ARC.
> -
> -   Returns TRUE if input tdesc was valid and in this case it will assign TDESC
> -   and TDESC_DATA output parameters.  */
> -
> -static bool
> -arc_tdesc_init (struct gdbarch_info info, const struct target_desc **tdesc,
> -		struct tdesc_arch_data **tdesc_data)
> +static inline enum
> +arc_isa mach_type_to_arc_isa (const unsigned long mach)

`arc_isa` goes on previous line.  I'd drop the `inline`, the compiler will take
care of that if needed.

>  {
> -  if (arc_debug)
> -    debug_printf ("arc: Target description initialization.\n");
> -
> -  const struct target_desc *tdesc_loc = info.target_desc;
> -
> -  /* Depending on whether this is ARCompact or ARCv2 we will assign
> -     different default registers sets (which will differ in exactly two core
> -     registers).  GDB will also refuse to accept register feature from invalid
> -     ISA - v2 features can be used only with v2 ARChitecture.  We read
> -     bfd_arch_info, which looks like to be a safe bet here, as it looks like it
> -     is always initialized even when we don't pass any elf file to GDB at all
> -     (it uses default arch in this case).  Also GDB will call this function
> -     multiple times, and if XML target description file contains architecture
> -     specifications, then GDB will set this architecture to info.bfd_arch_info,
> -     overriding value from ELF file if they are different.  That means that,
> -     where matters, this value is always our best guess on what CPU we are
> -     debugging.  It has been noted that architecture specified in tdesc file
> -     has higher precedence over ELF and even "set architecture" - that is,
> -     using "set architecture" command will have no effect when tdesc has "arch"
> -     tag.  */
> -  /* Cannot use arc_mach_is_arcv2 (), because gdbarch is not created yet.  */
> -  const int is_arcv2 = (info.bfd_arch_info->mach == bfd_mach_arc_arcv2);
> -  bool is_reduced_rf;
> -  const char *const *core_regs;
> -  const char *core_feature_name;
> +  switch (mach)
> +    {
> +    case bfd_mach_arc_arc600:
> +    case bfd_mach_arc_arc601:
> +    case bfd_mach_arc_arc700:
> +      return ARC_ISA_ARCV1;
> +      break;
> +    case bfd_mach_arc_arcv2:
> +      return ARC_ISA_ARCV2;
> +      break;
> +    default:
> +	internal_error (__FILE__, __LINE__,
> +			_("unknown machine id %lu"), mach);
> +    }
> +  return ARC_ISA_NONE;
> +}
>  
> -  /* If target doesn't provide a description, use the default ones.  */
> -  if (!tdesc_has_registers (tdesc_loc))
> +/* Common construction code for ARC_GDBARCH_FEATURES struct.  If there
> +   is no ABFD, then a FEATURE with default values is returned.  */
> +void arc_gdbarch_features_init (arc_gdbarch_features &features,
> +				const bfd *abfd, const unsigned long mach)

Since this function is only used in this file, can it be static?

Could this function return a arc_gdbarch_features by value instead?  Then,
you could add a constructor to arc_gdbarch_features.  You wouldn't need
the "A 0 indicates an uninitialised value" comment on reg_size, as the
object could never be in an uninitialised state.

> +{
> +  /* Try to guess the features parameters by looking at the binary to be
> +     executed.  If the user is providing a binary that does not match the
> +     target, then tough luck.  This is the last effort to makes sense of
> +     what's going on.  */
> +  if (abfd != NULL && bfd_get_flavour (abfd) == bfd_target_elf_flavour)
>      {
> -      if (is_arcv2)
> -	tdesc_loc = arc_read_description (ARC_SYS_TYPE_ARCV2);
> +      unsigned char eclass = elf_elfheader (abfd)->e_ident[EI_CLASS];
> +
> +      if (eclass == ELFCLASS32)
> +	features.reg_size = 4;
> +      else if (eclass == ELFCLASS64)
> +	features.reg_size = 8;
>        else
> -	tdesc_loc = arc_read_description (ARC_SYS_TYPE_ARCOMPACT);
> +	internal_error (__FILE__, __LINE__,
> +			_("unknown ELF header class %d"), eclass);
>      }
> -  else
> +
> +  /* MACH from a bfd_arch_info struct is used here.  It should be a safe
> +     bet, as it looks like the struct is always initialized even when we
> +     don't pass any elf file to GDB at all (it uses default arch in that
> +     case).  */
> +  features.isa = mach_type_to_arc_isa (mach);
> +
> +  /* Put the most frequent values for the undetermined parameters.  */
> +  if (features.reg_size == 0)
> +    features.reg_size = 4;

Can you add an empty line here to separate the two `if`s?

> +  if (features.isa == ARC_ISA_NONE)
> +    features.reg_size = ARC_ISA_ARCV2;

Did you mean to assign features.isa on the last line?

> diff --git a/gdb/arch/arc.c b/gdb/arch/arc.c
> index 9552b4aff97..84bc8df7788 100644
> --- a/gdb/arch/arc.c
> +++ b/gdb/arch/arc.c
> @@ -17,42 +17,104 @@
>  
>  
>  #include "gdbsupport/common-defs.h"
> -#include <stdlib.h>
> -
>  #include "arc.h"
> +#include <stdlib.h>
> +#include <unordered_map>
> +#include <sstream>
>  
>  /* Target description features.  */
> -#include "features/arc/core-v2.c"
> -#include "features/arc/aux-v2.c"
> -#include "features/arc/core-arcompact.c"
> -#include "features/arc/aux-arcompact.c"
> +#include "features/arc/v1-core.c"
> +#include "features/arc/v1-aux.c"
> +#include "features/arc/v2-core.c"
> +#include "features/arc/v2-aux.c"
>  
> -/* See arc.h.  */
> +#ifndef GDBSERVER
> +#define STATIC_IN_GDB static
> +#else
> +#define STATIC_IN_GDB
> +#endif
>  
> -target_desc *
> -arc_create_target_description (arc_sys_type sys_type)
> +STATIC_IN_GDB target_desc *
> +arc_create_target_description (const struct arc_gdbarch_features &features)
>  {
> +  /* Create a new target description.  */
>    target_desc *tdesc = allocate_target_description ();
>  
> -  long regnum = 0;
> -
>  #ifndef IN_PROCESS_AGENT
> -  if (sys_type == ARC_SYS_TYPE_ARCV2)
> -    set_tdesc_architecture (tdesc, "arc:ARCv2");
> -  else
> -    set_tdesc_architecture (tdesc, "arc:ARC700");
> -#endif
> +  std::string arch_name = "arc";
>  
> -  if (sys_type == ARC_SYS_TYPE_ARCV2)
> +  /* Architecture names here must match the ones in
> +     ARCH_INFO_STRUCT in bfd/cpu-arc.c.  */
> +  if (features.isa == ARC_ISA_ARCV1 && features.reg_size == 4)
> +      arch_name.append (":ARC700");
> +  else if (features.isa == ARC_ISA_ARCV2 && features.reg_size == 4)
> +      arch_name.append (":ARCv2");

It's really a nit, but it would be more efficient to just use two constant
strings:

const char *arch_name;

if (a)
  arch_name = "arc:ARC700";
else if (b)
  arch_name = "arc:ARCv2";
else
  assert

> +  else
>      {
> -      regnum = create_feature_arc_core_v2 (tdesc, regnum);
> -      regnum = create_feature_arc_aux_v2 (tdesc, regnum);
> +      std::ostringstream msg;
> +      msg << "Cannot determine architecture: ISA=" << features.isa
> +	  << "; bitness=" << 8*features.reg_size;

I'd suggest using string_printf instead, throughout the patch where you use
ostringstream.

> +      gdb_assert_not_reached (msg.str ().c_str ());
>      }
> -  else
> +
> +  set_tdesc_architecture (tdesc, arch_name.c_str ());
> +#endif
> +
> +  long regnum = 0;
> +
> +  switch (features.isa)
>      {
> -      regnum = create_feature_arc_core_arcompact (tdesc, regnum);
> -      regnum = create_feature_arc_aux_arcompact (tdesc, regnum);
> +    case ARC_ISA_ARCV1:
> +	  regnum = create_feature_arc_v1_core (tdesc, regnum);
> +	  regnum = create_feature_arc_v1_aux (tdesc, regnum);
> +	  break;
> +    case ARC_ISA_ARCV2:
> +	  regnum = create_feature_arc_v2_core (tdesc, regnum);
> +	  regnum = create_feature_arc_v2_aux (tdesc, regnum);
> +	  break;
> +    default:
> +	  std::ostringstream msg;
> +	  msg << "Cannot choose target description XML: " << features.isa;
> +	  gdb_assert_not_reached (msg.str ().c_str ());
>      }
>  
>    return tdesc;
>  }
> +
> +#ifndef GDBSERVER
> +
> +/* Wrapper used by std::unordered_map to generate hash for features set.  */
> +struct arc_gdbarch_features_hasher
> +{
> +  std::size_t
> +  operator() (const arc_gdbarch_features &features) const noexcept
> +  {
> +    return features.hash ();
> +  }
> +};
> +
> +/* Cache of previously created target descriptions, indexed by the hash
> +   of the features set used to create them.  */
> +static std::unordered_map<arc_gdbarch_features,
> +			  const target_desc *,
> +			  arc_gdbarch_features_hasher> arc_tdesc_cache;

Should this map hold unique_ptr of target_desc?  I understand that they
are never really deallocated anyway, but just to do it right and avoid
adding more leaks to the final AddressSanitizer report shown when you exit
GDB (and you have it build with AddressSanitizer)?

> +
> +/* See arch/arc.h.  */
> +
> +const target_desc *
> +arc_lookup_target_description (const struct arc_gdbarch_features &features)
> +{
> +  /* Lookup in the cache first.  */
> +  const auto it = arc_tdesc_cache.find (features);
> +  if (it != arc_tdesc_cache.end ())
> +    return it->second;
> +
> +  target_desc *tdesc = arc_create_target_description (features);
> +
> +  /* Add the newly created target description to the repertoire.  */
> +  arc_tdesc_cache.emplace (features, tdesc);
> +
> +  return tdesc;
> +}
> +
> +#endif /* !GDBSERVER */
> diff --git a/gdb/arch/arc.h b/gdb/arch/arc.h
> index fd806ae7d34..4f400c93a7d 100644
> --- a/gdb/arch/arc.h
> +++ b/gdb/arch/arc.h
> @@ -20,29 +20,65 @@
>  
>  #include "gdbsupport/tdesc.h"
>  
> -/* Supported ARC system hardware types.  */
> -enum arc_sys_type
> +/* Supported ARC ISAs.  */
> +enum arc_isa
>  {
> -  ARC_SYS_TYPE_ARCOMPACT = 0,	  /* ARC600 or ARC700 */
> -  ARC_SYS_TYPE_ARCV2,		  /* ARC EM or ARC HS */
> -  ARC_SYS_TYPE_NUM
> +  ARC_ISA_NONE = 0,
> +  ARC_ISA_ARCV1,    /* a.k.a. ARCompact (ARC600, ARC700)  */
> +  ARC_ISA_ARCV2	    /* such as ARC EM and ARC HS  */
>  };
>  
> -static inline const char *
> -arc_sys_type_to_str (const arc_sys_type type)
> +struct arc_gdbarch_features
>  {
> -  switch (type)
> -    {
> -    case ARC_SYS_TYPE_ARCOMPACT:
> -      return "ARC_SYS_TYPE_ARCOMPACT";
> -    case ARC_SYS_TYPE_ARCV2:
> -      return "ARC_SYS_TYPE_ARCV2";
> -    default:
> -      return "Invalid";
> -    }
> -}
> -
> -/* Create target description for the specified system type.  */
> -target_desc *arc_create_target_description (arc_sys_type sys_type);
> +  /* Register size in bytes.  Possible values are 4, and 8.  A 0 indicates
> +     an uninitialised value.  */
> +  int reg_size = 0;
> +
> +  /* See ARC_ISA enum.  */
> +  int isa = ARC_ISA_NONE;
> +
> +  /* Equality operator.  */
> +  bool operator== (const struct arc_gdbarch_features &rhs) const
> +  {
> +    return (reg_size == rhs.reg_size && isa == rhs.isa);
> +  }
> +
> +  /* Inequality operator.  */
> +  bool operator!= (const struct arc_gdbarch_features &rhs) const
> +  {
> +    return !(*this == rhs);
> +  }
> +
> +  /* Used by std::unordered_map to hash the feature sets.  The hash is
> +     calculated in the manner below:
> +     REG_SIZE |  ISA
> +      5-bits  | 4-bits  */
> +
> +  std::size_t hash () const noexcept
> +  {
> +    std::size_t val = ((reg_size & 0x1f) << 8 | (isa & 0xf) << 0);
> +    return val;

I don't know much about hashing, but I don't think makes for a very good hash
function.  It outputs similar hashes for similar inputs, and does not use the
range of possible hash values uniformly.

It would be more appropriate and typical to use std::hash on each member and
then combine the hashes.  Doing a quick search I found that Boost offers
something exactly for that called hash_combine:

https://www.boost.org/doc/libs/1_55_0/doc/html/hash/combine.html

Unfortunately, there is nothing similar in the standard library.  But

A bit off-topic: intuitively, I would have thought that it would have been
sufficient to just add the hash values together.  Anyone knows why this
wouldn't be a good idea?

Simon


More information about the Gdb-patches mailing list