This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] RISC-V: Implement programmatic arch parsing.


Hi Jim, Kito,

Thanks for reviewing my patches. I've written a different set of changes this
time, avoiding adding any new extensions, and just focusing on making parsing
more run-time programmable and selectable, add error checking, and add
some preliminary support for selectable parsing sets and rules (to handle
ISA spec parsing rule variations.)

> Other parts of the code aren't checking for duplicates.  Maybe we need
> to extend them to do this also?

I've modified the function `riscv_parse_sv_or_non_std_ext' to be able
to parse any prefix, based on flags and predicate functions defined in
the new structure parameter `riscv_parse_config_t'.

It now also knows about duplicate and alphabetical entries, but does
 the checking within the prefix class that it's parsing.

I've renamed the function `riscv_parse_sv_or_non_std_ext' to
`riscv_parse_arch_ext', to reflect it's general behaviour.

> Kito tells me that the LLVM folks are discussing this issue, and that
> we need to coordinate with them to ensure that the GNU and LLVM tools
> handle this the same way.  Hopefully Kito can do that.

Kito has opened a discussion thread to further discuss this here:
https://groups.google.com/a/groups.riscv.org/forum/?nomobile=true#!topic/sw-dev/aZhMG7NIVTk

In this patch I've also implemented some run-time selectable parsing
rules, based on the ISA version provided. This is implemented as follows:

A new structure is defined:

typedef struct riscv_isa_ver {
  bfd_boolean release_p;
  unsigned int major;
  unsigned int minor;
  unsigned int date;
} riscv_isa_ver_t;

The user can provide either a specific ISA version, encoded as
{TRUE, 2, 2, 0};
or a development, draft ISA version, encoded as
{FALSE, 0, 0, 20190702};

(the idea with draft ISA versions is probably not profitable, but I
threw it in just to see what people would think. )

A new function `riscv_find_spec_for_isa_ver' tries to find a
corresponding entry into `spectab', which is a table specifying a
parse order for a particular version. Currently it has one element:

static const riscv_isa_spec_t spectab[] =
  {
   {TRUE, 2, 2, 20170507, parse_order_0, ARRAY_SIZE (parse_order_0)}
  };

and in turn, `parse_order_0' is a pointer to a table which actually
specifies the extension parsing set and order:

static const riscv_parse_config_t parse_order_0[] =
  {
   {RV_ISA_CLASS_X, "x", "X", riscv_ext_x_valid_p},
   {RV_ISA_CLASS_S, "s", "S", riscv_ext_s_valid_p},
   {RV_ISA_CLASS_SX, "sx", "SX", riscv_ext_sx_valid_p},
   {RV_ISA_CLASS_UNKNOWN, NULL, NULL, NULL}
  };

The idea being, that in a future patch we can add another entry
that looks something like this:

static const riscv_parse_config_t parse_order_1[] =
  {
   {RV_ISA_CLASS_S, "s", "S", riscv_ext_s_valid_p},
   {RV_ISA_CLASS_SX, "sx", "SX", riscv_ext_sx_valid_p},
   {RV_ISA_CLASS_X, "x", "X", riscv_ext_x_valid_p},
   {RV_ISA_CLASS_Z, "z", "Z", riscv_ext_z_valid_p},
   {RV_ISA_CLASS_UNKNOWN, NULL, NULL, NULL}
  };

and the corresponding `spectab' table will look like

static const riscv_isa_spec_t spectab[] =
  {
   {TRUE, 2, 2, 20170507, parse_order_0, ARRAY_SIZE (parse_order_0)}
   {TRUE, 2, 3, yyyymmdd, parse_order_1, ARRAY_SIZE (parse_order_1)}
  };

But until then, we can define a "development" version, like so

static const riscv_isa_spec_t spectab[] =
  {
   {TRUE, 2, 2, 20170507, parse_order_0, ARRAY_SIZE (parse_order_0)}
   {FALSE, 0, 0, 20190702, parse_order_1, ARRAY_SIZE (parse_order_1)}
  };

...Finally with all the datastructures in place, The function
`riscv_parse_arch_ext' is called in order, for each entry in the
`parse_order_*' table (whichever is selected could be configured on
the command line.)

This way, adding different behaviour just means adding a new parse
order table, and a corresponding ISA spec version entry which points
to that new parse order table.

What do you think of this approach?  I can also send the patch that
changes the `riscv_parse_sv_or_non_std_ext` function separately, since
it isn't really involved that much with the rest of the patch.

Maxim

bfd/ChangeLog
2019-07-02  Maxim Blinov  <maxim.blinov@embecosm.com>
	* elfxx-riscv.c (riscv_isa_ext_class): New enum.
	(riscv_get_prefix_class): New function.
	(riscv_parse_config): New structure.
	(riscv_parse_arch_ext): Renamed from
	"riscv_parse_sv_or_non_std_ext".
	Add function parameter "config".
	Make error messages and parsing behaviour dependant on
	"config" parameter.
	Check extensions contain no duplicates.
	Check extensions are in alphabetical order.
	(riscv_std_s_ext_strtab): New string table.
	(riscv_ext_x_valid_p): New predicate.
	(riscv_ext_s_valid_p): New predicate.
	(riscv_ext_sx_valid_p): New predicate.
	(riscv_isa_spec): New structure.
	(riscv_parse_subset): Define parsing set and order at run-time.
	(riscv_find_spec_for_isa_ver): New function.
	(spectab): New table.

	* elfxx-riscv.h (riscv_isa_ver): New structure.

---
 bfd/elfxx-riscv.c | 282 ++++++++++++++++++++++++++++++++++++++++------
 bfd/elfxx-riscv.h |   7 ++
 2 files changed, 255 insertions(+), 34 deletions(-)

diff --git a/bfd/elfxx-riscv.c b/bfd/elfxx-riscv.c
index 245717f70f..25a5d33d62 100644
--- a/bfd/elfxx-riscv.c
+++ b/bfd/elfxx-riscv.c
@@ -1237,7 +1237,57 @@ riscv_parse_std_ext (riscv_parse_subset_t *rps,
   return p;
 }
 
-/* Parsing function for non-standard and supervisor extensions.
+typedef enum riscv_isa_ext_class
+  {
+   RV_ISA_CLASS_UNKNOWN,
+   RV_ISA_CLASS_X,
+   RV_ISA_CLASS_S,
+   RV_ISA_CLASS_SX,
+  } riscv_isa_ext_class_t;
+
+/* Deduce, from the input string, which class the given
+   architecture extension name belongs to.  */
+
+static riscv_isa_ext_class_t
+riscv_get_prefix_class (const char *arch)
+{
+  switch (*arch)
+    {
+    case 's':
+      if (strlen (arch) > 1 && arch[1] == 'x')
+	return RV_ISA_CLASS_SX;
+      else
+	return RV_ISA_CLASS_S;
+
+    case 'x': return RV_ISA_CLASS_X;
+    default: return RV_ISA_CLASS_UNKNOWN;
+    }
+}
+
+/* Structure used for describing how to parse architecture
+   directives other than the base "rvXXimafd...n" set.  */
+
+typedef struct riscv_parse_config
+{
+  /* Class of the extension. */
+  riscv_isa_ext_class_t class;
+
+  /* Lower-case prefix string for error printing
+     and internal parser usage, e.g. "s", "sx".  */
+  const char *prefix_lower;
+
+  /* Upper-case prefix string for error printing,
+     e.g. "SX". Used as "Error: SX extension ..."  */
+  const char *prefix_upper;
+
+  /* Predicate which is used for checking whether
+     this is a "known" extension. For 'x' and 'sx',
+     it always returns true (since they are by
+     definition non-standard and cannot be known.  */
+  int (*ext_valid_p) (const char *);
+} riscv_parse_config_t;
+
+/* Parsing function for arbitrary architecture extensions.
 
    Return Value:
      Points to the end of extensions.
@@ -1246,19 +1296,19 @@ riscv_parse_std_ext (riscv_parse_subset_t *rps,
      `rps`: Hooks and status for parsing subset.
      `march`: Full arch string.
      `p`: Curent parsing position.
-     `ext_type`: What kind of extensions, 'x', 's' or 'sx'.
-     `ext_type_str`: Full name for kind of extension.  */
+     `config`: Specifies parsing rules for this particular
+     parsing run.  */
 
 static const char *
-riscv_parse_sv_or_non_std_ext (riscv_parse_subset_t *rps,
-			       const char *march,
-			       const char *p,
-			       const char *ext_type,
-			       const char *ext_type_str)
+riscv_parse_arch_ext (riscv_parse_subset_t *rps,
+		      const char *march,
+		      const char *p,
+		      const riscv_parse_config_t *config)
 {
   unsigned major_version = 0;
   unsigned minor_version = 0;
-  size_t ext_type_len = strlen (ext_type);
+  const char *last_name;
+  riscv_isa_ext_class_t class;
 
   while (*p)
     {
@@ -1268,12 +1318,9 @@ riscv_parse_sv_or_non_std_ext (riscv_parse_subset_t *rps,
 	  continue;
 	}
 
-      if (strncmp (p, ext_type, ext_type_len) != 0)
-	break;
-
-      /* It's non-standard supervisor extension if it prefix with sx.  */
-      if ((ext_type[0] == 's') && (ext_type_len == 1)
-	  && (*(p + 1) == 'x'))
+      /* Assert that we're parsing the same class of extension.  */
+      class = riscv_get_prefix_class (p);
+      if (class != config->class)
 	break;
 
       char *subset = xstrdup (p);
@@ -1294,6 +1341,38 @@ riscv_parse_sv_or_non_std_ext (riscv_parse_subset_t *rps,
 
       *q = '\0';
 
+      /* Check that the extension name is well-formed.  */
+      if (!config->ext_valid_p (subset))
+	{
+	  rps->error_handler
+	    ("-march=%s: Invalid or unknown %s ISA extension: '%s'",
+	     march, config->prefix_upper, subset);
+	  free (subset);
+	  return NULL;
+	}
+
+      /* Check that the last item is not the same as this.  */
+      last_name = rps->subset_list->tail->name;
+
+      if (!strcasecmp (last_name, subset))
+	{
+	  rps->error_handler ("-march=%s: Duplicate %s ISA extension: \'%s\'",
+			      march, config->prefix_upper, subset);
+	  free (subset);
+	  return NULL;
+	}
+
+      /* Check that we are in alphabetical order within the subset.  */
+      if (!strncasecmp (last_name, config->prefix_lower, 1)
+	  && strcasecmp (last_name, subset) > 0)
+	{
+	  rps->error_handler ("-march=%s: %s ISA extension not in alphabetical order: "
+			      "\'%s\' must come before \'%s\'.",
+			      march, config->prefix_upper, subset, last_name);
+	  free (subset);
+	  return NULL;
+	}
+
       riscv_add_subset (rps->subset_list, subset, major_version, minor_version);
       free (subset);
       p += end_of_version - subset;
@@ -1301,7 +1380,7 @@ riscv_parse_sv_or_non_std_ext (riscv_parse_subset_t *rps,
       if (*p != '\0' && *p != '_')
 	{
 	  rps->error_handler ("-march=%s: %s must seperate with _",
-			      march, ext_type_str);
+			      march, config->prefix_lower);
 	  return NULL;
 	}
     }
@@ -1309,6 +1388,139 @@ riscv_parse_sv_or_non_std_ext (riscv_parse_subset_t *rps,
   return p;
 }
 
+/* Predicator function for x-prefixed extensions.
+   Anything goes, except the literal 'x'.  */
+
+static int
+riscv_ext_x_valid_p (const char *arg)
+{
+  if (!strcasecmp (arg, "x"))
+    return 0;
+
+  return 1;
+}
+
+/* Predicator function for 's' prefixed extensions.
+   Must be either literal 's', or a known s-prefixed extension.  */
+
+const char * const riscv_std_s_ext_strtab[] =
+  {
+  };
+
+static int
+riscv_ext_s_valid_p (const char *arg)
+{
+  const size_t tabsize = ARRAY_SIZE (riscv_std_s_ext_strtab);
+
+  if (strlen (arg) == 1 && *arg == 's')
+    return 1;
+
+  for (size_t i = 0; i < tabsize; ++i)
+    {
+      if (!strcasecmp (arg, riscv_std_s_ext_strtab[i]))
+	return 1;
+    }
+
+  return 0;
+}
+
+/* Predicator function for 'sx' prefixed extensions.
+   Anything goes, except the literal 'sx'.  */
+
+static int
+riscv_ext_sx_valid_p (const char *arg)
+{
+  if (!strcasecmp (arg, "sx"))
+    return 0;
+
+  return 1;
+}
+
+/* Standard parsing order that is currently used.  */
+
+static const riscv_parse_config_t parse_order_0[] =
+  {
+   {RV_ISA_CLASS_X, "x", "X", riscv_ext_x_valid_p},
+   {RV_ISA_CLASS_S, "s", "S", riscv_ext_s_valid_p},
+   {RV_ISA_CLASS_SX, "sx", "SX", riscv_ext_sx_valid_p},
+   {RV_ISA_CLASS_UNKNOWN, NULL, NULL, NULL}
+  };
+
+typedef struct riscv_isa_spec {
+  /* Is this version a release version?  */
+  int is_release_p;
+
+  /* If this is a release version, the provide
+     the major and minor release version, and refer to it.
+     For instance, `$ as -misa-version=2.2'.  */
+  unsigned int major_version;
+  unsigned int minor_version;
+
+  /* If this isn't a release version, then provide
+     the ISA draft date instead, and refer to it.
+     For instance, `$ as -misa-version=20190604'.  */
+  unsigned int date;
+
+  /* Pointer to a parsing order table, so we parse
+     flags on the command line in a manner that
+     corresponds to the ISA spec at that particular
+     point in time / release version.  */
+  const riscv_parse_config_t *parse_order;
+
+  /* sizeof the above table.  */
+  int parse_count;
+} riscv_isa_spec_t;
+
+/* Table specifying the different version variants, and
+   the corresponding parsing order that the entry corresponds to.
+   Keep this table sorted by date.  */
+
+static const riscv_isa_spec_t spectab[] =
+  {
+   {TRUE, 2, 2, 20170507, parse_order_0, ARRAY_SIZE (parse_order_0)}
+  };
+
+/* Try to find an entry into `spectab' that corresponds to
+   the given `ver' structure. If ver->release_p is true,
+   we only care about ISA spec version number. Otherwise, we only
+   care about ISA spec release date. `rps' is here for access
+   to the error handler.
+
+   If we failed to find one, return NULL.  */
+
+static const riscv_isa_spec_t *
+riscv_find_spec_for_isa_ver (const riscv_parse_subset_t *rps,
+			     const riscv_isa_ver_t *ver)
+{
+  size_t i;
+
+  if (ver->release_p)
+    {
+      /* Try to find a release version.  */
+      for (i = 0; i < ARRAY_SIZE (spectab); ++i)
+	{
+	  if (spectab[i].major_version == ver->major
+	      && spectab[i].minor_version == ver->minor)
+	    return &spectab[i];
+	}
+
+      rps->error_handler ("ISA version %i.%i is unknown.", ver->major, ver->minor);
+      return NULL;
+
+    }
+  else /* Not a release version, i.e. `-misa-ver=yyyymmdd`  */
+    {
+      for (i = 0; i < ARRAY_SIZE (spectab); ++i)
+	{
+	  if (spectab[i].date == ver->date)
+	    return &spectab[i];
+	}
+
+      rps->error_handler ("ISA draft version %i is unknown.", ver->date);
+      return NULL;
+    }
+}
+
 /* Function for parsing arch string.
 
    Return Value:
@@ -1322,6 +1534,18 @@ bfd_boolean
 riscv_parse_subset (riscv_parse_subset_t *rps,
 		    const char *arch)
 {
+  /* TODO: This can be a function argument in the future.  */
+  const riscv_isa_ver_t ver = {TRUE, 2, 2, 0};
+  /* If we wanted a non-standard version, we might do
+     const riscv_isa_ver_t ver = {FALSE, 0, 0, 20190702};
+  */
+
+  const riscv_isa_spec_t *spec = riscv_find_spec_for_isa_ver (rps, &ver);
+  const riscv_parse_config_t *config;
+
+  if (!spec)
+    return FALSE;
+
   const char *p = arch;
 
   if (strncmp (p, "rv32", 4) == 0)
@@ -1347,26 +1571,16 @@ riscv_parse_subset (riscv_parse_subset_t *rps,
   if (p == NULL)
     return FALSE;
 
-  /* Parsing non-standard extension.  */
-  p = riscv_parse_sv_or_non_std_ext (
-	rps, arch, p, "x", "non-standard extension");
+  /* Parsing all other extensions: The extensions and parsing order
+     is ISA spec-dependant, and so we do this programmatically here.  */
 
-  if (p == NULL)
-    return FALSE;
-
-  /* Parsing supervisor extension.  */
-  p = riscv_parse_sv_or_non_std_ext (
-	rps, arch, p, "s", "supervisor extension");
-
-  if (p == NULL)
-    return FALSE;
-
-  /* Parsing non-standard supervisor extension.  */
-  p = riscv_parse_sv_or_non_std_ext (
-	rps, arch, p, "sx", "non-standard supervisor extension");
+  for (int i = 0; i < spec->parse_count; ++i) {
+    config = &spec->parse_order[i];
+    p = riscv_parse_arch_ext (rps, arch, p, config);
 
-  if (p == NULL)
-    return FALSE;
+    if (p == NULL)
+      return FALSE;
+  }
 
   if (*p != '\0')
     {
diff --git a/bfd/elfxx-riscv.h b/bfd/elfxx-riscv.h
index 19f7bd2ecc..bea5224ecc 100644
--- a/bfd/elfxx-riscv.h
+++ b/bfd/elfxx-riscv.h
@@ -86,3 +86,10 @@ riscv_release_subset_list (riscv_subset_list_t *);
 
 extern char *
 riscv_arch_str (unsigned, const riscv_subset_list_t *);
+
+typedef struct riscv_isa_ver {
+  bfd_boolean release_p;
+  unsigned int major;
+  unsigned int minor;
+  unsigned int date;
+} riscv_isa_ver_t;
-- 
2.20.1


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]