[Integration 2/6] RISC-V/extended: Add assembler and dis-assembler hooks for extended extensions.

Nelson Chu nelson.chu@sifive.com
Tue Mar 30 09:36:53 GMT 2021


To keep the original functions clean, we try to provide assembler
and dis-assembler hooks as enough as possible for extended extensions.
We probably need to add more once they are not enough in the future.
However, there are several changes that might need to be discussed
as follows,

* Change the type of enums to int directly, to extend them for
extended extensions.  Not sure if the change is good enough, but
it should be the easiler way to extend enums.

* The draft and vender operands should be parsed in the extended
hooks, validate_riscv_extended_insn and riscv_parse_extended_operands.
Obviously, we may need to reparse the opernad string in the extended
hooks, when the original functions cannot recognize them.  But the
original functions have already pointed the parsed poniter to the
next characters.  Therefore, we should use a new pointer, opargStart,
to record the position before parsing, and then pass it to the hooks
when we need to reparse the extended operands.

Another problem is that the current name rules of opcode operands may
not be enough when we will support the more drafts and vendors.  We
used to use 'C' prefix for RVC extensions, and tend to use 'V' prefix
for RVV extensions.  So the 'X' prefix should be the best choice for
vendor extensions, and maybe the second letter can be used to represent
the vendors.  But that means we could only support at most 24 vendors.
The better solution is just like what we did in the riscv_std_z_ext_strtab
- use string as their prefix, and record them in the array of string.
I havn't done this yet since we don't need the vendor operands for now.

* Part of the "internal: unknown" errors are reported in the extended
hooks rather than the original functions.  For example, we used to
report the "internal: unreachable" in the riscv_multi_subset_supports,
to tell developers that they forgot to handle the new defined INSN_CLASS_.
And the function returns TRUE/FALSE if the instruction is allowed or not
according to the architecture string.  The riscv_extended_subset_supports
is the extended hook of riscv_multi_subset_supports, so it also returns
a bfd_boolean to check the same thing.  But it is hard to know if the
INSN_CLASS_ is unknown from the same returned bfd_boolean, unless we add
another new flag, or we just move the error report to the hook directly.
I choose the latter for now, but it may cause the code of mainline and
integration branches are inconsistent, which may affect the difficulty
of the regular merge between these two branches.

The same inconsistent problem also happens in riscv_parse_extended_operands.
The hook only parse an operand rather than all, so it just has a switch
without a for loop.  We used to set "continue" to skip the loop in the
switch, but the extended hook doesn't need the "continue".  Perhaps we
should use a single while/for in the hooks to keep the code consistent,
then the regular merge may be more easiler.

* Rename the variables to the more meaningful names in the riscv_ip,
validate_riscv_insn and print_insn_args.
- oparg: Renamed from args, means the arguments in the opcode table.
- opargStart: Added to record the start of the argument.
- asarg: Renamed from s, means the arguments of assembly string.
- asargStart: Renamed from argsStart.

* Extract the part that parsing the instruction opcode from the
riscv_disassemble_insn, since we will need to call it for many times
to search multiple opcode tables.

gas/
    * config/tc-riscv.c (enum EXTENDED_EXT_NUM): Added to choose the
    right extended opcode hashes in the riscv_find_extended_opcode_hash.
    (enum riscv_csr_class): Added CSR_CLASS_EXTENDED.
    (enum reg_class): Added RCLASS_EXTENDED_NUM.
    (enum reg_extended_class): Added to define extended registers.
    (struct riscv_csr_extra): Changed enum riscv_csr_class to int,
    to increase the expandability of enum.
    (riscv_init_csr_hash): Likewise.
    (extended_ext_version_table): Record default versions for
    extended extensions.
    (extended_ext_version_hash): Handle of the extended extensions
    with version hash table.
    (riscv_search_ext_version_hash): Handle more than one
    ext_version_hash hashes.
    (riscv_find_opcode_hash): Handle more than one opcode hashes.
    (md_begin): Included riscv-opc-extended.h to define extended CSR.
    (init_ext_version_hash): Updated.
    (riscv_get_default_ext_version): Likewise.
    (md_assemble): Likewise.
    (s_riscv_insn): Likewsie.
    (riscv_after_parse_args): Likewise.
    (riscv_find_extended_opcode_hash): Extended hook for riscv_find_opcode_hash.
    (riscv_extended_subset_supports): Extended hook for
    riscv_multi_subset_supports.
    (riscv_extended_csr_class_check): Extended hook for riscv_csr_address,
    to check the CSR ISA dependency.
    (extended_macro): Extended hook for macro.
    (validate_riscv_extended_insn): Extended hook for validate_riscv_insn.
    (extended_macro_build): Extended hook for macro_build.
    (riscv_parse_extended_operands): Extended hook for riscv_ip.
    (riscv_multi_subset_supports): Updated to call extended hook.
    (riscv_csr_address): Likewise
    (macro): Likewise.
    (validate_riscv_insn): Likewise.  Also define new variables, xxx
    and xxxStart, in case single letters are not enough to represent
    all extended operands.
    (macro_build): Likewise.
    (riscv_ip): Likewise.  The asarg means assembly operand string,
    and oparg means operand string defined in the opcode table.
    * testsuite/gas/riscv/extended/extended.exp: New file to run
    extended testcases.
include/
    * opcode/riscv-opc-extended.h: New file to define encoding macros
    and CSR for extended extensions.
    * opcode/riscv.h: Included riscv-opc-extended.h.
    (enum riscv_insn_class): Added INSN_CLASS_EXTENDED.
    (struct riscv_opcode): Same as struct riscv_csr_extra.
    (enum M_EXTENDED): Added to support extended pseudo macros.
opcode/
    * riscv-dis.c (print_extended_insn_args): Extended hook for
    print_insn_args.
    (print_insn_args): Updated to call extended hook, and same as
    what validate_riscv_insn does.  Also include riscv-opc-extended.h
    to show extended CSR correctly.
    * riscv-opc.c (riscv_extended_opcodes): Added to store all
    supported draft and vendor instruction opcodes.
---
 gas/config/tc-riscv.c                         | 766 ++++++++++++++++----------
 gas/testsuite/gas/riscv/extended/extended.exp |  22 +
 include/opcode/riscv-opc-extended.h           |  23 +
 include/opcode/riscv.h                        |  12 +-
 opcodes/riscv-dis.c                           | 217 +++++---
 opcodes/riscv-opc.c                           |   8 +
 6 files changed, 677 insertions(+), 371 deletions(-)
 create mode 100644 gas/testsuite/gas/riscv/extended/extended.exp
 create mode 100644 include/opcode/riscv-opc-extended.h

diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c
index 83cf157..01484f0 100644
--- a/gas/config/tc-riscv.c
+++ b/gas/config/tc-riscv.c
@@ -36,6 +36,13 @@
 
 #include <stdint.h>
 
+/* Used to choose the right opcode hashes for draft and vendor
+   extensions.  */
+enum
+{
+  EXTENDED_EXT_NUM = 0
+};
+
 /* Information about an instruction, including its format, operands
    and fixups.  */
 struct riscv_cl_insn
@@ -64,7 +71,8 @@ enum riscv_csr_class
   CSR_CLASS_I,
   CSR_CLASS_I_32, /* rv32 only */
   CSR_CLASS_F, /* f-ext only */
-  CSR_CLASS_DEBUG /* debug CSR */
+  CSR_CLASS_DEBUG, /* debug CSR */
+  CSR_CLASS_EXTENDED /* Extended CSR  */
 };
 
 /* This structure holds all restricted conditions for a CSR.  */
@@ -72,7 +80,7 @@ struct riscv_csr_extra
 {
   /* Class to which this CSR belongs.  Used to decide whether or
      not this CSR is legal in the current -march context.  */
-  enum riscv_csr_class csr_class;
+  int csr_class;
 
   /* CSR may have differnet numbers in the previous priv spec.  */
   unsigned address;
@@ -147,6 +155,13 @@ static const struct riscv_ext_version ext_version_table[] =
   {NULL, 0, 0, 0}
 };
 
+/* Default versions for draft and vendor extensions.  */
+static const struct riscv_ext_version extended_ext_version_table[] =
+{
+  /* Terminate the list.  */
+  {NULL, 0, 0, 0}
+};
+
 #ifndef DEFAULT_ARCH
 #define DEFAULT_ARCH "riscv64"
 #endif
@@ -307,8 +322,22 @@ riscv_subset_supports (const char *feature)
   return riscv_lookup_subset (&riscv_subsets, feature, &subset);
 }
 
+/* Check whether the draft and vendor extensions are enabled in
+   the architecture string.  */
+
 static bfd_boolean
-riscv_multi_subset_supports (enum riscv_insn_class insn_class)
+riscv_extended_subset_supports (int insn_class)
+{
+  switch (insn_class)
+    {
+    default:
+      as_fatal ("internal: unknown INSN_CLASS (0x%x)", insn_class);
+      return FALSE;
+    }
+}
+
+static bfd_boolean
+riscv_multi_subset_supports (int insn_class)
 {
   switch (insn_class)
     {
@@ -336,26 +365,24 @@ riscv_multi_subset_supports (enum riscv_insn_class insn_class)
 
     case INSN_CLASS_ZBB:
       return riscv_subset_supports ("zbb");
-
     case INSN_CLASS_ZBA:
       return riscv_subset_supports ("zba");
-
     case INSN_CLASS_ZBC:
       return riscv_subset_supports ("zbc");
 
     default:
-      as_fatal ("internal: unreachable");
-      return FALSE;
+      return riscv_extended_subset_supports (insn_class);
     }
 }
 
 /* Handle of the extension with version hash table.  */
 static htab_t ext_version_hash = NULL;
+/* Handle of the draft and vendor extensions with version hash table.  */
+static htab_t extended_ext_version_hash = NULL;
 
 static htab_t
-init_ext_version_hash (void)
+init_ext_version_hash (const struct riscv_ext_version *table)
 {
-  const struct riscv_ext_version *table = ext_version_table;
   htab_t hash = str_htab_create ();
   int i = 0;
 
@@ -374,17 +401,21 @@ init_ext_version_hash (void)
   return hash;
 }
 
-static void
-riscv_get_default_ext_version (const char *name,
+/* Find the suitable default version of extension from the
+   given version hashes.  */
+
+static bfd_boolean
+riscv_search_ext_version_hash (const char *name,
 			       int *major_version,
-			       int *minor_version)
+			       int *minor_version,
+			       htab_t table)
 {
   struct riscv_ext_version *ext;
 
-  if (name == NULL || default_isa_spec == ISA_SPEC_CLASS_NONE)
-    return;
+  if (table == NULL)
+    return FALSE;
 
-  ext = (struct riscv_ext_version *) str_hash_find (ext_version_hash, name);
+  ext = (struct riscv_ext_version *) str_hash_find (table, name);
   while (ext
 	 && ext->name
 	 && strcmp (ext->name, name) == 0)
@@ -394,10 +425,24 @@ riscv_get_default_ext_version (const char *name,
 	{
 	  *major_version = ext->major_version;
 	  *minor_version = ext->minor_version;
-	  return;
+	  return TRUE;
 	}
       ext++;
     }
+  return FALSE;
+}
+
+/* Get default version of extension according to the chosen ISA spec.  */
+
+static void
+riscv_get_default_ext_version (const char *name,
+			       int *major_version,
+			       int *minor_version)
+{
+  if (!riscv_search_ext_version_hash (name, major_version,
+				      minor_version, ext_version_hash))
+    riscv_search_ext_version_hash (name, major_version,
+				   minor_version, extended_ext_version_hash);
 }
 
 /* Set which ISA and extensions are available.  */
@@ -752,9 +797,15 @@ opcode_name_lookup (char **s)
   return o;
 }
 
+/* Draft and vendor registers.  */
+enum reg_extended_class
+{
+  RCLASS_EXTENDED_NUM,
+};
+
 enum reg_class
 {
-  RCLASS_GPR,
+  RCLASS_GPR = RCLASS_EXTENDED_NUM,
   RCLASS_FPR,
   RCLASS_MAX,
 
@@ -791,7 +842,7 @@ hash_reg_names (enum reg_class class, const char * const names[], unsigned n)
 static void
 riscv_init_csr_hash (const char *name,
 		     unsigned address,
-		     enum riscv_csr_class class,
+		     int class,
 		     enum riscv_spec_class define_version,
 		     enum riscv_spec_class abort_version)
 {
@@ -828,6 +879,19 @@ riscv_init_csr_hash (const char *name,
     pre_entry->next = entry;
 }
 
+/* Check the ISA dependency for extended CSR.  */
+
+static bfd_boolean
+riscv_extended_csr_class_check (int csr_class)
+{
+  switch (csr_class)
+    {
+    default:
+      as_bad (_("internal: bad RISC-V CSR class (0x%x)"), csr_class);
+    }
+  return FALSE;
+}
+
 /* Return the CSR address after checking the ISA dependency and
    the privileged spec version.
 
@@ -843,7 +907,7 @@ riscv_csr_address (const char *csr_name,
 		   struct riscv_csr_extra *entry)
 {
   struct riscv_csr_extra *saved_entry = entry;
-  enum riscv_csr_class csr_class = entry->csr_class;
+  int csr_class = entry->csr_class;
   bfd_boolean need_check_version = TRUE;
   bfd_boolean result = TRUE;
 
@@ -863,7 +927,8 @@ riscv_csr_address (const char *csr_name,
       need_check_version = FALSE;
       break;
     default:
-      as_bad (_("internal: bad RISC-V CSR class (0x%x)"), csr_class);
+      need_check_version = FALSE;
+      result = riscv_extended_csr_class_check (csr_class);
     }
 
   if (riscv_opts.csr_check && !result)
@@ -976,6 +1041,29 @@ arg_lookup (char **s, const char *const *array, size_t size, unsigned *regnop)
   return FALSE;
 }
 
+#define USE_BITS(mask,shift) (used_bits |= ((insn_t)(mask) << (shift)))
+
+/* Verify if the draft standard or vendor instructions are
+   valid or not.  */
+
+static bfd_boolean
+validate_riscv_extended_insn (insn_t *bits,
+			      const char **opcode_args)
+{
+  const char *oparg = *opcode_args;
+  insn_t used_bits = *bits;
+
+  switch (*oparg)
+    {
+    default:
+      return FALSE;
+    }
+
+  *opcode_args = oparg;
+  *bits = used_bits;
+  return TRUE;
+}
+
 /* For consistency checking, verify that all bits are specified either
    by the match/mask part of the instruction definition, or by the
    operand list. The `length` could be 0, 4 or 8, 0 for auto detection.  */
@@ -983,8 +1071,7 @@ arg_lookup (char **s, const char *const *array, size_t size, unsigned *regnop)
 static bfd_boolean
 validate_riscv_insn (const struct riscv_opcode *opc, int length)
 {
-  const char *p = opc->args;
-  char c;
+  const char *oparg, *opargStart;
   insn_t used_bits = opc->mask;
   int insn_width;
   insn_t required_bits;
@@ -1003,131 +1090,125 @@ validate_riscv_insn (const struct riscv_opcode *opc, int length)
       return FALSE;
     }
 
-#define USE_BITS(mask,shift)	(used_bits |= ((insn_t)(mask) << (shift)))
-  while (*p)
-    switch (c = *p++)
-      {
-      case 'C': /* RVC */
-	switch (c = *p++)
-	  {
-	  case 'U': break; /* CRS1, constrained to equal RD.  */
-	  case 'c': break; /* CRS1, constrained to equal sp.  */
-	  case 'T': /* CRS2, floating point.  */
-	  case 'V': USE_BITS (OP_MASK_CRS2, OP_SH_CRS2); break;
-	  case 'S': /* CRS1S, floating point.  */
-	  case 's': USE_BITS (OP_MASK_CRS1S, OP_SH_CRS1S); break;
-	  case 'w': break; /* CRS1S, constrained to equal RD.  */
-	  case 'D': /* CRS2S, floating point.  */
-	  case 't': USE_BITS (OP_MASK_CRS2S, OP_SH_CRS2S); break;
-	  case 'x': break; /* CRS2S, constrained to equal RD.  */
-	  case 'z': break; /* CRS2S, constrained to be x0.  */
-	  case '>': /* CITYPE immediate, compressed shift.  */
-	  case 'u': /* CITYPE immediate, compressed lui.  */
-	  case 'v': /* CITYPE immediate, li to compressed lui.  */
-	  case 'o': /* CITYPE immediate, allow zero.  */
-	  case 'j': used_bits |= ENCODE_CITYPE_IMM (-1U); break;
-	  case 'L': used_bits |= ENCODE_CITYPE_ADDI16SP_IMM (-1U); break;
-	  case 'm': used_bits |= ENCODE_CITYPE_LWSP_IMM (-1U); break;
-	  case 'n': used_bits |= ENCODE_CITYPE_LDSP_IMM (-1U); break;
-	  case '6': used_bits |= ENCODE_CSSTYPE_IMM (-1U); break;
-	  case 'M': used_bits |= ENCODE_CSSTYPE_SWSP_IMM (-1U); break;
-	  case 'N': used_bits |= ENCODE_CSSTYPE_SDSP_IMM (-1U); break;
-	  case '8': used_bits |= ENCODE_CIWTYPE_IMM (-1U); break;
-	  case 'K': used_bits |= ENCODE_CIWTYPE_ADDI4SPN_IMM (-1U); break;
-	  /* CLTYPE and CSTYPE have the same immediate encoding.  */
-	  case '5': used_bits |= ENCODE_CLTYPE_IMM (-1U); break;
-	  case 'k': used_bits |= ENCODE_CLTYPE_LW_IMM (-1U); break;
-	  case 'l': used_bits |= ENCODE_CLTYPE_LD_IMM (-1U); break;
-	  case 'p': used_bits |= ENCODE_CBTYPE_IMM (-1U); break;
-	  case 'a': used_bits |= ENCODE_CJTYPE_IMM (-1U); break;
-	  case 'F': /* Compressed funct for .insn directive.  */
-	    switch (c = *p++)
-	      {
-		case '6': USE_BITS (OP_MASK_CFUNCT6, OP_SH_CFUNCT6); break;
-		case '4': USE_BITS (OP_MASK_CFUNCT4, OP_SH_CFUNCT4); break;
-		case '3': USE_BITS (OP_MASK_CFUNCT3, OP_SH_CFUNCT3); break;
-		case '2': USE_BITS (OP_MASK_CFUNCT2, OP_SH_CFUNCT2); break;
-		default:
-		  as_bad (_("internal: bad RISC-V opcode "
-			    "(unknown operand type `CF%c'): %s %s"),
-			  c, opc->name, opc->args);
-		  return FALSE;
-	      }
-	    break;
-	  default:
-	    as_bad (_("internal: bad RISC-V opcode "
-		      "(unknown operand type `C%c'): %s %s"),
-		    c, opc->name, opc->args);
-	    return FALSE;
-	  }
-	break;
-      case ',': break;
-      case '(': break;
-      case ')': break;
-      case '<': USE_BITS (OP_MASK_SHAMTW, OP_SH_SHAMTW); break;
-      case '>': USE_BITS (OP_MASK_SHAMT, OP_SH_SHAMT); break;
-      case 'A': break; /* Macro operand, must be symbol.  */
-      case 'B': break; /* Macro operand, must be symbol or constant.  */
-      case 'I': break; /* Macro operand, must be constant.  */
-      case 'D': /* RD, floating point.  */
-      case 'd': USE_BITS (OP_MASK_RD, OP_SH_RD); break;
-      case 'Z': /* RS1, CSR number.  */
-      case 'S': /* RS1, floating point.  */
-      case 's': USE_BITS (OP_MASK_RS1, OP_SH_RS1); break;
-      case 'U': /* RS1 and RS2 are the same, floating point.  */
-	USE_BITS (OP_MASK_RS1, OP_SH_RS1);
-	/* Fall through.  */
-      case 'T': /* RS2, floating point.  */
-      case 't': USE_BITS (OP_MASK_RS2, OP_SH_RS2); break;
-      case 'R': /* RS3, floating point.  */
-      case 'r': USE_BITS (OP_MASK_RS3, OP_SH_RS3); break;
-      case 'm': USE_BITS (OP_MASK_RM, OP_SH_RM); break;
-      case 'E': USE_BITS (OP_MASK_CSR, OP_SH_CSR); break;
-      case 'P': USE_BITS (OP_MASK_PRED, OP_SH_PRED); break;
-      case 'Q': USE_BITS (OP_MASK_SUCC, OP_SH_SUCC); break;
-      case 'o': /* ITYPE immediate, load displacement.  */
-      case 'j': used_bits |= ENCODE_ITYPE_IMM (-1U); break;
-      case 'a': used_bits |= ENCODE_JTYPE_IMM (-1U); break;
-      case 'p': used_bits |= ENCODE_BTYPE_IMM (-1U); break;
-      case 'q': used_bits |= ENCODE_STYPE_IMM (-1U); break;
-      case 'u': used_bits |= ENCODE_UTYPE_IMM (-1U); break;
-      case 'z': break; /* Zero immediate.  */
-      case '[': break; /* Unused operand.  */
-      case ']': break; /* Unused operand.  */
-      case '0': break; /* AMO displacement, must to zero.  */
-      case '1': break; /* Relaxation operand.  */
-      case 'F': /* Funct for .insn directive.  */
-	switch (c = *p++)
-	  {
-	    case '7': USE_BITS (OP_MASK_FUNCT7, OP_SH_FUNCT7); break;
-	    case '3': USE_BITS (OP_MASK_FUNCT3, OP_SH_FUNCT3); break;
-	    case '2': USE_BITS (OP_MASK_FUNCT2, OP_SH_FUNCT2); break;
-	    default:
-	      as_bad (_("internal: bad RISC-V opcode "
-			"(unknown operand type `F%c'): %s %s"),
-		      c, opc->name, opc->args);
-	    return FALSE;
-	  }
-	break;
-      case 'O': /* Opcode for .insn directive.  */
-	switch (c = *p++)
-	  {
-	    case '4': USE_BITS (OP_MASK_OP, OP_SH_OP); break;
-	    case '2': USE_BITS (OP_MASK_OP2, OP_SH_OP2); break;
+  for (oparg = opc->args; *oparg; ++oparg)
+    {
+      opargStart = oparg;
+      switch (*oparg)
+	{
+	case 'C': /* RVC */
+	  switch (*++oparg)
+	    {
+	    case 'U': break; /* CRS1, constrained to equal RD.  */
+	    case 'c': break; /* CRS1, constrained to equal sp.  */
+	    case 'T': /* CRS2, floating point.  */
+	    case 'V': USE_BITS (OP_MASK_CRS2, OP_SH_CRS2); break;
+	    case 'S': /* CRS1S, floating point.  */
+	    case 's': USE_BITS (OP_MASK_CRS1S, OP_SH_CRS1S); break;
+	    case 'w': break; /* CRS1S, constrained to equal RD.  */
+	    case 'D': /* CRS2S, floating point.  */
+	    case 't': USE_BITS (OP_MASK_CRS2S, OP_SH_CRS2S); break;
+	    case 'x': break; /* CRS2S, constrained to equal RD.  */
+	    case 'z': break; /* CRS2S, constrained to be x0.  */
+	    case '>': /* CITYPE immediate, compressed shift.  */
+	    case 'u': /* CITYPE immediate, compressed lui.  */
+	    case 'v': /* CITYPE immediate, li to compressed lui.  */
+	    case 'o': /* CITYPE immediate, allow zero.  */
+	    case 'j': used_bits |= ENCODE_CITYPE_IMM (-1U); break;
+	    case 'L': used_bits |= ENCODE_CITYPE_ADDI16SP_IMM (-1U); break;
+	    case 'm': used_bits |= ENCODE_CITYPE_LWSP_IMM (-1U); break;
+	    case 'n': used_bits |= ENCODE_CITYPE_LDSP_IMM (-1U); break;
+	    case '6': used_bits |= ENCODE_CSSTYPE_IMM (-1U); break;
+	    case 'M': used_bits |= ENCODE_CSSTYPE_SWSP_IMM (-1U); break;
+	    case 'N': used_bits |= ENCODE_CSSTYPE_SDSP_IMM (-1U); break;
+	    case '8': used_bits |= ENCODE_CIWTYPE_IMM (-1U); break;
+	    case 'K': used_bits |= ENCODE_CIWTYPE_ADDI4SPN_IMM (-1U); break;
+	    /* CLTYPE and CSTYPE have the same immediate encoding.  */
+	    case '5': used_bits |= ENCODE_CLTYPE_IMM (-1U); break;
+	    case 'k': used_bits |= ENCODE_CLTYPE_LW_IMM (-1U); break;
+	    case 'l': used_bits |= ENCODE_CLTYPE_LD_IMM (-1U); break;
+	    case 'p': used_bits |= ENCODE_CBTYPE_IMM (-1U); break;
+	    case 'a': used_bits |= ENCODE_CJTYPE_IMM (-1U); break;
+	    case 'F': /* Compressed funct for .insn directive.  */
+	      switch (*++oparg)
+		{
+		  case '6': USE_BITS (OP_MASK_CFUNCT6, OP_SH_CFUNCT6); break;
+		  case '4': USE_BITS (OP_MASK_CFUNCT4, OP_SH_CFUNCT4); break;
+		  case '3': USE_BITS (OP_MASK_CFUNCT3, OP_SH_CFUNCT3); break;
+		  case '2': USE_BITS (OP_MASK_CFUNCT2, OP_SH_CFUNCT2); break;
+		  default:
+		    goto validate_extended_insn;
+		}
+	      break;
 	    default:
+	      goto validate_extended_insn;
+	    }
+	  break;
+	case ',': break;
+	case '(': break;
+	case ')': break;
+	case '<': USE_BITS (OP_MASK_SHAMTW, OP_SH_SHAMTW); break;
+	case '>': USE_BITS (OP_MASK_SHAMT, OP_SH_SHAMT); break;
+	case 'A': break; /* Macro operand, must be symbol.  */
+	case 'B': break; /* Macro operand, must be symbol or constant.  */
+	case 'I': break; /* Macro operand, must be constant.  */
+	case 'D': /* RD, floating point.  */
+	case 'd': USE_BITS (OP_MASK_RD, OP_SH_RD); break;
+	case 'Z': /* RS1, CSR number.  */
+	case 'S': /* RS1, floating point.  */
+	case 's': USE_BITS (OP_MASK_RS1, OP_SH_RS1); break;
+	case 'U': /* RS1 and RS2 are the same, floating point.  */
+	  USE_BITS (OP_MASK_RS1, OP_SH_RS1);
+	  /* Fall through.  */
+	case 'T': /* RS2, floating point.  */
+	case 't': USE_BITS (OP_MASK_RS2, OP_SH_RS2); break;
+	case 'R': /* RS3, floating point.  */
+	case 'r': USE_BITS (OP_MASK_RS3, OP_SH_RS3); break;
+	case 'm': USE_BITS (OP_MASK_RM, OP_SH_RM); break;
+	case 'E': USE_BITS (OP_MASK_CSR, OP_SH_CSR); break;
+	case 'P': USE_BITS (OP_MASK_PRED, OP_SH_PRED); break;
+	case 'Q': USE_BITS (OP_MASK_SUCC, OP_SH_SUCC); break;
+	case 'o': /* ITYPE immediate, load displacement.  */
+	case 'j': used_bits |= ENCODE_ITYPE_IMM (-1U); break;
+	case 'a': used_bits |= ENCODE_JTYPE_IMM (-1U); break;
+	case 'p': used_bits |= ENCODE_BTYPE_IMM (-1U); break;
+	case 'q': used_bits |= ENCODE_STYPE_IMM (-1U); break;
+	case 'u': used_bits |= ENCODE_UTYPE_IMM (-1U); break;
+	case 'z': break; /* Zero immediate.  */
+	case '[': break; /* Unused operand.  */
+	case ']': break; /* Unused operand.  */
+	case '0': break; /* AMO displacement, must to zero.  */
+	case '1': break; /* Relaxation operand.  */
+	case 'F': /* Funct for .insn directive.  */
+	  switch (*++oparg)
+	    {
+	      case '7': USE_BITS (OP_MASK_FUNCT7, OP_SH_FUNCT7); break;
+	      case '3': USE_BITS (OP_MASK_FUNCT3, OP_SH_FUNCT3); break;
+	      case '2': USE_BITS (OP_MASK_FUNCT2, OP_SH_FUNCT2); break;
+	      default:
+		goto validate_extended_insn;
+	    }
+	  break;
+	case 'O': /* Opcode for .insn directive.  */
+	  switch (*++oparg)
+	    {
+	      case '4': USE_BITS (OP_MASK_OP, OP_SH_OP); break;
+	      case '2': USE_BITS (OP_MASK_OP2, OP_SH_OP2); break;
+	      default:
+		goto validate_extended_insn;
+	    }
+	  break;
+	default:
+	validate_extended_insn:
+	  oparg = opargStart;
+	  if (!validate_riscv_extended_insn (&used_bits, &oparg))
+	    {
 	      as_bad (_("internal: bad RISC-V opcode "
-			"(unknown operand type `F%c'): %s %s"),
-		      c, opc->name, opc->args);
-	     return FALSE;
-	  }
-	break;
-      default:
-	as_bad (_("internal: bad RISC-V opcode "
-		  "(unknown operand type `%c'): %s %s"),
-		c, opc->name, opc->args);
-	return FALSE;
-      }
-#undef USE_BITS
+			"(unknown operand type `%s'): %s %s"),
+		      opargStart, opc->name, opc->args);
+	      return FALSE;
+	    }
+	}
+    }
   if (used_bits != required_bits)
     {
       as_bad (_("internal: bad RISC-V opcode "
@@ -1139,6 +1220,8 @@ validate_riscv_insn (const struct riscv_opcode *opc, int length)
   return TRUE;
 }
 
+#undef USE_BITS
+
 struct percent_op_match
 {
   const char *str;
@@ -1212,6 +1295,7 @@ md_begin (void)
 #define DECLARE_CSR_ALIAS(name, num, class, define_version, abort_version) \
   DECLARE_CSR(name, num, class, define_version, abort_version);
 #include "opcode/riscv-opc.h"
+#include "opcode/riscv-opc-extended.h"
 #undef DECLARE_CSR
 
   opcode_names_hash = str_htab_create ();
@@ -1309,6 +1393,68 @@ append_insn (struct riscv_cl_insn *ip, expressionS *address_expr,
     }
 }
 
+/* Find the draft standard and vendor extensions from
+   the specific opcode hashes.  */
+
+static struct riscv_opcode *
+riscv_find_extended_opcode_hash (char *str ATTRIBUTE_UNUSED)
+{
+  struct riscv_opcode *insn = NULL;
+  unsigned int i;
+
+  for (i = 0; i < EXTENDED_EXT_NUM; i++)
+    {
+      if (insn != NULL)
+        break;
+
+      switch (i)
+	{
+	default:
+	  break;
+	}
+    }
+  return insn;
+}
+
+/* Find instruction opcode from the opcode hashes.  */
+
+static struct riscv_opcode *
+riscv_find_opcode_hash (char *str, bfd_boolean is_insn_directive)
+{
+  struct riscv_opcode *insn;
+
+  if (is_insn_directive)
+    insn = (struct riscv_opcode *) str_hash_find (insn_type_hash, str);
+  else
+    {
+      insn = (struct riscv_opcode *) str_hash_find (op_hash, str);
+      if (insn == NULL)
+	insn = riscv_find_extended_opcode_hash (str);
+    }
+  return insn;
+}
+
+/* Macro expansion for extended pseudo instruction.  */
+
+static bfd_boolean
+extended_macro_build (struct riscv_cl_insn* insn_p,
+		      const char **fmt_p,
+		      va_list args ATTRIBUTE_UNUSED)
+{
+  struct riscv_cl_insn insn = *insn_p;
+  const char *fmt = *fmt_p;
+
+  switch (*fmt)
+    {
+    default:
+      return FALSE;
+    }
+
+  *insn_p = insn;
+  *fmt_p = fmt;
+  return TRUE;
+}
+
 /* Build an instruction created by a macro expansion.  This is passed
    a pointer to the count of instructions created so far, an expression,
    the name of the instruction to build, an operand format string, and
@@ -1321,11 +1467,12 @@ macro_build (expressionS *ep, const char *name, const char *fmt, ...)
   struct riscv_cl_insn insn;
   bfd_reloc_code_real_type r;
   va_list args;
+  const char *fmtStart;
 
   va_start (args, fmt);
 
   r = BFD_RELOC_UNUSED;
-  mo = (struct riscv_opcode *) str_hash_find (op_hash, name);
+  mo = riscv_find_opcode_hash ((char *) name, FALSE);
   gas_assert (mo);
 
   /* Find a non-RVC variant of the instruction.  append_insn will compress
@@ -1335,9 +1482,10 @@ macro_build (expressionS *ep, const char *name, const char *fmt, ...)
   gas_assert (strcmp (name, mo->name) == 0);
 
   create_insn (&insn, mo);
-  for (;;)
+  for (;; ++fmt)
     {
-      switch (*fmt++)
+      fmtStart = fmt;
+      switch (*fmt)
 	{
 	case 'd':
 	  INSERT_OPERAND (RD, insn, va_arg (args, int));
@@ -1363,7 +1511,10 @@ macro_build (expressionS *ep, const char *name, const char *fmt, ...)
 	case ',':
 	  continue;
 	default:
-	  as_fatal (_("internal: invalid macro"));
+	  fmt = fmtStart;
+	  if (extended_macro_build (&insn, &fmt, args))
+	    continue;
+	  as_fatal (_("internal: invalid macro argument `%s'"), fmtStart);
 	}
       break;
     }
@@ -1551,6 +1702,22 @@ riscv_ext (int destreg, int srcreg, unsigned shift, bfd_boolean sign)
     }
 }
 
+/* Expand RISC-V extended assembly macros into one or more instructions.  */
+
+static bfd_boolean
+extended_macro (struct riscv_cl_insn *ip ATTRIBUTE_UNUSED,
+		int mask,
+		expressionS *imm_expr ATTRIBUTE_UNUSED,
+		bfd_reloc_code_real_type *imm_reloc ATTRIBUTE_UNUSED)
+{
+  switch (mask)
+    {
+    default:
+      return FALSE;
+    }
+  return TRUE;
+}
+
 /* Expand RISC-V assembly macros into one or more instructions.  */
 
 static void
@@ -1690,7 +1857,8 @@ macro (struct riscv_cl_insn *ip, expressionS *imm_expr,
       break;
 
     default:
-      as_bad (_("internal: macro %s not implemented"), ip->insn_mo->name);
+      if (!extended_macro (ip, mask, imm_expr, imm_reloc))
+	as_bad (_("internal: macro %s not implemented"), ip->insn_mo->name);
       break;
     }
 }
@@ -1953,21 +2121,44 @@ riscv_is_priv_insn (insn_t insn)
 	  || ((insn ^ MATCH_SFENCE_VM) & MASK_SFENCE_VM) == 0);
 }
 
+/* Parse all draft and vendor operands for riscv_ip.  */
+
+static bfd_boolean
+riscv_parse_extended_operands (struct riscv_cl_insn *ip ATTRIBUTE_UNUSED,
+			       expressionS *imm_expr ATTRIBUTE_UNUSED,
+			       bfd_reloc_code_real_type *imm_reloc ATTRIBUTE_UNUSED,
+			       const char **opcode_args,
+			       char **assembly_args)
+{
+  const char *oparg = *opcode_args;
+  char *asarg = *assembly_args;
+
+  switch (*oparg)
+    {
+    default:
+      as_fatal (_("internal: unknown argument type `%s'"),
+		*opcode_args);
+      return FALSE;
+    }
+
+  *opcode_args = oparg;
+  *assembly_args = asarg;
+  return TRUE;
+}
+
 /* This routine assembles an instruction into its binary format.  As a
    side effect, it sets the global variable imm_reloc to the type of
    relocation to do if one of the operands is an address expression.  */
 
 static const char *
 riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
-	  bfd_reloc_code_real_type *imm_reloc, htab_t hash)
+	  bfd_reloc_code_real_type *imm_reloc, bfd_boolean is_insn_directive)
 {
-  char *s;
-  const char *args;
-  char c = 0;
+  const char *oparg, *opargStart;
+  char *asarg, *asargStart;
+  char save_c = 0;
   struct riscv_opcode *insn;
-  char *argsStart;
   unsigned int regno;
-  char save_c = 0;
   int argnum;
   const struct percent_op_match *p;
   const char *error = "unrecognized opcode";
@@ -1976,17 +2167,17 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 
   /* Parse the name of the instruction.  Terminate the string if whitespace
      is found so that str_hash_find only sees the name part of the string.  */
-  for (s = str; *s != '\0'; ++s)
-    if (ISSPACE (*s))
+  for (asarg = str; *asarg!= '\0'; ++asarg)
+    if (ISSPACE (*asarg))
       {
-	save_c = *s;
-	*s++ = '\0';
+	save_c = *asarg;
+	*asarg++ = '\0';
 	break;
       }
 
-  insn = (struct riscv_opcode *) str_hash_find (hash, str);
+  insn = riscv_find_opcode_hash (str, is_insn_directive);
 
-  argsStart = s;
+  asargStart = asarg;
   for ( ; insn && insn->name && strcmp (insn->name, str) == 0; insn++)
     {
       if ((insn->xlen_requirement != 0) && (xlen != insn->xlen_requirement))
@@ -2002,10 +2193,11 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
       *imm_reloc = BFD_RELOC_UNUSED;
       p = percent_op_itype;
 
-      for (args = insn->args;; ++args)
+      for (oparg = insn->args;; ++oparg)
 	{
-	  s += strspn (s, " \t");
-	  switch (*args)
+	  opargStart = oparg;
+	  asarg += strspn (asarg, " \t");
+	  switch (*oparg)
 	    {
 	    case '\0': /* End of args.  */
 	      if (insn->pinfo != INSN_MACRO)
@@ -2032,12 +2224,12 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		      /* Restore the character in advance, since we want to
 			 report the detailed warning message here.  */
 		      if (save_c)
-			*(argsStart - 1) = save_c;
+			*(asargStart - 1) = save_c;
 		      as_warn (_("read-only CSR is written `%s'"), str);
 		      insn_with_csr = FALSE;
 		    }
 		}
-	      if (*s != '\0')
+	      if (*asarg != '\0')
 		break;
 	      /* Successful assembly.  */
 	      error = NULL;
@@ -2045,63 +2237,63 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 	      goto out;
 
 	    case 'C': /* RVC */
-	      switch (*++args)
+	      switch (*++oparg)
 		{
 		case 's': /* RS1 x8-x15.  */
-		  if (!reg_lookup (&s, RCLASS_GPR, &regno)
+		  if (!reg_lookup (&asarg, RCLASS_GPR, &regno)
 		      || !(regno >= 8 && regno <= 15))
 		    break;
 		  INSERT_OPERAND (CRS1S, *ip, regno % 8);
 		  continue;
 		case 'w': /* RS1 x8-x15, constrained to equal RD x8-x15.  */
-		  if (!reg_lookup (&s, RCLASS_GPR, &regno)
+		  if (!reg_lookup (&asarg, RCLASS_GPR, &regno)
 		      || EXTRACT_OPERAND (CRS1S, ip->insn_opcode) + 8 != regno)
 		    break;
 		  continue;
 		case 't': /* RS2 x8-x15.  */
-		  if (!reg_lookup (&s, RCLASS_GPR, &regno)
+		  if (!reg_lookup (&asarg, RCLASS_GPR, &regno)
 		      || !(regno >= 8 && regno <= 15))
 		    break;
 		  INSERT_OPERAND (CRS2S, *ip, regno % 8);
 		  continue;
 		case 'x': /* RS2 x8-x15, constrained to equal RD x8-x15.  */
-		  if (!reg_lookup (&s, RCLASS_GPR, &regno)
+		  if (!reg_lookup (&asarg, RCLASS_GPR, &regno)
 		      || EXTRACT_OPERAND (CRS2S, ip->insn_opcode) + 8 != regno)
 		    break;
 		  continue;
 		case 'U': /* RS1, constrained to equal RD.  */
-		  if (!reg_lookup (&s, RCLASS_GPR, &regno)
+		  if (!reg_lookup (&asarg, RCLASS_GPR, &regno)
 		      || EXTRACT_OPERAND (RD, ip->insn_opcode) != regno)
 		    break;
 		  continue;
 		case 'V': /* RS2 */
-		  if (!reg_lookup (&s, RCLASS_GPR, &regno))
+		  if (!reg_lookup (&asarg, RCLASS_GPR, &regno))
 		    break;
 		  INSERT_OPERAND (CRS2, *ip, regno);
 		  continue;
 		case 'c': /* RS1, constrained to equal sp.  */
-		  if (!reg_lookup (&s, RCLASS_GPR, &regno)
+		  if (!reg_lookup (&asarg, RCLASS_GPR, &regno)
 		      || regno != X_SP)
 		    break;
 		  continue;
 		case 'z': /* RS2, constrained to equal x0.  */
-		  if (!reg_lookup (&s, RCLASS_GPR, &regno)
+		  if (!reg_lookup (&asarg, RCLASS_GPR, &regno)
 		      || regno != 0)
 		    break;
 		  continue;
 		case '>':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || imm_expr->X_add_number <= 0
 		      || imm_expr->X_add_number >= 64)
 		    break;
 		  ip->insn_opcode |= ENCODE_CITYPE_IMM (imm_expr->X_add_number);
 		rvc_imm_done:
-		  s = expr_end;
+		  asarg = expr_end;
 		  imm_expr->X_op = O_absent;
 		  continue;
 		case '5':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || imm_expr->X_add_number < 0
 		      || imm_expr->X_add_number >= 32
@@ -2110,7 +2302,7 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		  ip->insn_opcode |= ENCODE_CLTYPE_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case '6':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || imm_expr->X_add_number < 0
 		      || imm_expr->X_add_number >= 64
@@ -2119,7 +2311,7 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		  ip->insn_opcode |= ENCODE_CSSTYPE_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case '8':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || imm_expr->X_add_number < 0
 		      || imm_expr->X_add_number >= 256
@@ -2128,7 +2320,7 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		  ip->insn_opcode |= ENCODE_CIWTYPE_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case 'j':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || imm_expr->X_add_number == 0
 		      || !VALID_CITYPE_IMM ((valueT) imm_expr->X_add_number))
@@ -2136,27 +2328,27 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		  ip->insn_opcode |= ENCODE_CITYPE_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case 'k':
-		  if (riscv_handle_implicit_zero_offset (imm_expr, s))
+		  if (riscv_handle_implicit_zero_offset (imm_expr, asarg))
 		    continue;
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || !VALID_CLTYPE_LW_IMM ((valueT) imm_expr->X_add_number))
 		    break;
 		  ip->insn_opcode |= ENCODE_CLTYPE_LW_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case 'l':
-		  if (riscv_handle_implicit_zero_offset (imm_expr, s))
+		  if (riscv_handle_implicit_zero_offset (imm_expr, asarg))
 		    continue;
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || !VALID_CLTYPE_LD_IMM ((valueT) imm_expr->X_add_number))
 		    break;
 		  ip->insn_opcode |= ENCODE_CLTYPE_LD_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case 'm':
-		  if (riscv_handle_implicit_zero_offset (imm_expr, s))
+		  if (riscv_handle_implicit_zero_offset (imm_expr, asarg))
 		    continue;
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || !VALID_CITYPE_LWSP_IMM ((valueT) imm_expr->X_add_number))
 		    break;
@@ -2164,9 +2356,9 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		    ENCODE_CITYPE_LWSP_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case 'n':
-		  if (riscv_handle_implicit_zero_offset (imm_expr, s))
+		  if (riscv_handle_implicit_zero_offset (imm_expr, asarg))
 		    continue;
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || !VALID_CITYPE_LDSP_IMM ((valueT) imm_expr->X_add_number))
 		    break;
@@ -2174,7 +2366,7 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		    ENCODE_CITYPE_LDSP_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case 'o':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      /* C.addiw, c.li, and c.andi allow zero immediate.
 			 C.addi allows zero immediate as hint.  Otherwise this
@@ -2184,7 +2376,7 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		  ip->insn_opcode |= ENCODE_CITYPE_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case 'K':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || imm_expr->X_add_number == 0
 		      || !VALID_CIWTYPE_ADDI4SPN_IMM ((valueT) imm_expr->X_add_number))
@@ -2193,7 +2385,7 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		    ENCODE_CIWTYPE_ADDI4SPN_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case 'L':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || !VALID_CITYPE_ADDI16SP_IMM ((valueT) imm_expr->X_add_number))
 		    break;
@@ -2201,9 +2393,9 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		    ENCODE_CITYPE_ADDI16SP_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case 'M':
-		  if (riscv_handle_implicit_zero_offset (imm_expr, s))
+		  if (riscv_handle_implicit_zero_offset (imm_expr, asarg))
 		    continue;
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || !VALID_CSSTYPE_SWSP_IMM ((valueT) imm_expr->X_add_number))
 		    break;
@@ -2211,9 +2403,9 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		    ENCODE_CSSTYPE_SWSP_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case 'N':
-		  if (riscv_handle_implicit_zero_offset (imm_expr, s))
+		  if (riscv_handle_implicit_zero_offset (imm_expr, asarg))
 		    continue;
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || !VALID_CSSTYPE_SDSP_IMM ((valueT) imm_expr->X_add_number))
 		    break;
@@ -2222,7 +2414,7 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		  goto rvc_imm_done;
 		case 'u':
 		  p = percent_op_utype;
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p))
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p))
 		    break;
 		rvc_lui:
 		  if (imm_expr->X_op != O_constant
@@ -2235,7 +2427,7 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		  ip->insn_opcode |= ENCODE_CITYPE_IMM (imm_expr->X_add_number);
 		  goto rvc_imm_done;
 		case 'v':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || (imm_expr->X_add_number & (RISCV_IMM_REACH - 1))
 		      || ((int32_t)imm_expr->X_add_number
 			  != imm_expr->X_add_number))
@@ -2248,27 +2440,27 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		case 'a':
 		  goto jump;
 		case 'S': /* Floating-point RS1 x8-x15.  */
-		  if (!reg_lookup (&s, RCLASS_FPR, &regno)
+		  if (!reg_lookup (&asarg, RCLASS_FPR, &regno)
 		      || !(regno >= 8 && regno <= 15))
 		    break;
 		  INSERT_OPERAND (CRS1S, *ip, regno % 8);
 		  continue;
 		case 'D': /* Floating-point RS2 x8-x15.  */
-		  if (!reg_lookup (&s, RCLASS_FPR, &regno)
+		  if (!reg_lookup (&asarg, RCLASS_FPR, &regno)
 		      || !(regno >= 8 && regno <= 15))
 		    break;
 		  INSERT_OPERAND (CRS2S, *ip, regno % 8);
 		  continue;
 		case 'T': /* Floating-point RS2.  */
-		  if (!reg_lookup (&s, RCLASS_FPR, &regno))
+		  if (!reg_lookup (&asarg, RCLASS_FPR, &regno))
 		    break;
 		  INSERT_OPERAND (CRS2, *ip, regno);
 		  continue;
 		case 'F':
-		  switch (*++args)
+		  switch (*++oparg)
 		    {
 		      case '6':
-		        if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		        if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 			    || imm_expr->X_op != O_constant
 			    || imm_expr->X_add_number < 0
 			    || imm_expr->X_add_number >= 64)
@@ -2279,11 +2471,11 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 			  }
 			INSERT_OPERAND (CFUNCT6, *ip, imm_expr->X_add_number);
 			imm_expr->X_op = O_absent;
-			s = expr_end;
+			asarg = expr_end;
 			continue;
 
 		      case '4':
-		        if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		        if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 			    || imm_expr->X_op != O_constant
 			    || imm_expr->X_add_number < 0
 			    || imm_expr->X_add_number >= 16)
@@ -2294,11 +2486,11 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 			  }
 			INSERT_OPERAND (CFUNCT4, *ip, imm_expr->X_add_number);
 			imm_expr->X_op = O_absent;
-			s = expr_end;
+			asarg = expr_end;
 			continue;
 
 		      case '3':
-			if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+			if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 			    || imm_expr->X_op != O_constant
 			    || imm_expr->X_add_number < 0
 			    || imm_expr->X_add_number >= 8)
@@ -2309,11 +2501,11 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 			  }
 			INSERT_OPERAND (CFUNCT3, *ip, imm_expr->X_add_number);
 			imm_expr->X_op = O_absent;
-			s = expr_end;
+			asarg = expr_end;
 			continue;
 
 		      case '2':
-			if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+			if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 			    || imm_expr->X_op != O_constant
 			    || imm_expr->X_add_number < 0
 			    || imm_expr->X_add_number >= 4)
@@ -2324,89 +2516,88 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 			  }
 			INSERT_OPERAND (CFUNCT2, *ip, imm_expr->X_add_number);
 			imm_expr->X_op = O_absent;
-			s = expr_end;
+			asarg = expr_end;
 			continue;
 
 		      default:
-			as_bad (_("internal: unknown compressed funct "
-				  "field specifier `CF%c'"), *args);
+			goto parse_extended_operand;
 		    }
 		  break;
 
 		default:
-		  as_bad (_("internal: unknown compressed field "
-			    "specifier `C%c'"), *args);
+		  goto parse_extended_operand;
 		}
 	      break;
 
 	    case ',':
 	      ++argnum;
-	      if (*s++ == *args)
+	      if (*asarg++ == *oparg)
 		continue;
-	      s--;
+	      asarg--;
 	      break;
 
 	    case '(':
 	    case ')':
 	    case '[':
 	    case ']':
-	      if (*s++ == *args)
+	      if (*asarg++ == *oparg)
 		continue;
 	      break;
 
 	    case '<': /* Shift amount, 0 - 31.  */
-	      my_getExpression (imm_expr, s);
+	      my_getExpression (imm_expr, asarg);
 	      check_absolute_expr (ip, imm_expr, FALSE);
 	      if ((unsigned long) imm_expr->X_add_number > 31)
 		as_bad (_("improper shift amount (%lu)"),
 			(unsigned long) imm_expr->X_add_number);
 	      INSERT_OPERAND (SHAMTW, *ip, imm_expr->X_add_number);
 	      imm_expr->X_op = O_absent;
-	      s = expr_end;
+	      asarg = expr_end;
 	      continue;
 
 	    case '>': /* Shift amount, 0 - (XLEN-1).  */
-	      my_getExpression (imm_expr, s);
+	      my_getExpression (imm_expr, asarg);
 	      check_absolute_expr (ip, imm_expr, FALSE);
 	      if ((unsigned long) imm_expr->X_add_number >= xlen)
 		as_bad (_("improper shift amount (%lu)"),
 			(unsigned long) imm_expr->X_add_number);
 	      INSERT_OPERAND (SHAMT, *ip, imm_expr->X_add_number);
 	      imm_expr->X_op = O_absent;
-	      s = expr_end;
+	      asarg = expr_end;
 	      continue;
 
 	    case 'Z': /* CSRRxI immediate.  */
-	      my_getExpression (imm_expr, s);
+	      my_getExpression (imm_expr, asarg);
 	      check_absolute_expr (ip, imm_expr, FALSE);
 	      if ((unsigned long) imm_expr->X_add_number > 31)
 		as_bad (_("improper CSRxI immediate (%lu)"),
 			(unsigned long) imm_expr->X_add_number);
 	      INSERT_OPERAND (RS1, *ip, imm_expr->X_add_number);
 	      imm_expr->X_op = O_absent;
-	      s = expr_end;
+	      asarg = expr_end;
 	      continue;
 
 	    case 'E': /* Control register.  */
 	      insn_with_csr = TRUE;
 	      explicit_priv_attr = TRUE;
-	      if (reg_lookup (&s, RCLASS_CSR, &regno))
+	      if (reg_lookup (&asarg, RCLASS_CSR, &regno))
 		INSERT_OPERAND (CSR, *ip, regno);
 	      else
 		{
-		  my_getExpression (imm_expr, s);
+		  my_getExpression (imm_expr, asarg);
 		  check_absolute_expr (ip, imm_expr, TRUE);
 		  if ((unsigned long) imm_expr->X_add_number > 0xfff)
 		    as_bad (_("improper CSR address (%lu)"),
 			    (unsigned long) imm_expr->X_add_number);
 		  INSERT_OPERAND (CSR, *ip, imm_expr->X_add_number);
 		  imm_expr->X_op = O_absent;
-		  s = expr_end;
+		  asarg = expr_end;
 		}
 	      continue;
 
 	    case 'm': /* Rounding mode.  */
-	      if (arg_lookup (&s, riscv_rm, ARRAY_SIZE (riscv_rm), &regno))
+	      if (arg_lookup (&asarg, riscv_rm,
+			      ARRAY_SIZE (riscv_rm), &regno))
 		{
 		  INSERT_OPERAND (RM, *ip, regno);
 		  continue;
@@ -2415,10 +2606,10 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 
 	    case 'P':
 	    case 'Q': /* Fence predecessor/successor.  */
-	      if (arg_lookup (&s, riscv_pred_succ, ARRAY_SIZE (riscv_pred_succ),
-			      &regno))
+	      if (arg_lookup (&asarg, riscv_pred_succ,
+			      ARRAY_SIZE (riscv_pred_succ), &regno))
 		{
-		  if (*args == 'P')
+		  if (*oparg == 'P')
 		    INSERT_OPERAND (PRED, *ip, regno);
 		  else
 		    INSERT_OPERAND (SUCC, *ip, regno);
@@ -2430,11 +2621,11 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 	    case 's': /* Source register.  */
 	    case 't': /* Target register.  */
 	    case 'r': /* RS3 */
-	      if (reg_lookup (&s, RCLASS_GPR, &regno))
+	      if (reg_lookup (&asarg, RCLASS_GPR, &regno))
 		{
-		  c = *args;
-		  if (*s == ' ')
-		    ++s;
+		  char c = *oparg;
+		  if (*asarg == ' ')
+		    ++asarg;
 
 		  /* Now that we have assembled one operand, we use the args
 		     string to figure out where it goes in the instruction.  */
@@ -2462,11 +2653,12 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 	    case 'T': /* Floating point RS2.  */
 	    case 'U': /* Floating point RS1 and RS2.  */
 	    case 'R': /* Floating point RS3.  */
-	      if (reg_lookup (&s, RCLASS_FPR, &regno))
+	      if (reg_lookup (&asarg, RCLASS_FPR, &regno))
 		{
-		  c = *args;
-		  if (*s == ' ')
-		    ++s;
+		  char c = *oparg;
+		  if (*asarg == ' ')
+		    ++asarg;
+
 		  switch (c)
 		    {
 		    case 'D':
@@ -2490,33 +2682,33 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 	      break;
 
 	    case 'I':
-	      my_getExpression (imm_expr, s);
+	      my_getExpression (imm_expr, asarg);
 	      if (imm_expr->X_op != O_big
 		  && imm_expr->X_op != O_constant)
 		break;
 	      normalize_constant_expr (imm_expr);
-	      s = expr_end;
+	      asarg = expr_end;
 	      continue;
 
 	    case 'A':
-	      my_getExpression (imm_expr, s);
+	      my_getExpression (imm_expr, asarg);
 	      normalize_constant_expr (imm_expr);
 	      /* The 'A' format specifier must be a symbol.  */
 	      if (imm_expr->X_op != O_symbol)
 	        break;
 	      *imm_reloc = BFD_RELOC_32;
-	      s = expr_end;
+	      asarg = expr_end;
 	      continue;
 
 	    case 'B':
-	      my_getExpression (imm_expr, s);
+	      my_getExpression (imm_expr, asarg);
 	      normalize_constant_expr (imm_expr);
 	      /* The 'B' format specifier must be a symbol or a constant.  */
 	      if (imm_expr->X_op != O_symbol && imm_expr->X_op != O_constant)
 	        break;
 	      if (imm_expr->X_op == O_symbol)
 	        *imm_reloc = BFD_RELOC_32;
-	      s = expr_end;
+	      asarg = expr_end;
 	      continue;
 
 	    case 'j': /* Sign-extended immediate.  */
@@ -2540,35 +2732,35 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 	    case '0': /* AMO displacement, which must be zero.  */
 	      p = percent_op_null;
 	    load_store:
-	      if (riscv_handle_implicit_zero_offset (imm_expr, s))
+	      if (riscv_handle_implicit_zero_offset (imm_expr, asarg))
 		continue;
 	    alu_op:
 	      /* If this value won't fit into a 16 bit offset, then go
 		 find a macro that will generate the 32 bit offset
 		 code pattern.  */
-	      if (!my_getSmallExpression (imm_expr, imm_reloc, s, p))
+	      if (!my_getSmallExpression (imm_expr, imm_reloc, asarg, p))
 		{
 		  normalize_constant_expr (imm_expr);
 		  if (imm_expr->X_op != O_constant
-		      || (*args == '0' && imm_expr->X_add_number != 0)
-		      || (*args == '1')
+		      || (*oparg == '0' && imm_expr->X_add_number != 0)
+		      || (*oparg == '1')
 		      || imm_expr->X_add_number >= (signed)RISCV_IMM_REACH/2
 		      || imm_expr->X_add_number < -(signed)RISCV_IMM_REACH/2)
 		    break;
 		}
-	      s = expr_end;
+	      asarg = expr_end;
 	      continue;
 
 	    case 'p': /* PC-relative offset.  */
 	    branch:
 	      *imm_reloc = BFD_RELOC_12_PCREL;
-	      my_getExpression (imm_expr, s);
-	      s = expr_end;
+	      my_getExpression (imm_expr, asarg);
+	      asarg = expr_end;
 	      continue;
 
 	    case 'u': /* Upper 20 bits.  */
 	      p = percent_op_utype;
-	      if (!my_getSmallExpression (imm_expr, imm_reloc, s, p))
+	      if (!my_getSmallExpression (imm_expr, imm_reloc, asarg, p))
 		{
 		  if (imm_expr->X_op != O_constant)
 		    break;
@@ -2580,33 +2772,33 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		  *imm_reloc = BFD_RELOC_RISCV_HI20;
 		  imm_expr->X_add_number <<= RISCV_IMM_BITS;
 		}
-	      s = expr_end;
+	      asarg = expr_end;
 	      continue;
 
 	    case 'a': /* 20-bit PC-relative offset.  */
 	    jump:
-	      my_getExpression (imm_expr, s);
-	      s = expr_end;
+	      my_getExpression (imm_expr, asarg);
+	      asarg = expr_end;
 	      *imm_reloc = BFD_RELOC_RISCV_JMP;
 	      continue;
 
 	    case 'c':
-	      my_getExpression (imm_expr, s);
-	      s = expr_end;
-	      if (strcmp (s, "@plt") == 0)
+	      my_getExpression (imm_expr, asarg);
+	      asarg = expr_end;
+	      if (strcmp (asarg, "@plt") == 0)
 		{
 		  *imm_reloc = BFD_RELOC_RISCV_CALL_PLT;
-		  s += 4;
+		  asarg += 4;
 		}
 	      else
 		*imm_reloc = BFD_RELOC_RISCV_CALL;
 	      continue;
 
 	    case 'O':
-	      switch (*++args)
+	      switch (*++oparg)
 		{
 		case '4':
-		  if (my_getOpcodeExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getOpcodeExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || imm_expr->X_add_number < 0
 		      || imm_expr->X_add_number >= 128
@@ -2619,11 +2811,11 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		    }
 		  INSERT_OPERAND (OP, *ip, imm_expr->X_add_number);
 		  imm_expr->X_op = O_absent;
-		  s = expr_end;
+		  asarg = expr_end;
 		  continue;
 
 		case '2':
-		  if (my_getOpcodeExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getOpcodeExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || imm_expr->X_add_number < 0
 		      || imm_expr->X_add_number >= 3)
@@ -2634,20 +2826,19 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		    }
 		  INSERT_OPERAND (OP2, *ip, imm_expr->X_add_number);
 		  imm_expr->X_op = O_absent;
-		  s = expr_end;
+		  asarg = expr_end;
 		  continue;
 
 		default:
-		  as_bad (_("internal: unknown opcode field "
-			    "specifier `O%c'"), *args);
+		  goto parse_extended_operand;
 		}
 	      break;
 
 	    case 'F':
-	      switch (*++args)
+	      switch (*++oparg)
 		{
 		case '7':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || imm_expr->X_add_number < 0
 		      || imm_expr->X_add_number >= 128)
@@ -2658,11 +2849,11 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		    }
 		  INSERT_OPERAND (FUNCT7, *ip, imm_expr->X_add_number);
 		  imm_expr->X_op = O_absent;
-		  s = expr_end;
+		  asarg = expr_end;
 		  continue;
 
 		case '3':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || imm_expr->X_add_number < 0
 		      || imm_expr->X_add_number >= 8)
@@ -2673,11 +2864,11 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		    }
 		  INSERT_OPERAND (FUNCT3, *ip, imm_expr->X_add_number);
 		  imm_expr->X_op = O_absent;
-		  s = expr_end;
+		  asarg = expr_end;
 		  continue;
 
 		case '2':
-		  if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+		  if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		      || imm_expr->X_op != O_constant
 		      || imm_expr->X_add_number < 0
 		      || imm_expr->X_add_number >= 4)
@@ -2688,30 +2879,33 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
 		    }
 		  INSERT_OPERAND (FUNCT2, *ip, imm_expr->X_add_number);
 		  imm_expr->X_op = O_absent;
-		  s = expr_end;
+		  asarg = expr_end;
 		  continue;
 
 		default:
-		  as_bad (_("internal: unknown funct field "
-			    "specifier `F%c'\n"), *args);
+		  goto parse_extended_operand;
 		}
 	      break;
 
 	    case 'z':
-	      if (my_getSmallExpression (imm_expr, imm_reloc, s, p)
+	      if (my_getSmallExpression (imm_expr, imm_reloc, asarg, p)
 		  || imm_expr->X_op != O_constant
 		  || imm_expr->X_add_number != 0)
 		break;
-	      s = expr_end;
+	      asarg = expr_end;
 	      imm_expr->X_op = O_absent;
 	      continue;
 
 	    default:
-	      as_fatal (_("internal: unknown argument type `%c'"), *args);
+	    parse_extended_operand:
+	      oparg = opargStart;
+	      if (riscv_parse_extended_operands (ip, imm_expr, imm_reloc,
+						 &oparg, &asarg))
+		continue;
 	    }
 	  break;
 	}
-      s = argsStart;
+      asarg = asargStart;
       error = _("illegal operands");
       insn_with_csr = FALSE;
     }
@@ -2719,7 +2913,7 @@ riscv_ip (char *str, struct riscv_cl_insn *ip, expressionS *imm_expr,
  out:
   /* Restore the character we might have clobbered above.  */
   if (save_c)
-    *(argsStart - 1) = save_c;
+    *(asargStart - 1) = save_c;
 
   return error;
 }
@@ -2742,7 +2936,7 @@ md_assemble (char *str)
        return;
     }
 
-  const char *error = riscv_ip (str, &insn, &imm_expr, &imm_reloc, op_hash);
+  const char *error = riscv_ip (str, &insn, &imm_expr, &imm_reloc, FALSE);
 
   if (error)
     {
@@ -2918,7 +3112,10 @@ riscv_after_parse_args (void)
     default_arch_with_ext = xlen == 64 ? "rv64g" : "rv32g";
 
   /* Initialize the hash table for extensions with default version.  */
-  ext_version_hash = init_ext_version_hash ();
+  ext_version_hash = init_ext_version_hash (ext_version_table);
+  /* Draft and vendor settings.  */
+  extended_ext_version_hash =
+	init_ext_version_hash (extended_ext_version_table);
 
   /* Set default specs.  */
   if (default_isa_spec == ISA_SPEC_CLASS_NONE)
@@ -3706,8 +3903,7 @@ s_riscv_insn (int x ATTRIBUTE_UNUSED)
   save_c = *input_line_pointer;
   *input_line_pointer = '\0';
 
-  const char *error = riscv_ip (str, &insn, &imm_expr,
-				&imm_reloc, insn_type_hash);
+  const char *error = riscv_ip (str, &insn, &imm_expr, &imm_reloc, TRUE);
 
   if (error)
     {
diff --git a/gas/testsuite/gas/riscv/extended/extended.exp b/gas/testsuite/gas/riscv/extended/extended.exp
new file mode 100644
index 0000000..00812fe
--- /dev/null
+++ b/gas/testsuite/gas/riscv/extended/extended.exp
@@ -0,0 +1,22 @@
+# Expect script for RISC-V assembler tests.
+#   Copyright (C) 2021-2021 Free Software Foundation, Inc.
+#
+# This file is part of the GNU Binutils.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston,
+# MA 02110-1301, USA.
+
+if [istarget riscv*-*-*] {
+}
diff --git a/include/opcode/riscv-opc-extended.h b/include/opcode/riscv-opc-extended.h
new file mode 100644
index 0000000..8641c4c
--- /dev/null
+++ b/include/opcode/riscv-opc-extended.h
@@ -0,0 +1,23 @@
+/* riscv-opcextended.h.  RISC-V extended instruction opcode.
+   Copyright (C) 2021-2021 Free Software Foundation, Inc.
+   Contributed by Andrew Waterman
+
+   This file is part of GDB, GAS, and the GNU binutils.
+
+   GDB, GAS, and the GNU binutils are free software; you can redistribute
+   them and/or modify them under the terms of the GNU General Public
+   License as published by the Free Software Foundation; either version
+   3, or (at your option) any later version.
+
+   GDB, GAS, and the GNU binutils are distributed in the hope that they
+   will be useful, but WITHOUT ANY WARRANTY; without even the implied
+   warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
+   the GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; see the file COPYING3. If not,
+   see <http://www.gnu.org/licenses/>.  */
+
+#ifndef RISCV_EXTENDED_ENCODING_H
+#define RISCV_EXTENDED_ENCODING_H
+#endif /* RISCV_EXTENDED_ENCODING_H */
diff --git a/include/opcode/riscv.h b/include/opcode/riscv.h
index fdf3df4..50ccf6f 100644
--- a/include/opcode/riscv.h
+++ b/include/opcode/riscv.h
@@ -22,6 +22,7 @@
 #define _RISCV_H_
 
 #include "riscv-opc.h"
+#include "riscv-opc-extended.h"
 #include <stdlib.h>
 #include <stdint.h>
 
@@ -303,7 +304,6 @@ static const char * const riscv_pred_succ[16] =
 enum riscv_insn_class
 {
   INSN_CLASS_NONE,
-
   INSN_CLASS_I,
   INSN_CLASS_C,
   INSN_CLASS_A,
@@ -319,6 +319,7 @@ enum riscv_insn_class
   INSN_CLASS_ZBA,
   INSN_CLASS_ZBB,
   INSN_CLASS_ZBC,
+  INSN_CLASS_EXTENDED
 };
 
 /* This structure holds information for a particular instruction.  */
@@ -332,7 +333,7 @@ struct riscv_opcode
 
   /* Class to which this instruction belongs.  Used to decide whether or
      not this instruction is legal in the current -march context.  */
-  enum riscv_insn_class insn_class;
+  int insn_class;
 
   /* A string describing the arguments for this instruction.  */
   const char *args;
@@ -422,10 +423,9 @@ enum
   M_ZEXTW,
   M_SEXTB,
   M_SEXTH,
-  M_NUM_MACROS
+  M_EXTENDED
 };
 
-
 extern const char * const riscv_gpr_names_numeric[NGPR];
 extern const char * const riscv_gpr_names_abi[NGPR];
 extern const char * const riscv_fpr_names_numeric[NFPR];
@@ -434,4 +434,8 @@ extern const char * const riscv_fpr_names_abi[NFPR];
 extern const struct riscv_opcode riscv_opcodes[];
 extern const struct riscv_opcode riscv_insn_types[];
 
+/* Draft and vendor extensions.  */
+
+extern const struct riscv_opcode *riscv_extended_opcodes[];
+
 #endif /* _RISCV_H_ */
diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
index cc80d90..6a918e3 100644
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -164,25 +164,46 @@ maybe_print_address (struct riscv_private_data *pd, int base_reg, int offset)
     pd->print_addr = offset;
 }
 
+/* Print insn arguments for draft and vendor extensions.  */
+
+static bfd_boolean
+print_extended_insn_args (const char **opcode_args,
+			  insn_t l ATTRIBUTE_UNUSED,
+			  disassemble_info *info ATTRIBUTE_UNUSED)
+{
+  const char *oparg = *opcode_args;
+
+  switch (*oparg)
+    {
+    default:
+      return FALSE;
+    }
+
+  *opcode_args = oparg;
+  return TRUE;
+}
+
 /* Print insn arguments for 32/64-bit code.  */
 
 static void
-print_insn_args (const char *d, insn_t l, bfd_vma pc, disassemble_info *info)
+print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info)
 {
   struct riscv_private_data *pd = info->private_data;
   int rs1 = (l >> OP_SH_RS1) & OP_MASK_RS1;
   int rd = (l >> OP_SH_RD) & OP_MASK_RD;
   fprintf_ftype print = info->fprintf_func;
+  const char *opargStart;
 
-  if (*d != '\0')
+  if (*oparg != '\0')
     print (info->stream, "\t");
 
-  for (; *d != '\0'; d++)
+  for (; *oparg != '\0'; oparg++)
     {
-      switch (*d)
+      opargStart = oparg;
+      switch (*oparg)
 	{
 	case 'C': /* RVC */
-	  switch (*++d)
+	  switch (*++oparg)
 	    {
 	    case 's': /* RS1 x8-x15.  */
 	    case 'w': /* RS1 x8-x15.  */
@@ -266,12 +287,12 @@ print_insn_args (const char *d, insn_t l, bfd_vma pc, disassemble_info *info)
 	case ')':
 	case '[':
 	case ']':
-	  print (info->stream, "%c", *d);
+	  print (info->stream, "%c", *oparg);
 	  break;
 
 	case '0':
 	  /* Only print constant 0 if it is the last argument.  */
-	  if (!d[1])
+	  if (!oparg[1])
 	    print (info->stream, "0");
 	  break;
 
@@ -397,6 +418,7 @@ print_insn_args (const char *d, insn_t l, bfd_vma pc, disassemble_info *info)
 #define DECLARE_CSR_ALIAS(name, num, class, define_version, abort_version) \
 		DECLARE_CSR (name, num, class, define_version, abort_version)
 #include "opcode/riscv-opc.h"
+#include "opcode/riscv-opc-extended.h"
 #undef DECLARE_CSR
 	      }
 
@@ -412,31 +434,34 @@ print_insn_args (const char *d, insn_t l, bfd_vma pc, disassemble_info *info)
 	  break;
 
 	default:
-	  /* xgettext:c-format */
-	  print (info->stream, _("# internal error, undefined modifier (%c)"),
-		 *d);
-	  return;
+	  oparg = opargStart;
+	  if (!print_extended_insn_args (&oparg, l, info))
+	    {
+	      /* xgettext:c-format */
+	      print (info->stream,
+		     _("# internal error, undefined modifier (%s)"), opargStart);
+	      return;
+	    }
 	}
     }
 }
 
-/* Print the RISC-V instruction at address MEMADDR in debugged memory,
-   on using INFO.  Returns length of the instruction, in bytes.
-   BIGENDIAN must be 1 if this is big-endian code, 0 if
-   this is little-endian code.  */
+/* Find the right opcode name whein disassembling.  */
 
-static int
-riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
+static const struct riscv_opcode *
+riscv_disassemble_opcode (insn_t word,
+			  disassemble_info *info)
 {
+  static const struct riscv_opcode *riscv_hash[OP_MASK_OP + 1];
   const struct riscv_opcode *op;
   static bfd_boolean init = 0;
-  static const struct riscv_opcode *riscv_hash[OP_MASK_OP + 1];
-  struct riscv_private_data *pd;
-  int insnlen;
+  unsigned xlen = 0;
+  unsigned int i;
 
 #define OP_HASH_IDX(i) ((i) & (riscv_insn_length (i) == 2 ? 0x3 : OP_MASK_OP))
 
-  /* Build a hash table to shorten the search time.  */
+  /* Build a hash table to shorten the search time.  For now we just build
+     the hash for the standard instructions.  */
   if (! init)
     {
       for (op = riscv_opcodes; op->name; op++)
@@ -446,6 +471,61 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
       init = 1;
     }
 
+  /* Search the standard instructions first.  */
+  op = riscv_hash[OP_HASH_IDX (word)];
+  i = 0;
+  do
+    {
+      /* The hash table entry might be NULL.  */
+      if (op != NULL)
+	{
+	  /* If XLEN is not known, get its value from the ELF class.  */
+	  if (info->mach == bfd_mach_riscv64)
+	    xlen = 64;
+	  else if (info->mach == bfd_mach_riscv32)
+	    xlen = 32;
+	  else if (info->section != NULL)
+	    {
+	      Elf_Internal_Ehdr *ehdr = elf_elfheader (info->section->owner);
+	      xlen = ehdr->e_ident[EI_CLASS] == ELFCLASS64 ? 64 : 32;
+	    }
+
+	  for (; op->name; op++)
+	    {
+	      /* Does the opcode match?  */
+	      if (! (op->match_func) (op, word))
+		continue;
+	      /* Is this a pseudo-instruction and may we print it as such?  */
+	      if (no_aliases && (op->pinfo & INSN_ALIAS))
+		continue;
+	      /* Is this instruction restricted to a certain value of XLEN?  */
+	      if ((op->xlen_requirement != 0) && (op->xlen_requirement != xlen))
+		continue;
+
+	      /* It's a match.  */
+	      return op;
+	    }
+	}
+      /* Keep searching draft and vendor opcode tables.  */
+      op = riscv_extended_opcodes[i++];
+    }
+  while (op != NULL);
+
+  return NULL;
+}
+
+/* Print the RISC-V instruction at address MEMADDR in debugged memory,
+   on using INFO.  Returns length of the instruction, in bytes.
+   BIGENDIAN must be 1 if this is big-endian code, 0 if
+   this is little-endian code.  */
+
+static int
+riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
+{
+  const struct riscv_opcode *op;
+  struct riscv_private_data *pd;
+  int insnlen;
+
   if (info->private_data == NULL)
     {
       int i;
@@ -479,75 +559,48 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
   info->target = 0;
   info->target2 = 0;
 
-  op = riscv_hash[OP_HASH_IDX (word)];
+  op = riscv_disassemble_opcode (word, info);
   if (op != NULL)
     {
-      unsigned xlen = 0;
-
-      /* If XLEN is not known, get its value from the ELF class.  */
-      if (info->mach == bfd_mach_riscv64)
-	xlen = 64;
-      else if (info->mach == bfd_mach_riscv32)
-	xlen = 32;
-      else if (info->section != NULL)
+      (*info->fprintf_func) (info->stream, "%s", op->name);
+      print_insn_args (op->args, word, memaddr, info);
+
+      /* Try to disassemble multi-instruction addressing sequences.  */
+      if (pd->print_addr != (bfd_vma)-1)
 	{
-	  Elf_Internal_Ehdr *ehdr = elf_elfheader (info->section->owner);
-	  xlen = ehdr->e_ident[EI_CLASS] == ELFCLASS64 ? 64 : 32;
+	  info->target = pd->print_addr;
+	  (*info->fprintf_func) (info->stream, " # ");
+	  (*info->print_address_func) (info->target, info);
+	  pd->print_addr = -1;
 	}
 
-      for (; op->name; op++)
+      /* Finish filling out insn_info fields.  */
+      switch (op->pinfo & INSN_TYPE)
 	{
-	  /* Does the opcode match?  */
-	  if (! (op->match_func) (op, word))
-	    continue;
-	  /* Is this a pseudo-instruction and may we print it as such?  */
-	  if (no_aliases && (op->pinfo & INSN_ALIAS))
-	    continue;
-	  /* Is this instruction restricted to a certain value of XLEN?  */
-	  if ((op->xlen_requirement != 0) && (op->xlen_requirement != xlen))
-	    continue;
-
-	  /* It's a match.  */
-	  (*info->fprintf_func) (info->stream, "%s", op->name);
-	  print_insn_args (op->args, word, memaddr, info);
-
-	  /* Try to disassemble multi-instruction addressing sequences.  */
-	  if (pd->print_addr != (bfd_vma)-1)
-	    {
-	      info->target = pd->print_addr;
-	      (*info->fprintf_func) (info->stream, " # ");
-	      (*info->print_address_func) (info->target, info);
-	      pd->print_addr = -1;
-	    }
-
-	  /* Finish filling out insn_info fields.  */
-	  switch (op->pinfo & INSN_TYPE)
-	    {
-	    case INSN_BRANCH:
-	      info->insn_type = dis_branch;
-	      break;
-	    case INSN_CONDBRANCH:
-	      info->insn_type = dis_condbranch;
-	      break;
-	    case INSN_JSR:
-	      info->insn_type = dis_jsr;
-	      break;
-	    case INSN_DREF:
-	      info->insn_type = dis_dref;
-	      break;
-	    default:
-	      break;
-	    }
-
-	  if (op->pinfo & INSN_DATA_SIZE)
-	    {
-	      int size = ((op->pinfo & INSN_DATA_SIZE)
-			  >> INSN_DATA_SIZE_SHIFT);
-	      info->data_size = 1 << (size - 1);
-	    }
+	case INSN_BRANCH:
+	  info->insn_type = dis_branch;
+	  break;
+	case INSN_CONDBRANCH:
+	  info->insn_type = dis_condbranch;
+	  break;
+	case INSN_JSR:
+	  info->insn_type = dis_jsr;
+	  break;
+	case INSN_DREF:
+	  info->insn_type = dis_dref;
+	  break;
+	default:
+	  break;
+	}
 
-	  return insnlen;
+      if (op->pinfo & INSN_DATA_SIZE)
+	{
+	  int size = ((op->pinfo & INSN_DATA_SIZE)
+		      >> INSN_DATA_SIZE_SHIFT);
+	  info->data_size = 1 << (size - 1);
 	}
+
+      return insnlen;
     }
 
   /* We did not find a match, so just print the instruction bits.  */
diff --git a/opcodes/riscv-opc.c b/opcodes/riscv-opc.c
index 1348ec7..6a077d0 100644
--- a/opcodes/riscv-opc.c
+++ b/opcodes/riscv-opc.c
@@ -941,3 +941,11 @@ const struct riscv_opcode riscv_insn_types[] =
 /* Terminate the list.  */
 {0, 0, INSN_CLASS_NONE, 0, 0, 0, 0, 0}
 };
+
+/* Draft and vendor extensions.  */
+
+/* The supported draft and vendor extensions.  */
+const struct riscv_opcode *riscv_extended_opcodes[] =
+{
+  NULL
+};
-- 
2.7.4



More information about the Binutils mailing list