This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] gas: Add --with-optimization=OPTION


On x86, some instructions have alternate shorter encodings:

1. When the upper 32 bits of destination registers of

andq $imm31, %r64
testq $imm31, %r64
xorq %r64, %r64
subq %r64, %r64

known to be zero, we can encode them without the REX_W bit:

andl $imm31, %r32
testl $imm31, %r32
xorl %r32, %r32
subl %r32, %r32

This optimization is enabled with -O, -O2 and -Os.
2. Since 0xb0 mov with 32-bit destination registers zero-extends 32-bit
immediate to 64-bit destination register, we can use it to encode 64-bit
mov with 32-bit immediates.  This optimization is enabled with -O, -O2
and -Os.
3. Since the upper bits of destination registers of VEX128 and EVEX128
instructions are extended to zero, if all bits of destination registers
of AVX256 or AVX512 instructions are zero, we can use VEX128 or EVEX128
encoding to encode AVX256 or AVX512 instructions.  When 2 source
registers are identical, AVX256 and AVX512 andn and xor instructions:

VOP %reg, %reg, %dest_reg

can be encoded with

VOP128 %reg, %reg, %dest_reg

This optimization is enabled with -O2 and -Os.
4. 16-bit, 32-bit and 64-bit register tests with immediate may be
encoded as 8-bit register test with immediate.  This optimization is
enabled with -Os.

This patch does:

1. Add --with-optimization=OPTION configure option for x86 gas.  This
allows the default optimization type to be adjusted at configure time.
2. Add {nooptimize} pseudo prefix to disable instruction size
optimization.
3. Add optimize to i386_opcode_modifier to tell assembler that encoding
of an instruction may be optimized.

Any comments?

H.J.
---
gas/

	* NEWS: Mention --with-optimization=OPTION.
	* configure.ac: Add --with-optimization=OPTION.
	(DEFAULT_OPTIMIZATION): New AC_DEFINE_UNQUOTED for
	--with-optimization=OPTION.
	* config.in: Regenerated.
	* configure: Likewise.
	* config/tc-i386.c (_i386_insn): Add no_optimize.
	(optimize): New.
	(optimize_for_space): Likewise.
	(fits_in_imm7): New function.
	(fits_in_imm31): Likewise.
	(optimize_encoding): Likewise.
	(md_begin): Set optimize and optimize_for_space if
	optimize_for_space < 0.
	(md_assemble): Call optimize_encoding to optimize encoding.
	(parse_insn): Handle {nooptimize}.
	(md_shortopts): Append "O::".
	(md_parse_option): Handle -On.
	* doc/c-i386.texi: Document -O0, -O, -O1, -O2 and -Os as well
	as {nooptimize}.
	* testsuite/gas/cfi/cfi-x86_64.d: Pass -O0 to assembler.
	* testsuite/gas/i386/ilp32/cfi/cfi-x86_64.d: Likewise.
	* testsuite/gas/i386/i386.exp: Run optimize-1, optimize-2,
	optimize-3, x86-64-optimize-1, x86-64-optimize-2,
	x86-64-optimize-3 and x86-64-optimize-4.
	* testsuite/gas/i386/optimize-1.d: New file.
	* testsuite/gas/i386/optimize-1.s: Likewise.
	* testsuite/gas/i386/optimize-2.d: Likewise.
	* testsuite/gas/i386/optimize-2.s: Likewise.
	* testsuite/gas/i386/optimize-3.d: Likewise.
	* testsuite/gas/i386/optimize-3.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-1.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-1.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-2.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-2.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-3.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-3.s: Likewise.
	* testsuite/gas/i386/x86-64-optimize-4.d: Likewise.
	* testsuite/gas/i386/x86-64-optimize-4.s: Likewise.

opcodes/

	* i386-gen.c (opcode_modifiers): Add Optimize.
	* i386-opc.h (Optimize): New enum.
	(i386_opcode_modifier): Add optimize.
	* i386-opc.tbl: Add "Optimize" to "mov $imm, reg",
	"sub reg, reg/mem", "test $imm, acc", "test $imm, reg/mem",
	"and $imm, acc", "and $imm, reg/mem", "xor reg, reg/mem",
	"movq $imm, reg" and AVX256 and AVX512 versions of vandnps,
	vandnpd, vpandn, vpandnd, vpandnq, vxorps, vxorpd, vpxor,
	vpxord and vpxorq.
	* i386-tbl.h: Regenerated.
---
 gas/NEWS                                      |     3 +
 gas/config.in                                 |     3 +
 gas/config/tc-i386.c                          |   270 +-
 gas/configure                                 |    28 +-
 gas/configure.ac                              |    20 +
 gas/doc/c-i386.texi                           |    28 +
 gas/testsuite/gas/cfi/cfi-x86_64.d            |     1 +
 gas/testsuite/gas/i386/i386.exp               |     7 +
 gas/testsuite/gas/i386/ilp32/cfi/cfi-x86_64.d |     1 +
 gas/testsuite/gas/i386/optimize-1.d           |    45 +
 gas/testsuite/gas/i386/optimize-1.s           |    48 +
 gas/testsuite/gas/i386/optimize-2.d           |    19 +
 gas/testsuite/gas/i386/optimize-2.s           |    13 +
 gas/testsuite/gas/i386/optimize-3.d           |    12 +
 gas/testsuite/gas/i386/optimize-3.s           |     6 +
 gas/testsuite/gas/i386/x86-64-optimize-1.d    |    53 +
 gas/testsuite/gas/i386/x86-64-optimize-1.s    |    47 +
 gas/testsuite/gas/i386/x86-64-optimize-2.d    |    77 +
 gas/testsuite/gas/i386/x86-64-optimize-2.s    |    80 +
 gas/testsuite/gas/i386/x86-64-optimize-3.d    |    27 +
 gas/testsuite/gas/i386/x86-64-optimize-3.s    |    21 +
 gas/testsuite/gas/i386/x86-64-optimize-4.d    |    12 +
 gas/testsuite/gas/i386/x86-64-optimize-4.s    |     6 +
 opcodes/i386-gen.c                            |     1 +
 opcodes/i386-opc.h                            |     4 +
 opcodes/i386-opc.tbl                          |    65 +-
 opcodes/i386-tbl.h                            | 10658 ++++++++++++------------
 27 files changed, 6197 insertions(+), 5358 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/optimize-1.d
 create mode 100644 gas/testsuite/gas/i386/optimize-1.s
 create mode 100644 gas/testsuite/gas/i386/optimize-2.d
 create mode 100644 gas/testsuite/gas/i386/optimize-2.s
 create mode 100644 gas/testsuite/gas/i386/optimize-3.d
 create mode 100644 gas/testsuite/gas/i386/optimize-3.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-1.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-1.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-2.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-2.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-3.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-3.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-4.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-4.s

diff --git a/gas/NEWS b/gas/NEWS
index 27ee306f2f..b7bde221c7 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,8 @@
 -*- text -*-
 
+* Add --with-optimization=OPTION configure option for x86 gas.  This
+  allows the default optimization type to be adjusted at configure time.
+
 * Add support for .nop directive.  It is currently supported only for
   x86 targets.
 
diff --git a/gas/config.in b/gas/config.in
index 0855179696..d00895805d 100644
--- a/gas/config.in
+++ b/gas/config.in
@@ -46,6 +46,9 @@
 /* Define to 1 if you want to generate x86 relax relocations by default. */
 #undef DEFAULT_GENERATE_X86_RELAX_RELOCATIONS
 
+/* Define default optimization. */
+#undef DEFAULT_OPTIMIZATION
+
 /* Supported emulations. */
 #undef EMULATIONS
 
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index dff42bdb91..8ea005acaf 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -372,6 +372,9 @@ struct _i386_insn
     /* Prefer the REX byte in encoding.  */
     bfd_boolean rex_encoding;
 
+    /* Disable instruction size optimization.  */
+    bfd_boolean no_optimize;
+
     /* How to encode vector instructions.  */
     enum
       {
@@ -600,6 +603,22 @@ static enum check_kind
   }
 sse_check, operand_check = check_warning;
 
+/* Optimization:
+   1. Clear the REX_W bit with register operand if possible.
+   2. Above plus use 128bit vector instruction to clear the full vector
+      register.
+ */
+static int optimize = 0;
+
+/* Optimization:
+   1. Clear the REX_W bit with register operand if possible.
+   2. Above plus use 128bit vector instruction to clear the full vector
+      register.
+   3. Above plus optimize "test{q,l,w} $imm8,%r{64,32,16}" to
+      "testb $imm7,%r8".
+ */
+static int optimize_for_space = -1;
+
 /* Register prefix used for error message.  */
 static const char *register_prefix = "%";
 
@@ -2185,6 +2204,18 @@ fits_in_imm4 (offsetT num)
   return (num & 0xf) == num;
 }
 
+static INLINE int
+fits_in_imm7 (offsetT num)
+{
+  return (num & 0x7f) == num;
+}
+
+static INLINE int
+fits_in_imm31 (offsetT num)
+{
+  return (num & 0x7fffffff) == num;
+}
+
 static i386_operand_type
 smallest_imm_type (offsetT num)
 {
@@ -2878,6 +2909,40 @@ md_begin (void)
       x86_dwarf2_return_column = 8;
       x86_cie_data_alignment = -4;
     }
+
+  if (optimize_for_space < 0)
+    {
+      optimize = 0;
+      optimize_for_space = 0;
+#ifdef DEFAULT_OPTIMIZATION
+      {
+	/* Set default optimization from DEFAULT_OPTIMIZATION.  */
+	const char *default_optimization = DEFAULT_OPTIMIZATION;
+	if (*default_optimization == '-')
+	  default_optimization++;
+	if (*default_optimization == 'O')
+	  {
+	    default_optimization++;
+	    if (*default_optimization == 's')
+	      {
+		optimize_for_space = 1;
+		/* Turn on all encoding optimizations.  */
+		optimize = -1;
+	      }
+	    else
+	      {
+		/* Turn off -Os.  */
+		optimize_for_space = 0;
+
+		if (*default_optimization != '\0')
+		  optimize = atoi (default_optimization);
+		else
+		  optimize = 0;
+	      }
+	  }
+      }
+#endif
+    }
 }
 
 void
@@ -3712,6 +3777,179 @@ check_hle (void)
     }
 }
 
+/* Try the shortest encoding by shortening operand size.  */
+
+static void
+optimize_encoding (void)
+{
+  int j;
+
+  if (optimize_for_space
+      && i.reg_operands == 1
+      && i.imm_operands == 1
+      && !i.types[1].bitfield.byte
+      && i.op[0].imms->X_op == O_constant
+      && fits_in_imm7 (i.op[0].imms->X_add_number)
+      && ((i.tm.base_opcode == 0xa8
+	   && i.tm.extension_opcode == None)
+	  || (i.tm.base_opcode == 0xf6
+	      && i.tm.extension_opcode == 0x0)))
+    {
+      /* Optimize: -Os:
+	   test $imm7, %r64/%r32/%r16  -> test $imm7, %r8
+       */
+      unsigned int base_regnum = i.op[1].regs->reg_num;
+      if (flag_code == CODE_64BIT || base_regnum < 4)
+	{
+	  i.types[1].bitfield.byte = 1;
+	  /* Ignore the suffix.  */
+	  i.suffix = 0;
+	  if (base_regnum >= 4
+	      && !(i.op[1].regs->reg_flags & RegRex))
+	    {
+	      /* Handle SP, BP, SI and DI registers.  */
+	      if (i.types[1].bitfield.word)
+		j = 16;
+	      else if (i.types[1].bitfield.dword)
+		j = 32;
+	      else
+		j = 48;
+	      i.op[1].regs -= j;
+	    }
+	}
+    }
+  else if (flag_code == CODE_64BIT
+	   && ((i.reg_operands == 1
+		&& i.imm_operands == 1
+		&& i.op[0].imms->X_op == O_constant
+		&& ((i.tm.base_opcode == 0xb0
+		     && i.tm.extension_opcode == None
+		     && fits_in_unsigned_long (i.op[0].imms->X_add_number))
+		    || (fits_in_imm31 (i.op[0].imms->X_add_number)
+			&& (((i.tm.base_opcode == 0x24
+			      || i.tm.base_opcode == 0xa8)
+			     && i.tm.extension_opcode == None)
+			    || (i.tm.base_opcode == 0x80
+				&& i.tm.extension_opcode == 0x4)
+			    || ((i.tm.base_opcode == 0xf6
+				 || i.tm.base_opcode == 0xc6)
+				&& i.tm.extension_opcode == 0x0)))))
+	       || (i.reg_operands == 2
+		   && i.op[0].regs == i.op[1].regs
+		   && ((i.tm.base_opcode == 0x30
+			|| i.tm.base_opcode == 0x28)
+		       && i.tm.extension_opcode == None)))
+	   && i.types[1].bitfield.qword)
+    {
+      /* Optimize: -O:
+	   andq $imm31, %r64   -> andl $imm31, %r32
+	   testq $imm31, %r64  -> testl $imm31, %r32
+	   xorq %r64, %r64     -> xorl %r32, %r32
+	   subq %r64, %r64     -> subl %r32, %r32
+	   movq $imm31, %r64   -> movl $imm31, %r32
+	   movq $imm32, %r64   -> movl $imm32, %r32
+        */
+      i.tm.opcode_modifier.norex64 = 1;
+      if (i.tm.base_opcode == 0xb0 || i.tm.base_opcode == 0xc6)
+	{
+	  /* Handle
+	       movq $imm31, %r64   -> movl $imm31, %r32
+	       movq $imm32, %r64   -> movl $imm32, %r32
+	   */
+	  i.tm.operand_types[0].bitfield.imm32 = 1;
+	  i.tm.operand_types[0].bitfield.imm32s = 0;
+	  i.tm.operand_types[0].bitfield.imm64 = 0;
+	  i.types[0].bitfield.imm32 = 1;
+	  i.types[0].bitfield.imm32s = 0;
+	  i.types[0].bitfield.imm64 = 0;
+	  i.types[1].bitfield.dword = 1;
+	  i.types[1].bitfield.qword = 0;
+	  if (i.tm.base_opcode == 0xc6)
+	    {
+	      /* Handle
+		   movq $imm31, %r64   -> movl $imm31, %r32
+	       */
+	      i.tm.base_opcode = 0xb0;
+	      i.tm.extension_opcode = None;
+	      i.tm.opcode_modifier.shortform = 1;
+	      i.tm.opcode_modifier.modrm = 0;
+	    }
+	}
+    }
+  else if (optimize > 1
+	   && i.reg_operands == 3
+	   && i.op[0].regs == i.op[1].regs
+	   && !i.types[2].bitfield.xmmword
+	   && (i.tm.opcode_modifier.vex
+	       || (!i.mask
+		   && !i.rounding
+		   && i.tm.opcode_modifier.evex
+		   && cpu_arch_flags.bitfield.cpuavx512vl))
+	   && ((i.tm.base_opcode == 0x55
+		|| i.tm.base_opcode == 0x6655
+		|| i.tm.base_opcode == 0x66df
+		|| i.tm.base_opcode == 0x57
+		|| i.tm.base_opcode == 0x6657
+		|| i.tm.base_opcode == 0x66ef)
+	       && i.tm.extension_opcode == None))
+    {
+      /* Optimize: -O2:
+	   VOP, one of vandnps, vandnpd, vxorps and vxorpd:
+	     EVEX VOP %zmmM, %zmmM, %zmmN
+	       -> VEX VOP %xmmM, %xmmM, %xmmN (M and N < 16)
+	       -> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
+	     EVEX VOP %ymmM, %ymmM, %ymmN
+	       -> VEX VOP %xmmM, %xmmM, %xmmN (M and N < 16)
+	       -> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
+	     VEX VOP %ymmM, %ymmM, %ymmN
+	       -> VEX VOP %xmmM, %xmmM, %xmmN
+	   VOP, one of vpandn and vpxor:
+	     VEX VOP %ymmM, %ymmM, %ymmN
+	       -> VEX VOP %xmmM, %xmmM, %xmmN
+	   VOP, one of vpandnd and vpandnq:
+	     EVEX VOP %zmmM, %zmmM, %zmmN
+	       -> VEX vpandn %xmmM, %xmmM, %xmmN (M and N < 16)
+	       -> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
+	     EVEX VOP %ymmM, %ymmM, %ymmN
+	       -> VEX vpandn %xmmM, %xmmM, %xmmN (M and N < 16)
+	       -> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
+	   VOP, one of vpxord and vpxorq:
+	     EVEX VOP %zmmM, %zmmM, %zmmN
+	       -> VEX vpxor %xmmM, %xmmM, %xmmN (M and N < 16)
+	       -> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
+	     EVEX vpandn %ymmM, %ymmM, %ymmN
+	       -> VEX vpxor %xmmM, %xmmM, %xmmN (M and N < 16)
+	       -> EVEX VOP %xmmM, %xmmM, %xmmN (M || N >= 16)
+       */
+      if (i.tm.opcode_modifier.evex)
+	{
+	  /* If only lower 16 vector registers are used, we can use
+	     VEX encoding.  */
+	  for (j = 0; j < 3; j++)
+	    if (register_number (i.op[j].regs) > 15)
+	      break;
+
+	  if (j < 3)
+	    i.tm.opcode_modifier.evex = EVEX128;
+	  else
+	    {
+	      i.tm.opcode_modifier.vex = VEX128;
+	      i.tm.opcode_modifier.vexw = VEXW0;
+	      i.tm.opcode_modifier.evex = 0;
+	    }
+	}
+      else
+	i.tm.opcode_modifier.vex = VEX128;
+
+      if (i.tm.opcode_modifier.vex)
+	for (j = 0; j < 3; j++)
+	  {
+	    i.types[j].bitfield.xmmword = 1;
+	    i.types[j].bitfield.ymmword = 0;
+	  }
+    }
+}
+
 /* This is the guts of the machine-dependent assembler.  LINE points to a
    machine dependent instruction.  This function is supposed to emit
    the frags/bytes it assembles to.  */
@@ -3877,6 +4115,9 @@ md_assemble (char *line)
       i.disp_operands = 0;
     }
 
+  if (optimize && !i.no_optimize && i.tm.opcode_modifier.optimize)
+    optimize_encoding ();
+
   if (!process_suffix ())
     return;
 
@@ -4131,6 +4372,10 @@ parse_insn (char *line, char *mnemonic)
 		  /* {rex} */
 		  i.rex_encoding = TRUE;
 		  break;
+		case 0x8:
+		  /* {nooptimize} */
+		  i.no_optimize = TRUE;
+		  break;
 		default:
 		  abort ();
 		}
@@ -10074,9 +10319,9 @@ md_operand (expressionS *e)
 
 
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
-const char *md_shortopts = "kVQ:sqn";
+const char *md_shortopts = "kVQ:sqnO::";
 #else
-const char *md_shortopts = "qn";
+const char *md_shortopts = "qnO::";
 #endif
 
 #define OPTION_32 (OPTION_MD_BASE + 0)
@@ -10513,6 +10758,27 @@ md_parse_option (int c, const char *arg)
       intel64 = 1;
       break;
 
+    case 'O':
+      if (arg == NULL)
+	{
+	  optimize = 1;
+	  /* Turn off -Os.  */
+	  optimize_for_space = 0;
+	}
+      else if (*arg == 's')
+	{
+	  optimize_for_space = 1;
+	  /* Turn on all encoding optimizations.  */
+	  optimize = -1;
+	}
+      else
+	{
+	  optimize = atoi (arg);
+	  /* Turn off -Os.  */
+	  optimize_for_space = 0;
+	}
+      break;
+
     default:
       return 0;
     }
diff --git a/gas/configure b/gas/configure
index fbac8f44d5..ce9cf5f245 100755
--- a/gas/configure
+++ b/gas/configure
@@ -774,6 +774,7 @@ enable_elf_stt_common
 enable_werror
 enable_build_warnings
 with_cpu
+with_optimization
 enable_nls
 enable_maintainer_mode
 with_system_zlib
@@ -1440,6 +1441,9 @@ Optional Packages:
   --with-gnu-ld           assume the C compiler uses GNU ld [default=no]
   --with-cpu=CPU          default cpu variant is CPU (currently only supported
                           on ARC)
+  --with-optimization=OPTION
+                          default optimization is OPTION (currently only
+                          supported on x86)
   --with-system-zlib      use installed libz
 
 Some influential environment variables:
@@ -10987,7 +10991,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 10990 "configure"
+#line 10994 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11093,7 +11097,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11096 "configure"
+#line 11100 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11917,6 +11921,15 @@ _ACEOF
 fi
 
 
+# PR gas/22871
+# Provide a configure time option to override our default.
+ac_default_optimization=unset
+
+# Check whether --with-optimization was given.
+if test "${with_optimization+set}" = set; then :
+  withval=$with_optimization; ac_default_optimization=$with_optimization
+fi
+
 # PR 14072
 
 
@@ -12680,6 +12693,17 @@ cat >>confdefs.h <<_ACEOF
 _ACEOF
 
 
+if test ${ac_default_optimization} = unset; then
+  ac_default_optimization=no
+fi
+if test ${ac_default_optimization} != no; then
+
+cat >>confdefs.h <<_ACEOF
+#define DEFAULT_OPTIMIZATION "$ac_default_optimization"
+_ACEOF
+
+fi
+
 if test ${ac_default_elf_stt_common} = unset; then
   ac_default_elf_stt_common=0
 fi
diff --git a/gas/configure.ac b/gas/configure.ac
index 043b5c810b..f4f8de265b 100644
--- a/gas/configure.ac
+++ b/gas/configure.ac
@@ -118,6 +118,18 @@ AC_ARG_WITH(cpu,
                                 [Target specific CPU.])],
             [])
 
+# PR gas/22871
+dnl Option --with-optimization=OPTION allows configure-time control of the
+dnl default optimization type within the assembler.  Currently only the
+dnl x88 target supports this feature, but others may be added in the
+dnl future.
+# Provide a configure time option to override our default.
+ac_default_optimization=unset
+AC_ARG_WITH(optimization,
+	    AS_HELP_STRING([--with-optimization=OPTION],
+            [default optimization is OPTION (currently only supported on x86)]),
+ac_default_optimization=$with_optimization)dnl
+
 # PR 14072
 AH_VERBATIM([00_CONFIG_H_CHECK],
 [/* Check that config.h is #included before system headers
@@ -602,6 +614,14 @@ AC_DEFINE_UNQUOTED(DEFAULT_GENERATE_X86_RELAX_RELOCATIONS,
   $ac_default_x86_relax_relocations,
   [Define to 1 if you want to generate x86 relax relocations by default.])
 
+if test ${ac_default_optimization} = unset; then
+  ac_default_optimization=no
+fi
+if test ${ac_default_optimization} != no; then
+  AC_DEFINE_UNQUOTED(DEFAULT_OPTIMIZATION, "$ac_default_optimization",
+    [Define default optimization.])
+fi
+
 if test ${ac_default_elf_stt_common} = unset; then
   ac_default_elf_stt_common=0
 fi
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index 6b2def0457..de9e97543d 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -411,6 +411,31 @@ with 01, 10 and 11 RC bits, respectively.
 This option specifies that the assembler should accept only AMD64 or
 Intel64 ISA in 64-bit mode.  The default is to accept both.
 
+@cindex @samp{-O0} option, i386
+@cindex @samp{-O0} option, x86-64
+@cindex @samp{-O} option, i386
+@cindex @samp{-O} option, x86-64
+@cindex @samp{-O1} option, i386
+@cindex @samp{-O1} option, x86-64
+@cindex @samp{-O2} option, i386
+@cindex @samp{-O2} option, x86-64
+@cindex @samp{-Os} option, i386
+@cindex @samp{-Os} option, x86-64
+@item -O0 | -O | -O1 | -O2 | -Os
+Optimize instruction encoding with smaller instruction size.  @samp{-O}
+and @samp{-O1} encode 64-bit register load instructions with 64-bit
+immediate as 32-bit register load instructions with 31-bit or 32-bits
+immediates and encode 64-bit register clearing instructions with 32-bit
+register clearing instructions.  @samp{-O2} includes @samp{-O1}
+optimization plus encodes 256-bit and 512-bit vector register clearing
+instructions with 128-bit vector register clearing instructions.
+@samp{-Os} includes @samp{-O2} optimization plus encodes 16-bit, 32-bit
+and 64-bit register tests with immediate as 8-bit register test with
+immediate.  @samp{-O0} turns off this optimization.
+
+The default optimization can be controlled by configure option
+@option{--with-optimization=[-O|-O1|-O2|-Os]}.
+
 @end table
 @c man end
 
@@ -647,6 +672,9 @@ Different encoding options can be specified via pseudo prefixes:
 @samp{@{rex@}} -- prefer REX prefix for integer and legacy vector
 instructions (x86-64 only).  Note that this differs from the @samp{rex}
 prefix which generates REX prefix unconditionally.
+
+@item
+@samp{@{nooptimize@}} -- disable instruction size optimization.
 @end itemize
 
 @cindex conversion instructions, i386
diff --git a/gas/testsuite/gas/cfi/cfi-x86_64.d b/gas/testsuite/gas/cfi/cfi-x86_64.d
index 900a5e5248..ad25c35fc3 100644
--- a/gas/testsuite/gas/cfi/cfi-x86_64.d
+++ b/gas/testsuite/gas/cfi/cfi-x86_64.d
@@ -1,3 +1,4 @@
+#as: -O0
 #objdump: -Wf
 #name: CFI on x86-64
 #...
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index 0ca49ea46f..fc3c243582 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -433,6 +433,9 @@ if [expr ([istarget "i*86-*-*"] ||  [istarget "x86_64-*-*"]) && [gas_32_check]]
     run_list_test "inval-pseudo" "-al"
     run_dump_test "nop-1"
     run_dump_test "nop-2"
+    run_dump_test "optimize-1"
+    run_dump_test "optimize-2"
+    run_dump_test "optimize-3"
 
     # These tests require support for 8 and 16 bit relocs,
     # so we only run them for ELF and COFF targets.
@@ -913,6 +916,10 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
     run_dump_test "x86-64-movd-intel"
     run_dump_test "x86-64-nop-1"
     run_dump_test "x86-64-nop-2"
+    run_dump_test "x86-64-optimize-1"
+    run_dump_test "x86-64-optimize-2"
+    run_dump_test "x86-64-optimize-3"
+    run_dump_test "x86-64-optimize-4"
 
     if { ![istarget "*-*-aix*"]
       && ![istarget "*-*-beos*"]
diff --git a/gas/testsuite/gas/i386/ilp32/cfi/cfi-x86_64.d b/gas/testsuite/gas/i386/ilp32/cfi/cfi-x86_64.d
index f2997b035a..c357286b90 100644
--- a/gas/testsuite/gas/i386/ilp32/cfi/cfi-x86_64.d
+++ b/gas/testsuite/gas/i386/ilp32/cfi/cfi-x86_64.d
@@ -1,4 +1,5 @@
 #source: ../../../cfi/cfi-x86_64.s
+#as: -O0
 #readelf: -wf
 #name: CFI on x86-64
 Contents of the .eh_frame section:
diff --git a/gas/testsuite/gas/i386/optimize-1.d b/gas/testsuite/gas/i386/optimize-1.d
new file mode 100644
index 0000000000..80b1e8325f
--- /dev/null
+++ b/gas/testsuite/gas/i386/optimize-1.d
@@ -0,0 +1,45 @@
+#as: -O2
+#objdump: -drw
+#name: optimized encoding 1 with -O2
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	62 f1 f5 4f 55 e9    	vandnpd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 af 55 e9    	vandnpd %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 f1 55 e9          	vandnpd %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f1 55 e9          	vandnpd %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 74 4f 55 e9    	vandnps %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 74 af 55 e9    	vandnps %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 f0 55 e9          	vandnps %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f0 55 e9          	vandnps %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f1 df e9          	vpandn %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f df e9    	vpandnd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 af df e9    	vpandnd %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 f1 df e9          	vpandn %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f1 df e9          	vpandn %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 f5 4f df e9    	vpandnq %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 af df e9    	vpandnq %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 f1 df e9          	vpandn %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f1 df e9          	vpandn %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 f5 4f 57 e9    	vxorpd %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 af 57 e9    	vxorpd %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 f1 57 e9          	vxorpd %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f1 57 e9          	vxorpd %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 74 4f 57 e9    	vxorps %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 74 af 57 e9    	vxorps %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 f0 57 e9          	vxorps %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f0 57 e9          	vxorps %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f1 ef e9          	vpxor  %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 75 4f ef e9    	vpxord %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 75 af ef e9    	vpxord %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 f1 ef e9          	vpxor  %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f1 ef e9          	vpxor  %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	62 f1 f5 4f ef e9    	vpxorq %zmm1,%zmm1,%zmm5\{%k7\}
+ +[a-f0-9]+:	62 f1 f5 af ef e9    	vpxorq %ymm1,%ymm1,%ymm5\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 f1 ef e9          	vpxor  %xmm1,%xmm1,%xmm5
+ +[a-f0-9]+:	c5 f1 ef e9          	vpxor  %xmm1,%xmm1,%xmm5
+#pass
diff --git a/gas/testsuite/gas/i386/optimize-1.s b/gas/testsuite/gas/i386/optimize-1.s
new file mode 100644
index 0000000000..042a0043cd
--- /dev/null
+++ b/gas/testsuite/gas/i386/optimize-1.s
@@ -0,0 +1,48 @@
+# Check instructions with optimized encoding
+
+	.allow_index_reg
+	.text
+_start:
+	vandnpd %zmm1, %zmm1, %zmm5{%k7}
+	vandnpd %ymm1, %ymm1, %ymm5{z}{%k7}
+	vandnpd %zmm1, %zmm1, %zmm5
+	vandnpd %ymm1, %ymm1, %ymm5
+
+	vandnps %zmm1, %zmm1, %zmm5{%k7}
+	vandnps %ymm1, %ymm1, %ymm5{z}{%k7}
+	vandnps %zmm1, %zmm1, %zmm5
+	vandnps %ymm1, %ymm1, %ymm5
+
+	vpandn %ymm1, %ymm1, %ymm5
+
+	vpandnd %zmm1, %zmm1, %zmm5{%k7}
+	vpandnd %ymm1, %ymm1, %ymm5{z}{%k7}
+	vpandnd %zmm1, %zmm1, %zmm5
+	vpandnd %ymm1, %ymm1, %ymm5
+
+	vpandnq %zmm1, %zmm1, %zmm5{%k7}
+	vpandnq %ymm1, %ymm1, %ymm5{z}{%k7}
+	vpandnq %zmm1, %zmm1, %zmm5
+	vpandnq %ymm1, %ymm1, %ymm5
+
+	vxorpd %zmm1, %zmm1, %zmm5{%k7}
+	vxorpd %ymm1, %ymm1, %ymm5{z}{%k7}
+	vxorpd %zmm1, %zmm1, %zmm5
+	vxorpd %ymm1, %ymm1, %ymm5
+
+	vxorps %zmm1, %zmm1, %zmm5{%k7}
+	vxorps %ymm1, %ymm1, %ymm5{z}{%k7}
+	vxorps %zmm1, %zmm1, %zmm5
+	vxorps %ymm1, %ymm1, %ymm5
+
+	vpxor %ymm1, %ymm1, %ymm5
+
+	vpxord %zmm1, %zmm1, %zmm5{%k7}
+	vpxord %ymm1, %ymm1, %ymm5{z}{%k7}
+	vpxord %zmm1, %zmm1, %zmm5
+	vpxord %ymm1, %ymm1, %ymm5
+
+	vpxorq %zmm1, %zmm1, %zmm5{%k7}
+	vpxorq %ymm1, %ymm1, %ymm5{z}{%k7}
+	vpxorq %zmm1, %zmm1, %zmm5
+	vpxorq %ymm1, %ymm1, %ymm5
diff --git a/gas/testsuite/gas/i386/optimize-2.d b/gas/testsuite/gas/i386/optimize-2.d
new file mode 100644
index 0000000000..ec989b0e13
--- /dev/null
+++ b/gas/testsuite/gas/i386/optimize-2.d
@@ -0,0 +1,19 @@
+#as: -Os
+#objdump: -drw
+#name: optimized encoding 2 with -Os
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	a8 7f                	test   \$0x7f,%al
+ +[a-f0-9]+:	a8 7f                	test   \$0x7f,%al
+ +[a-f0-9]+:	a8 7f                	test   \$0x7f,%al
+ +[a-f0-9]+:	f6 c3 7f             	test   \$0x7f,%bl
+ +[a-f0-9]+:	f6 c3 7f             	test   \$0x7f,%bl
+ +[a-f0-9]+:	f6 c3 7f             	test   \$0x7f,%bl
+ +[a-f0-9]+:	f7 c7 7f 00 00 00    	test   \$0x7f,%edi
+ +[a-f0-9]+:	66 f7 c7 7f 00       	test   \$0x7f,%di
+#pass
diff --git a/gas/testsuite/gas/i386/optimize-2.s b/gas/testsuite/gas/i386/optimize-2.s
new file mode 100644
index 0000000000..b427a741b9
--- /dev/null
+++ b/gas/testsuite/gas/i386/optimize-2.s
@@ -0,0 +1,13 @@
+# Check instructions with optimized encoding
+
+	.allow_index_reg
+	.text
+_start:
+	testl	$0x7f, %eax
+	testw	$0x7f, %ax
+	testb	$0x7f, %al
+	test	$0x7f, %ebx
+	test	$0x7f, %bx
+	test	$0x7f, %bl
+	test	$0x7f, %edi
+	test	$0x7f, %di
diff --git a/gas/testsuite/gas/i386/optimize-3.d b/gas/testsuite/gas/i386/optimize-3.d
new file mode 100644
index 0000000000..f251a3626d
--- /dev/null
+++ b/gas/testsuite/gas/i386/optimize-3.d
@@ -0,0 +1,12 @@
+#as: -Os
+#objdump: -drw
+#name: optimized encoding 3 with -Os
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	a9 7f 00 00 00       	test   \$0x7f,%eax
+#pass
diff --git a/gas/testsuite/gas/i386/optimize-3.s b/gas/testsuite/gas/i386/optimize-3.s
new file mode 100644
index 0000000000..536bf0cfb2
--- /dev/null
+++ b/gas/testsuite/gas/i386/optimize-3.s
@@ -0,0 +1,6 @@
+# Check instructions with optimized encoding
+
+	.allow_index_reg
+	.text
+_start:
+	{nooptimize} testl $0x7f, %eax
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-1.d b/gas/testsuite/gas/i386/x86-64-optimize-1.d
new file mode 100644
index 0000000000..506d586063
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-1.d
@@ -0,0 +1,53 @@
+#as: -O
+#objdump: -drw
+#name: x86-64 optimized encoding 1 with -O
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	48 25 00 00 00 00    	and    \$0x0,%rax	2: R_X86_64_32S	foo
+ +[a-f0-9]+:	25 ff ff ff 7f       	and    \$0x7fffffff,%eax
+ +[a-f0-9]+:	81 e3 ff ff ff 7f    	and    \$0x7fffffff,%ebx
+ +[a-f0-9]+:	41 81 e6 ff ff ff 7f 	and    \$0x7fffffff,%r14d
+ +[a-f0-9]+:	48 25 00 00 00 80    	and    \$0xffffffff80000000,%rax
+ +[a-f0-9]+:	48 81 e3 00 00 00 80 	and    \$0xffffffff80000000,%rbx
+ +[a-f0-9]+:	49 81 e6 00 00 00 80 	and    \$0xffffffff80000000,%r14
+ +[a-f0-9]+:	a9 ff ff ff 7f       	test   \$0x7fffffff,%eax
+ +[a-f0-9]+:	f7 c3 ff ff ff 7f    	test   \$0x7fffffff,%ebx
+ +[a-f0-9]+:	41 f7 c6 ff ff ff 7f 	test   \$0x7fffffff,%r14d
+ +[a-f0-9]+:	48 a9 00 00 00 80    	test   \$0xffffffff80000000,%rax
+ +[a-f0-9]+:	48 f7 c3 00 00 00 80 	test   \$0xffffffff80000000,%rbx
+ +[a-f0-9]+:	49 f7 c6 00 00 00 80 	test   \$0xffffffff80000000,%r14
+ +[a-f0-9]+:	48 33 06             	xor    \(%rsi\),%rax
+ +[a-f0-9]+:	31 c0                	xor    %eax,%eax
+ +[a-f0-9]+:	31 db                	xor    %ebx,%ebx
+ +[a-f0-9]+:	45 31 f6             	xor    %r14d,%r14d
+ +[a-f0-9]+:	48 31 d0             	xor    %rdx,%rax
+ +[a-f0-9]+:	48 31 d3             	xor    %rdx,%rbx
+ +[a-f0-9]+:	49 31 d6             	xor    %rdx,%r14
+ +[a-f0-9]+:	29 c0                	sub    %eax,%eax
+ +[a-f0-9]+:	29 db                	sub    %ebx,%ebx
+ +[a-f0-9]+:	45 29 f6             	sub    %r14d,%r14d
+ +[a-f0-9]+:	48 29 d0             	sub    %rdx,%rax
+ +[a-f0-9]+:	48 29 d3             	sub    %rdx,%rbx
+ +[a-f0-9]+:	49 29 d6             	sub    %rdx,%r14
+ +[a-f0-9]+:	48 81 20 ff ff ff 7f 	andq   \$0x7fffffff,\(%rax\)
+ +[a-f0-9]+:	48 81 20 00 00 00 80 	andq   \$0xffffffff80000000,\(%rax\)
+ +[a-f0-9]+:	48 f7 00 ff ff ff 7f 	testq  \$0x7fffffff,\(%rax\)
+ +[a-f0-9]+:	48 f7 00 00 00 00 80 	testq  \$0xffffffff80000000,\(%rax\)
+ +[a-f0-9]+:	b8 ff ff ff 7f       	mov    \$0x7fffffff,%eax
+ +[a-f0-9]+:	b8 ff ff ff 7f       	mov    \$0x7fffffff,%eax
+ +[a-f0-9]+:	41 b8 ff ff ff 7f    	mov    \$0x7fffffff,%r8d
+ +[a-f0-9]+:	41 b8 ff ff ff 7f    	mov    \$0x7fffffff,%r8d
+ +[a-f0-9]+:	b8 ff ff ff ff       	mov    \$0xffffffff,%eax
+ +[a-f0-9]+:	b8 ff ff ff ff       	mov    \$0xffffffff,%eax
+ +[a-f0-9]+:	41 b8 ff ff ff ff    	mov    \$0xffffffff,%r8d
+ +[a-f0-9]+:	41 b8 ff ff ff ff    	mov    \$0xffffffff,%r8d
+ +[a-f0-9]+:	b8 ff 03 00 00       	mov    \$0x3ff,%eax
+ +[a-f0-9]+:	b8 ff 03 00 00       	mov    \$0x3ff,%eax
+ +[a-f0-9]+:	48 b8 00 00 00 00 01 00 00 00 	movabs \$0x100000000,%rax
+ +[a-f0-9]+:	48 b8 00 00 00 00 01 00 00 00 	movabs \$0x100000000,%rax
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-1.s b/gas/testsuite/gas/i386/x86-64-optimize-1.s
new file mode 100644
index 0000000000..2c6dc6d8ec
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-1.s
@@ -0,0 +1,47 @@
+# Check 64bit instructions with optimized encoding
+
+	.allow_index_reg
+	.text
+_start:
+	andq	$foo, %rax
+	andq	$((1<<31) - 1), %rax
+	andq	$((1<<31) - 1), %rbx
+	andq	$((1<<31) - 1), %r14
+	andq	$-((1<<31)), %rax
+	andq	$-((1<<31)), %rbx
+	andq	$-((1<<31)), %r14
+	testq	$((1<<31) - 1), %rax
+	testq	$((1<<31) - 1), %rbx
+	testq	$((1<<31) - 1), %r14
+	testq	$-((1<<31)), %rax
+	testq	$-((1<<31)), %rbx
+	testq	$-((1<<31)), %r14
+	xorq	(%rsi), %rax
+	xorq	%rax, %rax
+	xorq	%rbx, %rbx
+	xorq	%r14, %r14
+	xorq	%rdx, %rax
+	xorq	%rdx, %rbx
+	xorq	%rdx, %r14
+	subq	%rax, %rax
+	subq	%rbx, %rbx
+	subq	%r14, %r14
+	subq	%rdx, %rax
+	subq	%rdx, %rbx
+	subq	%rdx, %r14
+	andq	$((1<<31) - 1), (%rax)
+	andq	$-((1<<31)), (%rax)
+	testq	$((1<<31) - 1), (%rax)
+	testq	$-((1<<31)), (%rax)
+	mov	$((1<<31) - 1),%rax
+	movq	$((1<<31) - 1),%rax
+	mov	$((1<<31) - 1),%r8
+	movq	$((1<<31) - 1),%r8
+	mov	$0xffffffff,%rax
+	movq	$0xffffffff,%rax
+	mov	$0xffffffff,%r8
+	movq	$0xffffffff,%r8
+	mov	$1023,%rax
+	movq	$1023,%rax
+	mov	$0x100000000,%rax
+	movq	$0x100000000,%rax
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-2.d b/gas/testsuite/gas/i386/x86-64-optimize-2.d
new file mode 100644
index 0000000000..f982f52ba7
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-2.d
@@ -0,0 +1,77 @@
+#as: -O2
+#objdump: -drw
+#name: x86-64 optimized encoding 2 with -O2
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	62 71 f5 4f 55 f9    	vandnpd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 af 55 f9    	vandnpd %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 71 55 f9          	vandnpd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	c5 71 55 f9          	vandnpd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 08 55 c1    	vandnpd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 e1 f5 08 55 c1    	vandnpd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 55 c9    	vandnpd %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 b1 f5 00 55 c9    	vandnpd %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 74 4f 55 f9    	vandnps %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 74 af 55 f9    	vandnps %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 70 55 f9          	vandnps %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	c5 70 55 f9          	vandnps %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 74 08 55 c1    	vandnps %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 e1 74 08 55 c1    	vandnps %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 74 00 55 c9    	vandnps %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 b1 74 00 55 c9    	vandnps %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 71 75 4f df f9    	vpandnd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 af df f9    	vpandnd %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 08 df c1    	vpandnd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 e1 75 08 df c1    	vpandnd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 df c9    	vpandnd %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 b1 75 00 df c9    	vpandnd %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 f5 4f df f9    	vpandnq %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 af df f9    	vpandnq %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	c5 71 df f9          	vpandn %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 08 df c1    	vpandnq %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 e1 f5 08 df c1    	vpandnq %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 df c9    	vpandnq %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 b1 f5 00 df c9    	vpandnq %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 f5 4f 57 f9    	vxorpd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 af 57 f9    	vxorpd %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 71 57 f9          	vxorpd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	c5 71 57 f9          	vxorpd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 08 57 c1    	vxorpd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 e1 f5 08 57 c1    	vxorpd %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 57 c9    	vxorpd %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 b1 f5 00 57 c9    	vxorpd %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 74 4f 57 f9    	vxorps %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 74 af 57 f9    	vxorps %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 70 57 f9          	vxorps %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	c5 70 57 f9          	vxorps %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 74 08 57 c1    	vxorps %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 e1 74 08 57 c1    	vxorps %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 74 00 57 c9    	vxorps %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 b1 74 00 57 c9    	vxorps %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 71 75 4f ef f9    	vpxord %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 75 af ef f9    	vpxord %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 75 08 ef c1    	vpxord %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 e1 75 08 ef c1    	vpxord %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 75 00 ef c9    	vpxord %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 b1 75 00 ef c9    	vpxord %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 71 f5 4f ef f9    	vpxorq %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+:	62 71 f5 af ef f9    	vpxorq %ymm1,%ymm1,%ymm15\{%k7\}\{z\}
+ +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	c5 71 ef f9          	vpxor  %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+:	62 e1 f5 08 ef c1    	vpxorq %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 e1 f5 08 ef c1    	vpxorq %xmm1,%xmm1,%xmm16
+ +[a-f0-9]+:	62 b1 f5 00 ef c9    	vpxorq %xmm17,%xmm17,%xmm1
+ +[a-f0-9]+:	62 b1 f5 00 ef c9    	vpxorq %xmm17,%xmm17,%xmm1
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-2.s b/gas/testsuite/gas/i386/x86-64-optimize-2.s
new file mode 100644
index 0000000000..6aa968b64e
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-2.s
@@ -0,0 +1,80 @@
+# Check 64bit instructions with optimized encoding
+
+	.allow_index_reg
+	.text
+_start:
+	vandnpd %zmm1, %zmm1, %zmm15{%k7}
+	vandnpd %ymm1, %ymm1, %ymm15{z}{%k7}
+	vandnpd %zmm1, %zmm1, %zmm15
+	vandnpd %ymm1, %ymm1, %ymm15
+	vandnpd %zmm1, %zmm1, %zmm16
+	vandnpd %ymm1, %ymm1, %ymm16
+	vandnpd %zmm17, %zmm17, %zmm1
+	vandnpd %ymm17, %ymm17, %ymm1
+
+	vandnps %zmm1, %zmm1, %zmm15{%k7}
+	vandnps %ymm1, %ymm1, %ymm15{z}{%k7}
+	vandnps %zmm1, %zmm1, %zmm15
+	vandnps %ymm1, %ymm1, %ymm15
+	vandnps %zmm1, %zmm1, %zmm16
+	vandnps %ymm1, %ymm1, %ymm16
+	vandnps %zmm17, %zmm17, %zmm1
+	vandnps %ymm17, %ymm17, %ymm1
+
+	vpandn %ymm1, %ymm1, %ymm15
+
+	vpandnd %zmm1, %zmm1, %zmm15{%k7}
+	vpandnd %ymm1, %ymm1, %ymm15{z}{%k7}
+	vpandnd %zmm1, %zmm1, %zmm15
+	vpandnd %ymm1, %ymm1, %ymm15
+	vpandnd %zmm1, %zmm1, %zmm16
+	vpandnd %ymm1, %ymm1, %ymm16
+	vpandnd %zmm17, %zmm17, %zmm1
+	vpandnd %ymm17, %ymm17, %ymm1
+
+	vpandnq %zmm1, %zmm1, %zmm15{%k7}
+	vpandnq %ymm1, %ymm1, %ymm15{z}{%k7}
+	vpandnq %zmm1, %zmm1, %zmm15
+	vpandnq %ymm1, %ymm1, %ymm15
+	vpandnq %zmm1, %zmm1, %zmm16
+	vpandnq %ymm1, %ymm1, %ymm16
+	vpandnq %zmm17, %zmm17, %zmm1
+	vpandnq %ymm17, %ymm17, %ymm1
+
+	vxorpd %zmm1, %zmm1, %zmm15{%k7}
+	vxorpd %ymm1, %ymm1, %ymm15{z}{%k7}
+	vxorpd %zmm1, %zmm1, %zmm15
+	vxorpd %ymm1, %ymm1, %ymm15
+	vxorpd %zmm1, %zmm1, %zmm16
+	vxorpd %ymm1, %ymm1, %ymm16
+	vxorpd %zmm17, %zmm17, %zmm1
+	vxorpd %ymm17, %ymm17, %ymm1
+
+	vxorps %zmm1, %zmm1, %zmm15{%k7}
+	vxorps %ymm1, %ymm1, %ymm15{z}{%k7}
+	vxorps %zmm1, %zmm1, %zmm15
+	vxorps %ymm1, %ymm1, %ymm15
+	vxorps %zmm1, %zmm1, %zmm16
+	vxorps %ymm1, %ymm1, %ymm16
+	vxorps %zmm17, %zmm17, %zmm1
+	vxorps %ymm17, %ymm17, %ymm1
+
+	vpxor %ymm1, %ymm1, %ymm15
+
+	vpxord %zmm1, %zmm1, %zmm15{%k7}
+	vpxord %ymm1, %ymm1, %ymm15{z}{%k7}
+	vpxord %zmm1, %zmm1, %zmm15
+	vpxord %ymm1, %ymm1, %ymm15
+	vpxord %zmm1, %zmm1, %zmm16
+	vpxord %ymm1, %ymm1, %ymm16
+	vpxord %zmm17, %zmm17, %zmm1
+	vpxord %ymm17, %ymm17, %ymm1
+
+	vpxorq %zmm1, %zmm1, %zmm15{%k7}
+	vpxorq %ymm1, %ymm1, %ymm15{z}{%k7}
+	vpxorq %zmm1, %zmm1, %zmm15
+	vpxorq %ymm1, %ymm1, %ymm15
+	vpxorq %zmm1, %zmm1, %zmm16
+	vpxorq %ymm1, %ymm1, %ymm16
+	vpxorq %zmm17, %zmm17, %zmm1
+	vpxorq %ymm17, %ymm17, %ymm1
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-3.d b/gas/testsuite/gas/i386/x86-64-optimize-3.d
new file mode 100644
index 0000000000..b46f728dd8
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-3.d
@@ -0,0 +1,27 @@
+#as: -Os
+#objdump: -drw
+#name: x86-64 optimized encoding 3 with -Os
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	a8 7f                	test   \$0x7f,%al
+ +[a-f0-9]+:	a8 7f                	test   \$0x7f,%al
+ +[a-f0-9]+:	a8 7f                	test   \$0x7f,%al
+ +[a-f0-9]+:	a8 7f                	test   \$0x7f,%al
+ +[a-f0-9]+:	f6 c3 7f             	test   \$0x7f,%bl
+ +[a-f0-9]+:	f6 c3 7f             	test   \$0x7f,%bl
+ +[a-f0-9]+:	f6 c3 7f             	test   \$0x7f,%bl
+ +[a-f0-9]+:	f6 c3 7f             	test   \$0x7f,%bl
+ +[a-f0-9]+:	40 f6 c7 7f          	test   \$0x7f,%dil
+ +[a-f0-9]+:	40 f6 c7 7f          	test   \$0x7f,%dil
+ +[a-f0-9]+:	40 f6 c7 7f          	test   \$0x7f,%dil
+ +[a-f0-9]+:	40 f6 c7 7f          	test   \$0x7f,%dil
+ +[a-f0-9]+:	41 f6 c1 7f          	test   \$0x7f,%r9b
+ +[a-f0-9]+:	41 f6 c1 7f          	test   \$0x7f,%r9b
+ +[a-f0-9]+:	41 f6 c1 7f          	test   \$0x7f,%r9b
+ +[a-f0-9]+:	41 f6 c1 7f          	test   \$0x7f,%r9b
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-3.s b/gas/testsuite/gas/i386/x86-64-optimize-3.s
new file mode 100644
index 0000000000..61c150a87c
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-3.s
@@ -0,0 +1,21 @@
+# Check 64bit instructions with optimized encoding
+
+	.allow_index_reg
+	.text
+_start:
+	testq	$0x7f, %rax
+	testl	$0x7f, %eax
+	testw	$0x7f, %ax
+	testb	$0x7f, %al
+	test	$0x7f, %rbx
+	test	$0x7f, %ebx
+	test	$0x7f, %bx
+	test	$0x7f, %bl
+	test	$0x7f, %rdi
+	test	$0x7f, %edi
+	test	$0x7f, %di
+	test	$0x7f, %dil
+	test	$0x7f, %r9
+	test	$0x7f, %r9d
+	test	$0x7f, %r9w
+	test	$0x7f, %r9b
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-4.d b/gas/testsuite/gas/i386/x86-64-optimize-4.d
new file mode 100644
index 0000000000..10e7b02d3a
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-4.d
@@ -0,0 +1,12 @@
+#as: -Os
+#objdump: -drw
+#name: x86-64 optimized encoding 4 with -Os
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+:	a9 7f 00 00 00       	test   \$0x7f,%eax
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-optimize-4.s b/gas/testsuite/gas/i386/x86-64-optimize-4.s
new file mode 100644
index 0000000000..0c4fdcecc5
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-4.s
@@ -0,0 +1,6 @@
+# Check 64bit instructions with optimized encoding
+
+	.allow_index_reg
+	.text
+_start:
+	{nooptimize} testl $0x7f, %eax
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 622d3e6639..669ac0c80e 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -646,6 +646,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (Disp8MemShift),
   BITFIELD (NoDefMask),
   BITFIELD (ImplicitQuadGroup),
+  BITFIELD (Optimize),
   BITFIELD (OldGcc),
   BITFIELD (ATTMnemonic),
   BITFIELD (ATTSyntax),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index 4af1d7deb7..ece3f47a6f 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -601,6 +601,9 @@ enum
    */
   ImplicitQuadGroup,
 
+  /* Support encoding optimization.  */
+  Optimize,
+
   /* Compatible with old (<= 2.8.1) versions of gcc  */
   OldGcc,
   /* AT&T mnemonic.  */
@@ -678,6 +681,7 @@ typedef struct i386_opcode_modifier
   unsigned int disp8memshift:3;
   unsigned int nodefmask:1;
   unsigned int implicitquadgroup:1;
+  unsigned int optimize:1;
   unsigned int oldgcc:1;
   unsigned int attmnemonic:1;
   unsigned int attsyntax:1;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index 450350556f..27ac376cbf 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -27,8 +27,8 @@ mov, 2, 0x88, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|HLEPrefixOk=3,
 // In the 64bit mode the short form mov immediate is redefined to have
 // 64bit value.
 mov, 2, 0xb0, None, 1, 0, W|ShortForm|No_sSuf|No_qSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32 }
-mov, 2, 0xc6, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf|HLEPrefixOk=3, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-mov, 2, 0xb0, None, 1, Cpu64, W|ShortForm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf, { Imm64, Reg64 }
+mov, 2, 0xc6, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf|HLEPrefixOk=3|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+mov, 2, 0xb0, None, 1, Cpu64, W|ShortForm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|Optimize, { Imm64, Reg64 }
 // The segment register moves accept WordReg so that a segment register
 // can be copied to a 32 bit register, and vice versa, without using a
 // size prefix.  When moving to a 32 bit register, the upper 16 bits
@@ -166,7 +166,7 @@ add, 2, 0x80, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8
 inc, 1, 0x40, None, 1, CpuNo64, ShortForm|No_bSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg16|Reg32 }
 inc, 1, 0xfe, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 
-sub, 2, 0x28, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+sub, 2, 0x28, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 sub, 2, 0x83, 0x5, 1, 0, Modrm|No_bSuf|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 sub, 2, 0x2c, None, 1, 0, W|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
 sub, 2, 0x80, 0x5, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
@@ -186,20 +186,20 @@ cmp, 2, 0x80, 0x7, 1, 0, W|Modrm|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Re
 
 test, 2, 0x84, None, 1, 0, W|CheckRegSize|Modrm|No_sSuf|No_ldSuf, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Unspecified|Byte|Word|Dword|Qword|BaseIndex }
 test, 2, 0x84, None, 1, 0, W|Modrm|No_sSuf|No_ldSuf, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
-test, 2, 0xa8, None, 1, 0, W|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
-test, 2, 0xf6, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+test, 2, 0xa8, None, 1, 0, W|No_sSuf|No_ldSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
+test, 2, 0xf6, 0x0, 1, 0, W|Modrm|No_sSuf|No_ldSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 
 and, 2, 0x20, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 and, 2, 0x83, 0x4, 1, 0, Modrm|No_bSuf|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
-and, 2, 0x24, None, 1, 0, W|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
-and, 2, 0x80, 0x4, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+and, 2, 0x24, None, 1, 0, W|No_sSuf|No_ldSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
+and, 2, 0x80, 0x4, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 
 or, 2, 0x8, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 or, 2, 0x83, 0x1, 1, 0, Modrm|No_bSuf|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 or, 2, 0xc, None, 1, 0, W|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
 or, 2, 0x80, 0x1, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 
-xor, 2, 0x30, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+xor, 2, 0x30, None, 1, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 xor, 2, 0x83, 0x6, 1, 0, Modrm|No_bSuf|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 xor, 2, 0x34, None, 1, 0, W|No_sSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
 xor, 2, 0x80, 0x6, 1, 0, W|Modrm|No_sSuf|No_ldSuf|IsLockable|HLEPrefixOk, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
@@ -830,6 +830,7 @@ rex.wrxb, 0, 0x4f, None, 1, Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ld
 {vex3}, 0, 0x5, None, 0, 0, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsPrefix, { 0 }
 {evex}, 0, 0x6, None, 0, 0, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsPrefix, { 0 }
 {rex}, 0, 0x7, None, 0, Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsPrefix, { 0 }
+{nooptimize}, 0, 0x8, None, 0, 0, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsPrefix, { 0 }
 
 // 486 extensions.
 
@@ -964,8 +965,8 @@ movd, 2, 0xf7e, None, 2, CpuMMX|Cpu64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|
 // we only mark constants larger than 32bit as Disp64.
 movq, 2, 0xa0, None, 1, Cpu64, D|W|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Disp64|Unspecified|Qword, Acc|Qword }
 movq, 2, 0x88, None, 1, Cpu64, D|W|Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixOk=3, { Reg64, Reg64|Unspecified|Qword|BaseIndex }
-movq, 2, 0xc6, 0x0, 1, Cpu64, W|Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixOk=3, { Imm32S, Reg64|Qword|Unspecified|BaseIndex }
-movq, 2, 0xb0, None, 1, Cpu64, W|ShortForm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm64, Reg64 }
+movq, 2, 0xc6, 0x0, 1, Cpu64, W|Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixOk=3|Optimize, { Imm32S, Reg64|Qword|Unspecified|BaseIndex }
+movq, 2, 0xb0, None, 1, Cpu64, W|ShortForm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Imm64, Reg64 }
 movq, 2, 0xf37e, None, 1, CpuAVX, Load|Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 movq, 2, 0x66d6, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
 movq, 2, 0x666e, None, 1, CpuAVX|Cpu64, Modrm|Vex=3|VexOpcode=0|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|Qword|Unspecified|BaseIndex, RegXMM }
@@ -1843,8 +1844,8 @@ vaddsd, 3, 0xf258, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|Ign
 vaddss, 3, 0xf358, None, 1, CpuAVX, Modrm|Vex=3|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vaddsubpd, 3, 0x66d0, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vaddsubps, 3, 0xf2d0, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vandnpd, 3, 0x6655, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vandnps, 3, 0x55, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vandnpd, 3, 0x6655, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vandnps, 3, 0x55, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vandpd, 3, 0x6654, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vandps, 3, 0x54, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vblendpd, 4, 0x660d, None, 1, CpuAVX, Modrm|Vex|VexOpcode=2|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
@@ -2275,8 +2276,8 @@ vunpckhpd, 3, 0x6615, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|Ch
 vunpckhps, 3, 0x15, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vunpcklpd, 3, 0x6614, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vunpcklps, 3, 0x14, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vxorpd, 3, 0x6657, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vxorps, 3, 0x57, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vxorpd, 3, 0x6657, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vxorps, 3, 0x57, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vzeroall, 0, 0x77, None, 1, CpuAVX, Vex=2|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 vzeroupper, 0, 0x77, None, 1, CpuAVX, Vex|VexOpcode=0|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { 0 }
 
@@ -2301,7 +2302,7 @@ vpaddusb, 3, 0x66dc, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|
 vpaddusw, 3, 0x66dd, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpalignr, 4, 0x660f, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpand, 3, 0x66db, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpandn, 3, 0x66df, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vpandn, 3, 0x66df, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpavgb, 3, 0x66e0, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpavgw, 3, 0x66e3, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpblendvb, 4, 0x664c, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=2|VexVVVV=1|VexSources=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|VexImmExt, { RegYMM, Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
@@ -2397,7 +2398,7 @@ vpunpcklbw, 3, 0x6660, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=
 vpunpckldq, 3, 0x6662, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpunpcklqdq, 3, 0x666c, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpunpcklwd, 3, 0x6661, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpxor, 3, 0x66ef, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vpxor, 3, 0x66ef, None, 1, CpuAVX2, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Ymmword|Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 
 // New AVX2 instructions.
 
@@ -3935,22 +3936,22 @@ vrsqrt14pd, 2, 0x664E, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=1|V
 
 vpaddd, 3, 0x66FE, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vpandd, 3, 0x66DB, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
-vpandnd, 3, 0x66DF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+vpandnd, 3, 0x66DF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vpord, 3, 0x66EB, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vpsubd, 3, 0x66FA, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vpunpckhdq, 3, 0x666A, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vpunpckldq, 3, 0x6662, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
-vpxord, 3, 0x66EF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+vpxord, 3, 0x66EF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 
 vpaddq, 3, 0x66D4, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
-vpandnq, 3, 0x66DF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+vpandnq, 3, 0x66DF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vpandq, 3, 0x66DB, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vpmuludq, 3, 0x66F4, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vporq, 3, 0x66EB, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vpsubq, 3, 0x66FB, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vpunpckhqdq, 3, 0x666D, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vpunpcklqdq, 3, 0x666C, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
-vpxorq, 3, 0x66EF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+vpxorq, 3, 0x66EF, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vunpckhpd, 3, 0x6615, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vunpcklpd, 3, 0x6614, None, 1, CpuAVX512F, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 
@@ -4897,7 +4898,7 @@ vpaddd, 3, 0x66FE, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOp
 vpandd, 3, 0x66DB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpandd, 3, 0x66DB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vpandnd, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpandnd, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
+vpandnd, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vpord, 3, 0x66EB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpord, 3, 0x66EB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vprold, 3, 0x6672, 1, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM, RegXMM }
@@ -4929,12 +4930,12 @@ vpunpckhdq, 3, 0x666A, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|V
 vpunpckldq, 3, 0x6662, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpunpckldq, 3, 0x6662, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vpxord, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpxord, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
+vpxord, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 
 vpaddq, 3, 0x66D4, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpaddq, 3, 0x66D4, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vpandnq, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpandnq, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
+vpandnq, 3, 0x66DF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vpandq, 3, 0x66DB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpandq, 3, 0x66DB, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vpmuludq, 3, 0x66F4, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
@@ -4968,7 +4969,7 @@ vpunpckhqdq, 3, 0x666D, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|
 vpunpcklqdq, 3, 0x666C, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vpunpcklqdq, 3, 0x666C, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vpxorq, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpxorq, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
+vpxorq, 3, 0x66EF, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vshufpd, 4, 0x66C6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vshufpd, 4, 0x66C6, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vunpckhpd, 3, 0x6615, None, 1, CpuAVX512F|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
@@ -5670,31 +5671,31 @@ ktestw, 2, 0x99, None, 1, CpuAVX512DQ, Modrm|Vex=1|VexOpcode=0|VexW=1|IgnoreSize
 kshiftlb, 3, 0x6632, None, 1, CpuAVX512DQ, Modrm|Vex=1|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegMask, RegMask }
 kshiftrb, 3, 0x6630, None, 1, CpuAVX512DQ, Modrm|Vex=1|VexOpcode=2|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegMask, RegMask }
 
-vandnpd, 3, 0x6655, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+vandnpd, 3, 0x6655, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vandnpd, 3, 0x6655, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vandnpd, 3, 0x6655, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
+vandnpd, 3, 0x6655, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vandpd, 3, 0x6654, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vandpd, 3, 0x6654, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vandpd, 3, 0x6654, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vorpd, 3, 0x6656, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vorpd, 3, 0x6656, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vorpd, 3, 0x6656, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
-vxorpd, 3, 0x6657, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+vxorpd, 3, 0x6657, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=2|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Qword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vxorpd, 3, 0x6657, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=4|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vxorpd, 3, 0x6657, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
+vxorpd, 3, 0x6657, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=2|VecESize=1|Broadcast=3|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Qword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 
-vandnps, 3, 0x55, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+vandnps, 3, 0x55, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vandnps, 3, 0x55, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vandnps, 3, 0x55, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
+vandnps, 3, 0x55, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vandps, 3, 0x54, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vandps, 3, 0x54, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vandps, 3, 0x54, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 vorps, 3, 0x56, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vorps, 3, 0x56, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vorps, 3, 0x56, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
-vxorps, 3, 0x57, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+vxorps, 3, 0x57, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=1|Disp8MemShift=6|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegZMM|Dword|ZMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 vxorps, 3, 0x57, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=3|Disp8MemShift=4|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Dword|XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vxorps, 3, 0x57, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
+vxorps, 3, 0x57, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=0|VexVVVV=1|VexW=1|Broadcast=2|Disp8MemShift=5|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegYMM|Dword|YMMword|Unspecified|BaseIndex, RegYMM, RegYMM }
 
 vbroadcastf32x2, 2, 0x6619, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=3|VexOpcode=1|VexW=1|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegZMM }
 vbroadcastf32x2, 2, 0x6619, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=3|VexOpcode=1|VexW=1|Disp8MemShift=3|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM }


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]