[PATCH 1/2] i386: Generate lfence with load/indirect branch/ret [CVE-2020-0551]

Hongtao Liu crazylht@gmail.com
Wed Apr 22 03:33:25 GMT 2020


On Tue, Apr 21, 2020 at 2:30 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 21.04.2020 04:24, Hongtao Liu wrote:
> > On Mon, Apr 20, 2020 at 3:34 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 20.04.2020 09:20, Hongtao Liu wrote:
> >>> On Thu, Apr 16, 2020 at 4:33 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> On 16.04.2020 07:34, Hongtao Liu wrote:
> >>>>> @@ -4506,6 +4520,22 @@ insert_lfence_after (void)
> >>>>> {
> >>>>>   if (lfence_after_load && load_insn_p ())
> >>>>>     {
> >>>>> +      /* Insert lfence after rep cmps/scas only under
> >>>>> +       -mlfence-after-load=all.  */
> >>>>> +      if (((i.tm.base_opcode | 0x1) == 0xa7
> >>>>> +         || (i.tm.base_opcode | 0x1) == 0xaf)
> >>>>> +        && i.prefix[REP_PREFIX])
> >>>>
> >>>> I'm afraid I don't understand why the REP forms need treating
> >>>> differently from the non-REP ones of the same insns.
> >>>>
> >>>
> >>> Not all REP forms, just REP CMPS/SCAS which would change EFLAGS.
> >>
> >> Well, of course just the two. But this doesn't answer my question
> >> as to why there is such a special case.
> >>
> >
> > There are also two REP string instructions that require special
> > treatment. Specifically, the compare string (CMPS) and scan string
> > (SCAS) instructions set EFLAGS in a manner that depends on the data
> > being compared/scanned. When used with a REP prefix, the number of
> > iterations may therefore vary depending on this data. If the data is a
> > program secret chosen by the adversary using an LVI method, then this
> > data-dependent behavior may leak some aspect of the secret. The
> > solution is to unfold any REP CMPS and REP SCAS operations into a loop
> > and insert an LFENCE after the CMPS/SCAS instruction. For example,
> > REPNZ SCAS can be unfolded to:
> >
> > .RepLoop:
> >   JRCXZ .ExitRepLoop
> >   DEC rcx  # or ecx if the REPNZ SCAS uses a 32-bit address size
> >   SCAS
> >   LFENCE
> >   JNZ .RepLoop
> > .ExitRepLoop:
> >   ...
> >
> > The request i get is to add options to handle or not handle REP
> > CMPS/SCAS also plus issue a warning.
>
> But you don't handle them as per what you've written above, afaics.
> Am I overlooking anything?
>

Well, that solution is not meant for gas; I included it here only to make
it easier to understand why REP CMPS/SCAS need to be handled
specially.
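
For illustration only, the special case can be sketched as a small standalone
C fragment. The enum and function names are hypothetical stand-ins, not the
identifiers used in tc-i386.c, and the caller is assumed to have already
established that the insn is a load:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the three -mlfence-after-load= settings.  */
enum lfence_load_kind { LOAD_NONE = 0, LOAD_GENERAL, LOAD_ALL };

/* CMPS is 0xa6/0xa7 and SCAS is 0xae/0xaf.  OR-ing in bit 0 folds the
   byte form and the word/dword form of each insn into one comparison.  */
static bool
is_cmps_or_scas (unsigned int opcode)
{
  return (opcode | 0x1) == 0xa7 || (opcode | 0x1) == 0xaf;
}

/* Return true if an lfence should follow a load insn.  REP CMPS/SCAS
   only get one under LOAD_ALL; every other load gets one under both
   LOAD_GENERAL and LOAD_ALL.  */
static bool
want_lfence_after_load (enum lfence_load_kind kind, unsigned int opcode,
                        bool has_rep)
{
  if (kind == LOAD_NONE)
    return false;
  if (has_rep && is_cmps_or_scas (opcode))
    return kind == LOAD_ALL;
  return true;
}
```

With these definitions, "repz scas" (0xaf with a REP prefix) gets an lfence
only under LOAD_ALL, while a plain "scas" is treated like any other load.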

> > @@ -647,7 +656,8 @@ static enum lfence_before_ret_kind
> >    {
> >      lfence_before_ret_none = 0,
> >      lfence_before_ret_not,
> > -    lfence_before_ret_or
> > +    lfence_before_ret_or,
> > +    lfence_before_ret_shl
> >    }
> >  lfence_before_ret;
> >
> > @@ -4350,22 +4360,28 @@ load_insn_p (void)
> >
> >    if (!any_vex_p)
> >      {
> > -      /* lea  */
> > -      if (i.tm.base_opcode == 0x8d)
> > +      /* Anysize insns: lea, invlpg, clflush, prefetchnta, prefetcht0,
> > + prefetcht1, prefetcht2, prefetchtw, bndmk, bndcl, bndcu, bndcn,
> > + bndstx, bndldx, prefetchwt1, clflushopt, clwb, cldemote.  */
>
> Bad indentation (also elsewhere, so this may be an issue with your
> mail client)?
>

Yes, tabs are dropped when pasting into Gmail (plain text mode).
I need to manually replace each tab with 8 spaces.

> > @@ -4536,8 +4577,8 @@ insert_lfence_before (void)
> >
> >        if (i.reg_operands == 1)
> >   {
> > -   /* Indirect branch via register.  Don't insert lfence with
> > -      -mlfence-after-load=yes.  */
> > +   /* Indirect branch via register. Insert lfence when
> > +      -mlfence-after-load=none.  */
> >     if (lfence_after_load
> >         || lfence_before_indirect_branch == lfence_branch_memory)
> >       return;
>
> The changed comment is awkward to read - the reader will almost
> certainly wonder why "none" implies an action. I think you either
> want to explain this further, or revert back to the original form
> by simply making it "... with -mlfence-after-load={all,general}."
>
Changed.
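
As a rough sketch of the condition that comment describes (hypothetical
names; the early return for "none" is an assumption, since that check is not
visible in the hunk):

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the -mlfence-before-indirect-branch= values.  */
enum branch_kind { BRANCH_NONE = 0, BRANCH_REGISTER, BRANCH_MEMORY, BRANCH_ALL };

/* Return true if an lfence should be emitted before an indirect branch
   via register.  AFTER_LOAD_KIND is non-zero for
   -mlfence-after-load={general,all}.  */
static bool
want_lfence_before_reg_branch (int after_load_kind, enum branch_kind kind)
{
  /* Assumed: nothing is emitted when the option is off.  */
  if (kind == BRANCH_NONE)
    return false;
  /* Mirrors the quoted condition: skip when loads are already fenced,
     or when only branches via memory were requested.  */
  if (after_load_kind != 0 || kind == BRANCH_MEMORY)
    return false;
  return true;
}
```

So the register form is fenced only for "register" or "all" combined with
-mlfence-after-load=none.
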
> > @@ -4568,12 +4609,13 @@ insert_lfence_before (void)
> >        return;
> >      }
> >
> > -  /* Output or/not and lfence before ret.  */
> > +  /* Output or/not/shl and lfence before ret/lret/iret.  */
> >    if (lfence_before_ret != lfence_before_ret_none
> >        && (i.tm.base_opcode == 0xc2
> >     || i.tm.base_opcode == 0xc3
> >     || i.tm.base_opcode == 0xca
> > -   || i.tm.base_opcode == 0xcb))
> > +   || i.tm.base_opcode == 0xcb
> > +   || i.tm.base_opcode == 0xcf))
> >      {
> >        if (last_insn.kind != last_insn_other
> >     && last_insn.seg == now_seg)
> > @@ -4583,33 +4625,50 @@ insert_lfence_before (void)
> >   last_insn.name, i.tm.name);
> >     return;
> >   }
> > -      if (lfence_before_ret == lfence_before_ret_or)
> > - {
> > -   /* orl: 0x830c2400.  */
> > -   p = frag_more ((flag_code == CODE_64BIT ? 1 : 0) + 4 + 3);
> > -   if (flag_code == CODE_64BIT)
> > -     *p++ = 0x48;
> > -   *p++ = 0x83;
> > -   *p++ = 0xc;
> > -   *p++ = 0x24;
> > -   *p++ = 0x0;
> > - }
> > -      else
> > +
> > +      char prefix = i.prefix[DATA_PREFIX] && !(i.prefix[REX_PREFIX] & REX_W)
> > + ? 0x66 : flag_code == CODE_64BIT ? 0x48 : 0x0;
>
> While this now looks better, it's tailored to near RET. Far RET
> as well as IRET default to 32-bit operand size in 64-bit mode.
> I can't tell how relevant it is to match effective operand size
> of the guarded and guarding insns.
>
Changed.
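
A minimal standalone sketch of the prefix selection as it appears in the diff
that follows (the function name and parameters are hypothetical; in gas the
inputs come from i.prefix[], i.tm.base_opcode and flag_code):

```c
#include <stdbool.h>

/* Pick the prefix byte for the or/not/shl that guards a return.
   Returns 0x48 (REX.W), 0x66 (operand-size override), or 0 for none.  */
static unsigned char
ret_guard_prefix (unsigned int opcode, bool has_rexw, bool has_data_prefix,
                  bool code64)
{
  /* Only 0xca/0xcb are treated as far returns here.  */
  bool lret = (opcode | 0x1) == 0xcb;

  if (has_rexw)
    return 0x48;
  if (has_data_prefix)
    return 0x66;
  /* Near return defaults to 64-bit operand size in 64-bit code.  */
  return (!lret && code64) ? 0x48 : 0x0;
}
```

Note that in this sketch, as in the diff, iret (0xcf) does not match the lret
test and therefore takes the near-return branch.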

Updated patch:

From 26d23fc18e090799872057bec21831c02e8b5d03 Mon Sep 17 00:00:00 2001
From: liuhongt <hongtao.liu@intel.com>
Date: Mon, 16 Mar 2020 11:03:12 +0800
Subject: [PATCH] Improve -mlfence-after-load

  1. Implicit load for POP/POPF/POPA/XLATB, no load for Anysize insns.
  2. Add -mlfence-before-ret=shl/yes, adjust operand size of
  or/not/shl according to ret's.
  3. Adjust -mlfence-after-load=[yes/no] to
  -mlfence-after-load=[none|general|all]. -mlfence-after-load=[none/all]
  equal original -mlfence-after-load=[no/yes],
  -mlfence-after-load=general won't add lfence after REP CMPS/SCAS
  since they would affect control flow behavior.
  -mlfence-after-load=all will issue a warning when adding lfence
  after REP CMPS/SCAS.
  4. Adjust testcases and documents.

gas/ChangeLog:
        * config/tc-i386.c (lfence_after_load): Deleted.
        (lfence_after_load_kind): New.
        (lfence_before_ret_shl): New member.
        (load_insn_p): Implicit load for POP/POPA/POPF/XLATB, no load
        for Anysize insns.
        (insert_lfence_after): Handle REP CMPS/SCAS specially.
        (insert_lfence_before): Handle iret, handle
        -mlfence-before-ret=shl, adjust operand size of or/not/shl to ret's.
        (md_parse_option): Change -mlfence-after-load=[yes|no] to
        -mlfence-after-load=[none|general|all], change
        -mlfence-before-ret=[none|not|or] to
        -mlfence-before-ret=[none|not|or|shl|yes].
        Enable -mlfence-before-ret=shl when
        -mlfence-before-indirect-branch=all and no explicit
        -mlfence-before-ret option.
        (md_show_usage): Ditto.
        * doc/c-i386.texi: Ditto.
        * testsuite/gas/i386/i386.exp: Add new testcases.
        * testsuite/gas/i386/lfence-load-b.d: New.
        * testsuite/gas/i386/lfence-load-b.e: New.
        * testsuite/gas/i386/lfence-load.d: Modified.
        * testsuite/gas/i386/lfence-load.e: New.
        * testsuite/gas/i386/lfence-load.s: Modified.
        * testsuite/gas/i386/lfence-ret-a.d: Modified.
        * testsuite/gas/i386/lfence-ret-b.d: Modified.
        * testsuite/gas/i386/lfence-ret-c.d: New.
        * testsuite/gas/i386/lfence-ret-d.d: New.
        * testsuite/gas/i386/lfence-ret.s: Modified.
        * testsuite/gas/i386/x86-64-lfence-load-b.d: New.
        * testsuite/gas/i386/x86-64-lfence-load.d: Modified.
        * testsuite/gas/i386/x86-64-lfence-load.s: Modified.
        * testsuite/gas/i386/x86-64-lfence-ret-a.d: Modified.
        * testsuite/gas/i386/x86-64-lfence-ret-b.d: Modified.
        * testsuite/gas/i386/x86-64-lfence-ret-c.d: New.
        * testsuite/gas/i386/x86-64-lfence-ret-d.d: New.
        * testsuite/gas/i386/x86-64-lfence-ret-e.d: New.
        * testsuite/gas/i386/x86-64-lfence-ret.e: New.
        * testsuite/gas/i386/x86-64-lfence-ret.s: New.
---
 gas/config/tc-i386.c                          | 154 +++++++++++++-----
 gas/doc/c-i386.texi                           |  31 ++--
 gas/testsuite/gas/i386/i386.exp               |   7 +
 gas/testsuite/gas/i386/lfence-load-b.d        | 137 ++++++++++++++++
 gas/testsuite/gas/i386/lfence-load-b.e        |   3 +
 gas/testsuite/gas/i386/lfence-load.d          |  30 +++-
 gas/testsuite/gas/i386/lfence-load.e          |   3 +
 gas/testsuite/gas/i386/lfence-load.s          |  20 +++
 gas/testsuite/gas/i386/lfence-ret-a.d         |  18 ++
 gas/testsuite/gas/i386/lfence-ret-b.d         |  24 +++
 gas/testsuite/gas/i386/lfence-ret-c.d         |  35 ++++
 gas/testsuite/gas/i386/lfence-ret-d.d         |  36 ++++
 gas/testsuite/gas/i386/lfence-ret.s           |   6 +
 gas/testsuite/gas/i386/x86-64-lfence-load-b.d | 137 ++++++++++++++++
 gas/testsuite/gas/i386/x86-64-lfence-load.d   |  28 +++-
 gas/testsuite/gas/i386/x86-64-lfence-load.s   |  19 +++
 gas/testsuite/gas/i386/x86-64-lfence-ret-a.d  |  33 +++-
 gas/testsuite/gas/i386/x86-64-lfence-ret-b.d  |  43 ++++-
 gas/testsuite/gas/i386/x86-64-lfence-ret-c.d  |  48 ++++++
 gas/testsuite/gas/i386/x86-64-lfence-ret-d.d  |  49 ++++++
 gas/testsuite/gas/i386/x86-64-lfence-ret-e.d  |  49 ++++++
 gas/testsuite/gas/i386/x86-64-lfence-ret.e    |   3 +
 gas/testsuite/gas/i386/x86-64-lfence-ret.s    |  14 ++
 23 files changed, 870 insertions(+), 57 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/lfence-load-b.d
 create mode 100644 gas/testsuite/gas/i386/lfence-load-b.e
 create mode 100644 gas/testsuite/gas/i386/lfence-load.e
 create mode 100644 gas/testsuite/gas/i386/lfence-ret-c.d
 create mode 100644 gas/testsuite/gas/i386/lfence-ret-d.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-lfence-load-b.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-lfence-ret-c.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-lfence-ret-d.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-lfence-ret-e.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-lfence-ret.e
 create mode 100644 gas/testsuite/gas/i386/x86-64-lfence-ret.s

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 093497becd..7454f2987f 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -629,8 +629,17 @@ static int omit_lock_prefix = 0;
    "lock addl $0, (%{re}sp)".  */
 static int avoid_fence = 0;

-/* 1 if lfence should be inserted after every load.  */
-static int lfence_after_load = 0;
+/* Non-zero if lfence should be inserted after load.
+   lfence_load_all will generate lfence for all load instructions,
+   lfence_load_general will generate lfence for all
+   load instructions except REP CMPS/SCAS.  */
+static enum lfence_after_load_kind
+  {
+   lfence_load_none = 0,
+   lfence_load_general,
+   lfence_load_all
+  }
+lfence_after_load;

 /* Non-zero if lfence should be inserted before indirect branch.  */
 static enum lfence_before_indirect_branch_kind
@@ -647,7 +656,8 @@ static enum lfence_before_ret_kind
   {
     lfence_before_ret_none = 0,
     lfence_before_ret_not,
-    lfence_before_ret_or
+    lfence_before_ret_or,
+    lfence_before_ret_shl
   }
 lfence_before_ret;

@@ -4350,22 +4360,28 @@ load_insn_p (void)

   if (!any_vex_p)
     {
-      /* lea  */
-      if (i.tm.base_opcode == 0x8d)
+      /* Anysize insns: lea, invlpg, clflush, prefetchnta, prefetcht0,
+         prefetcht1, prefetcht2, prefetchtw, bndmk, bndcl, bndcu, bndcn,
+         bndstx, bndldx, prefetchwt1, clflushopt, clwb, cldemote.  */
+      if (i.tm.opcode_modifier.anysize)
         return 0;

-      /* pop  */
-      if ((i.tm.base_opcode & ~7) == 0x58
-          || (i.tm.base_opcode == 0x8f && i.tm.extension_opcode == 0))
+      /* pop, popf, popa.   */
+      if (strcmp (i.tm.name, "pop") == 0
+          || i.tm.base_opcode == 0x9d
+          || i.tm.base_opcode == 0x61)
         return 1;

       /* movs, cmps, lods, scas.  */
       if ((i.tm.base_opcode | 0xb) == 0xaf)
         return 1;

-      /* outs */
-      if (base_opcode == 0x6f)
+      /* outs, xlatb.  */
+      if (base_opcode == 0x6f
+          || i.tm.base_opcode == 0xd7)
         return 1;
+      /* NB: For AMD-specific insns with implicit memory operands,
+         they're intentionally not covered.  */
     }

   /* No memory operand.  */
@@ -4506,6 +4522,31 @@ insert_lfence_after (void)
 {
   if (lfence_after_load && load_insn_p ())
     {
+      /* Insert lfence after rep cmps/scas only under
+         -mlfence-after-load=all.  */
+      /* There are also two REP string instructions that require
+         special treatment. Specifically, the compare string (CMPS)
+         and scan string (SCAS) instructions set EFLAGS in a manner
+         that depends on the data being compared/scanned. When used
+         with a REP prefix, the number of iterations may therefore
+         vary depending on this data. If the data is a program secret
+         chosen by the adversary using an LVI method,
+         then this data-dependent behavior may leak some aspect
+         of the secret.  */
+      if (((i.tm.base_opcode | 0x1) == 0xa7
+           || (i.tm.base_opcode | 0x1) == 0xaf)
+          && i.prefix[REP_PREFIX])
+        {
+          if (lfence_after_load == lfence_load_general)
+            {
+              as_warn (_("`%s` skips -mlfence-after-load=general"),
+                       i.tm.name);
+              return;
+            }
+          else
+            as_warn (_("`%s` changes flags which would affect control flow behavior"),
+                     i.tm.name);
+        }
       char *p = frag_more (3);
       *p++ = 0xf;
       *p++ = 0xae;
@@ -4536,8 +4577,8 @@ insert_lfence_before (void)

       if (i.reg_operands == 1)
         {
-          /* Indirect branch via register.  Don't insert lfence with
-             -mlfence-after-load=yes.  */
+          /* Indirect branch via register. Don't insert lfence with
+             -mlfence-after-load={general,all}.  */
           if (lfence_after_load
               || lfence_before_indirect_branch == lfence_branch_memory)
             return;
@@ -4568,12 +4609,13 @@ insert_lfence_before (void)
       return;
     }

-  /* Output or/not and lfence before ret.  */
+  /* Output or/not/shl and lfence before ret/lret/iret.  */
   if (lfence_before_ret != lfence_before_ret_none
       && (i.tm.base_opcode == 0xc2
           || i.tm.base_opcode == 0xc3
           || i.tm.base_opcode == 0xca
-          || i.tm.base_opcode == 0xcb))
+          || i.tm.base_opcode == 0xcb
+          || i.tm.base_opcode == 0xcf))
     {
       if (last_insn.kind != last_insn_other
           && last_insn.seg == now_seg)
@@ -4583,33 +4625,59 @@ insert_lfence_before (void)
                          last_insn.name, i.tm.name);
           return;
         }
-      if (lfence_before_ret == lfence_before_ret_or)
-        {
-          /* orl: 0x830c2400.  */
-          p = frag_more ((flag_code == CODE_64BIT ? 1 : 0) + 4 + 3);
-          if (flag_code == CODE_64BIT)
-            *p++ = 0x48;
-          *p++ = 0x83;
-          *p++ = 0xc;
-          *p++ = 0x24;
-          *p++ = 0x0;
-        }
+
+      bfd_boolean lret = (i.tm.base_opcode | 0x1) == 0xcb;
+      bfd_boolean has_rexw = i.prefix[REX_PREFIX] & REX_W;
+      char prefix = 0x0;
+      /* Default operand size for far return is 32 bits,
+         64 bits for near return.  */
+      if (has_rexw)
+        prefix = 0x48;
       else
+        prefix = i.prefix[DATA_PREFIX]
+                 ? 0x66
+                 : !lret && flag_code == CODE_64BIT ? 0x48 : 0x0;
+
+      if (lfence_before_ret == lfence_before_ret_not)
         {
-          p = frag_more ((flag_code == CODE_64BIT ? 2 : 0) + 6 + 3);
-          /* notl: 0xf71424.  */
-          if (flag_code == CODE_64BIT)
-            *p++ = 0x48;
+          /* not: 0xf71424, may add prefix
+             for operand size override or 64-bit code.  */
+          p = frag_more ((prefix ? 2 : 0) + 6 + 3);
+          if (prefix)
+            *p++ = prefix;
           *p++ = 0xf7;
           *p++ = 0x14;
           *p++ = 0x24;
-          /* notl: 0xf71424.  */
-          if (flag_code == CODE_64BIT)
-            *p++ = 0x48;
+          if (prefix)
+            *p++ = prefix;
           *p++ = 0xf7;
           *p++ = 0x14;
           *p++ = 0x24;
         }
+      else
+        {
+          p = frag_more ((prefix ? 1 : 0) + 4 + 3);
+          if (prefix)
+            *p++ = prefix;
+          if (lfence_before_ret == lfence_before_ret_or)
+            {
+              /* or: 0x830c2400, may add prefix
+                 for operand size override or 64-bit code.  */
+              *p++ = 0x83;
+              *p++ = 0x0c;
+            }
+          else
+            {
+              /* shl: 0xc1242400, may add prefix
+                 for operand size override or 64-bit code.  */
+              *p++ = 0xc1;
+              *p++ = 0x24;
+            }
+
+          *p++ = 0x24;
+          *p++ = 0x0;
+        }
+
       *p++ = 0xf;
       *p++ = 0xae;
       *p = 0xe8;
@@ -12985,17 +13053,23 @@ md_parse_option (int c, const char *arg)
       break;

     case OPTION_MLFENCE_AFTER_LOAD:
-      if (strcasecmp (arg, "yes") == 0)
-        lfence_after_load = 1;
-      else if (strcasecmp (arg, "no") == 0)
-        lfence_after_load = 0;
+      if (strcasecmp (arg, "general") == 0)
+        lfence_after_load = lfence_load_general;
+      else if (strcasecmp (arg, "all") == 0)
+        lfence_after_load = lfence_load_all;
+      else if (strcasecmp (arg, "none") == 0)
+        lfence_after_load = lfence_load_none;
       else
         as_fatal (_("invalid -mlfence-after-load= option: `%s'"), arg);
       break;

     case OPTION_MLFENCE_BEFORE_INDIRECT_BRANCH:
       if (strcasecmp (arg, "all") == 0)
-        lfence_before_indirect_branch = lfence_branch_all;
+        {
+          lfence_before_indirect_branch = lfence_branch_all;
+          if (lfence_before_ret == lfence_before_ret_none)
+            lfence_before_ret = lfence_before_ret_shl;
+        }
       else if (strcasecmp (arg, "memory") == 0)
         lfence_before_indirect_branch = lfence_branch_memory;
       else if (strcasecmp (arg, "register") == 0)
@@ -13012,6 +13086,8 @@ md_parse_option (int c, const char *arg)
         lfence_before_ret = lfence_before_ret_or;
       else if (strcasecmp (arg, "not") == 0)
         lfence_before_ret = lfence_before_ret_not;
+      else if (strcasecmp (arg, "shl") == 0 || strcasecmp (arg, "yes") == 0)
+        lfence_before_ret = lfence_before_ret_shl;
       else if (strcasecmp (arg, "none") == 0)
         lfence_before_ret = lfence_before_ret_none;
       else
@@ -13376,13 +13452,13 @@ md_show_usage (FILE *stream)
   -mbranches-within-32B-boundaries\n\
                           align branches within 32 byte boundary\n"));
   fprintf (stream, _("\
-  -mlfence-after-load=[no|yes] (default: no)\n\
+  -mlfence-after-load=[none|general|all] (default: none)\n\
                           generate lfence after load\n"));
   fprintf (stream, _("\
   -mlfence-before-indirect-branch=[none|all|register|memory] (default: none)\n\
                           generate lfence before indirect near branch\n"));
   fprintf (stream, _("\
-  -mlfence-before-ret=[none|or|not] (default: none)\n\
+  -mlfence-before-ret=[none|or|not|shl|yes] (default: none)\n\
                           generate lfence before ret\n"));
   fprintf (stream, _("\
   -mamd64                 accept only AMD64 ISA [default]\n"));
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index 628fb1ad5a..19a4bf874e 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -470,12 +470,15 @@ The default doesn't align branches.

 @cindex @samp{-mlfence-after-load=} option, i386
 @cindex @samp{-mlfence-after-load=} option, x86-64
-@item -mlfence-after-load=@var{no}
-@itemx -mlfence-after-load=@var{yes}
+@item -mlfence-after-load=@var{none}
+@itemx -mlfence-after-load=@var{general}
+@itemx -mlfence-after-load=@var{all}
 These options control whether the assembler should generate lfence
-after load instructions.  @option{-mlfence-after-load=@var{yes}} will
-generate lfence.  @option{-mlfence-after-load=@var{no}} will not generate
-lfence, which is the default.
+after load instructions.  @option{-mlfence-after-load=@var{all}} will
+generate lfence for all load instructions,
+@option{-mlfence-after-load=@var{general}} will generate lfence for all
+load instructions except rep cmps/scas, @option{-mlfence-after-load=@var{none}}
+will not generate lfence, which is the default.

 @cindex @samp{-mlfence-before-indirect-branch=} option, i386
 @cindex @samp{-mlfence-before-indirect-branch=} option, x86-64
@@ -488,28 +491,32 @@ before indirect near branch instructions.
 @option{-mlfence-before-indirect-branch=@var{all}} will generate lfence
 before indirect near branch via register and issue a warning before
 indirect near branch via memory.
+It also implicitly sets @option{-mlfence-before-ret=@var{shl}} when
+there's no explicit @option{-mlfence-before-ret=}.
 @option{-mlfence-before-indirect-branch=@var{register}} will generate
 lfence before indirect near branch via register.
 @option{-mlfence-before-indirect-branch=@var{memory}} will issue a
 warning before indirect near branch via memory.
 @option{-mlfence-before-indirect-branch=@var{none}} will not generate
-lfence nor issue warning, which is the default.  Note that lfence won't
-be generated before indirect near branch via register with
-@option{-mlfence-after-load=@var{yes}} since lfence will be generated
+lfence nor issue warning, which is the default.  Note that lfence is
+generated before indirect near branch via register only with
+@option{-mlfence-after-load=@var{none}}; otherwise lfence is already generated
 after loading branch target register.

 @cindex @samp{-mlfence-before-ret=} option, i386
 @cindex @samp{-mlfence-before-ret=} option, x86-64
 @item -mlfence-before-ret=@var{none}
+@itemx -mlfence-before-ret=@var{shl}
 @item -mlfence-before-ret=@var{or}
+@itemx -mlfence-before-ret=@var{yes}
 @itemx -mlfence-before-ret=@var{not}
 These options control whether the assembler should generate lfence
 before ret.  @option{-mlfence-before-ret=@var{or}} will generate
 generate or instruction with lfence.
-@option{-mlfence-before-ret=@var{not}} will generate not instruction
-with lfence.
-@option{-mlfence-before-ret=@var{none}} will not generate lfence,
-which is the default.
+@option{-mlfence-before-ret=@var{shl/yes}} will generate shl instruction
+with lfence. @option{-mlfence-before-ret=@var{not}} will generate not
+instruction with lfence. @option{-mlfence-before-ret=@var{none}} will not
+generate lfence, which is the default.

 @cindex @samp{-mx86-used-note=} option, i386
 @cindex @samp{-mx86-used-note=} option, x86-64
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index 9dacc11906..bb3897b9ad 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -530,11 +530,14 @@ if [expr ([istarget "i*86-*-*"] ||  [istarget "x86_64-*-*"]) && [gas_32_check]]
     run_dump_test "align-branch-8"
     run_dump_test "align-branch-9"
     run_dump_test "lfence-load"
+    run_dump_test "lfence-load-b"
     run_dump_test "lfence-indbr-a"
     run_dump_test "lfence-indbr-b"
     run_dump_test "lfence-indbr-c"
     run_dump_test "lfence-ret-a"
     run_dump_test "lfence-ret-b"
+    run_dump_test "lfence-ret-c"
+    run_dump_test "lfence-ret-d"
     run_dump_test "lfence-byte"

     # These tests require support for 8 and 16 bit relocs,
@@ -1117,11 +1120,15 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
     run_dump_test "x86-64-align-branch-8"
     run_dump_test "x86-64-align-branch-9"
     run_dump_test "x86-64-lfence-load"
+    run_dump_test "x86-64-lfence-load-b"
     run_dump_test "x86-64-lfence-indbr-a"
     run_dump_test "x86-64-lfence-indbr-b"
     run_dump_test "x86-64-lfence-indbr-c"
     run_dump_test "x86-64-lfence-ret-a"
     run_dump_test "x86-64-lfence-ret-b"
+    run_dump_test "x86-64-lfence-ret-c"
+    run_dump_test "x86-64-lfence-ret-d"
+    run_dump_test "x86-64-lfence-ret-e"
     run_dump_test "x86-64-lfence-byte"

     if { ![istarget "*-*-aix*"]
diff --git a/gas/testsuite/gas/i386/lfence-load-b.d b/gas/testsuite/gas/i386/lfence-load-b.d
new file mode 100644
index 0000000000..b4f7bc0f19
--- /dev/null
+++ b/gas/testsuite/gas/i386/lfence-load-b.d
@@ -0,0 +1,137 @@
+#source: lfence-load.s
+#as: -mlfence-after-load=general
+#objdump: -dw
+#warning_output: lfence-load-b.e
+#name: lfence-load-b
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+: c5 f8 ae 55 00        vldmxcsr 0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f 01 55 00          lgdtl  0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f c7 75 00          vmptrld 0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 0f c7 75 00        vmclear 0x0\(%ebp\)
+ +[a-f0-9]+: 66 0f 38 82 55 00    invpcid 0x0\(%ebp\),%edx
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f 01 7d 00          invlpg 0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae 7d 00          clflush 0x0\(%ebp\)
+ +[a-f0-9]+: 66 0f ae 7d 00        clflushopt 0x0\(%ebp\)
+ +[a-f0-9]+: 66 0f ae 75 00        clwb   0x0\(%ebp\)
+ +[a-f0-9]+: 0f 1c 45 00          cldemote 0x0\(%ebp\)
+ +[a-f0-9]+: f3 0f 1b 4d 00        bndmk  0x0\(%ebp\),%bnd1
+ +[a-f0-9]+: f3 0f 1a 4d 00        bndcl  0x0\(%ebp\),%bnd1
+ +[a-f0-9]+: f2 0f 1a 4d 00        bndcu  0x0\(%ebp\),%bnd1
+ +[a-f0-9]+: f2 0f 1b 4d 00        bndcn  0x0\(%ebp\),%bnd1
+ +[a-f0-9]+: 0f 1b 4d 00          bndstx %bnd1,0x0\(%ebp\)
+ +[a-f0-9]+: 0f 1a 4d 00          bndldx 0x0\(%ebp\),%bnd1
+ +[a-f0-9]+: 0f 18 4d 00          prefetcht0 0x0\(%ebp\)
+ +[a-f0-9]+: 0f 18 55 00          prefetcht1 0x0\(%ebp\)
+ +[a-f0-9]+: 0f 18 5d 00          prefetcht2 0x0\(%ebp\)
+ +[a-f0-9]+: 0f 0d 4d 00          prefetchw 0x0\(%ebp\)
+ +[a-f0-9]+: 1f                    pop    %ds
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 9d                    popf
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 61                    popa
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: d7                    xlat   %ds:\(%ebx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: d9 55 00              fsts   0x0\(%ebp\)
+ +[a-f0-9]+: d9 45 00              flds   0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: db 55 00              fistl  0x0\(%ebp\)
+ +[a-f0-9]+: df 55 00              fists  0x0\(%ebp\)
+ +[a-f0-9]+: db 45 00              fildl  0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: df 45 00              filds  0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 9b dd 75 00          fsave  0x0\(%ebp\)
+ +[a-f0-9]+: dd 65 00              frstor 0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: df 45 00              filds  0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: df 4d 00              fisttps 0x0\(%ebp\)
+ +[a-f0-9]+: d9 65 00              fldenv 0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 9b d9 75 00          fstenv 0x0\(%ebp\)
+ +[a-f0-9]+: d8 45 00              fadds  0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: d8 04 24              fadds  \(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: d8 c3                fadd   %st\(3\),%st
+ +[a-f0-9]+: d8 01                fadds  \(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: df 01                filds  \(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: df 11                fists  \(%ecx\)
+ +[a-f0-9]+: 0f ae 29              xrstor \(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f 18 01              prefetchnta \(%ecx\)
+ +[a-f0-9]+: 0f c7 09              cmpxchg8b \(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 41                    inc    %ecx
+ +[a-f0-9]+: 0f 01 10              lgdtl  \(%eax\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f 0f 66 02 b0        pfcmpeq 0x2\(%esi\),%mm4
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 8f 00                popl   \(%eax\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 58                    pop    %eax
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 d1 11              rclw   \(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 01 01 00 00 00    testl  \$0x1,\(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ff 01                incl   \(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 11                notl   \(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 31                divl   \(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 21                mull   \(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 39                idivl  \(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 29                imull  \(%ecx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 8d 04 40              lea    \(%eax,%eax,2\),%eax
+ +[a-f0-9]+: c9                    leave
+ +[a-f0-9]+: 6e                    outsb  %ds:\(%esi\),\(%dx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ac                    lods   %ds:\(%esi\),%al
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f3 a5                rep movsl %ds:\(%esi\),%es:\(%edi\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f3 af                repz scas %es:\(%edi\),%eax
+ +[a-f0-9]+: f3 a7                repz cmpsl %es:\(%edi\),%ds:\(%esi\)
+ +[a-f0-9]+: f3 ad                rep lods %ds:\(%esi\),%eax
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 83 00 01              addl   \$0x1,\(%eax\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f ba 20 01          btl    \$0x1,\(%eax\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f c1 03              xadd   %eax,\(%ebx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f c1 c3              xadd   %eax,%ebx
+ +[a-f0-9]+: 87 03                xchg   %eax,\(%ebx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 93                    xchg   %eax,%ebx
+ +[a-f0-9]+: 39 45 40              cmp    %eax,0x40\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 3b 45 40              cmp    0x40\(%ebp\),%eax
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 01 45 40              add    %eax,0x40\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 03 00                add    \(%eax\),%eax
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 85 45 40              test   %eax,0x40\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 85 45 40              test   %eax,0x40\(%ebp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+#pass
diff --git a/gas/testsuite/gas/i386/lfence-load-b.e b/gas/testsuite/gas/i386/lfence-load-b.e
new file mode 100644
index 0000000000..c394e02296
--- /dev/null
+++ b/gas/testsuite/gas/i386/lfence-load-b.e
@@ -0,0 +1,3 @@
+.*: Assembler messages:
+.*:??: Warning: `scas` skips -mlfence-after-load=general
+.*:??: Warning: `cmps` skips -mlfence-after-load=general
\ No newline at end of file
diff --git a/gas/testsuite/gas/i386/lfence-load.d b/gas/testsuite/gas/i386/lfence-load.d
index cd7e7f76df..273e302f38 100644
--- a/gas/testsuite/gas/i386/lfence-load.d
+++ b/gas/testsuite/gas/i386/lfence-load.d
@@ -1,6 +1,7 @@
-#as: -mlfence-after-load=yes
+#as: -mlfence-after-load=all
 #objdump: -dw
-#name: -mlfence-after-load=yes
+#warning_output: lfence-load.e
+#name: -mlfence-after-load=all

 .*: +file format .*

@@ -15,6 +16,31 @@ Disassembly of section .text:
  +[a-f0-9]+: 0f c7 75 00          vmptrld 0x0\(%ebp\)
  +[a-f0-9]+: 0f ae e8              lfence
  +[a-f0-9]+: 66 0f c7 75 00        vmclear 0x0\(%ebp\)
+ +[a-f0-9]+: 66 0f 38 82 55 00    invpcid 0x0\(%ebp\),%edx
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f 01 7d 00          invlpg 0x0\(%ebp\)
+ +[a-f0-9]+: 0f ae 7d 00          clflush 0x0\(%ebp\)
+ +[a-f0-9]+: 66 0f ae 7d 00        clflushopt 0x0\(%ebp\)
+ +[a-f0-9]+: 66 0f ae 75 00        clwb   0x0\(%ebp\)
+ +[a-f0-9]+: 0f 1c 45 00          cldemote 0x0\(%ebp\)
+ +[a-f0-9]+: f3 0f 1b 4d 00        bndmk  0x0\(%ebp\),%bnd1
+ +[a-f0-9]+: f3 0f 1a 4d 00        bndcl  0x0\(%ebp\),%bnd1
+ +[a-f0-9]+: f2 0f 1a 4d 00        bndcu  0x0\(%ebp\),%bnd1
+ +[a-f0-9]+: f2 0f 1b 4d 00        bndcn  0x0\(%ebp\),%bnd1
+ +[a-f0-9]+: 0f 1b 4d 00          bndstx %bnd1,0x0\(%ebp\)
+ +[a-f0-9]+: 0f 1a 4d 00          bndldx 0x0\(%ebp\),%bnd1
+ +[a-f0-9]+: 0f 18 4d 00          prefetcht0 0x0\(%ebp\)
+ +[a-f0-9]+: 0f 18 55 00          prefetcht1 0x0\(%ebp\)
+ +[a-f0-9]+: 0f 18 5d 00          prefetcht2 0x0\(%ebp\)
+ +[a-f0-9]+: 0f 0d 4d 00          prefetchw 0x0\(%ebp\)
+ +[a-f0-9]+: 1f                    pop    %ds
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 9d                    popf
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 61                    popa
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: d7                    xlat   %ds:\(%ebx\)
+ +[a-f0-9]+: 0f ae e8              lfence
  +[a-f0-9]+: d9 55 00              fsts   0x0\(%ebp\)
  +[a-f0-9]+: d9 45 00              flds   0x0\(%ebp\)
  +[a-f0-9]+: 0f ae e8              lfence
diff --git a/gas/testsuite/gas/i386/lfence-load.e b/gas/testsuite/gas/i386/lfence-load.e
new file mode 100644
index 0000000000..1ee49da7fd
--- /dev/null
+++ b/gas/testsuite/gas/i386/lfence-load.e
@@ -0,0 +1,3 @@
+.*: Assembler messages:
+.*:??: Warning: `scas` changes flags which would affect control flow behavior
+.*:??: Warning: `cmps` changes flags which would affect control flow behavior
diff --git a/gas/testsuite/gas/i386/lfence-load.s b/gas/testsuite/gas/i386/lfence-load.s
index b417ac644e..4b4aa1610b 100644
--- a/gas/testsuite/gas/i386/lfence-load.s
+++ b/gas/testsuite/gas/i386/lfence-load.s
@@ -4,6 +4,26 @@ _start:
  lgdt (%ebp)
  vmptrld (%ebp)
  vmclear (%ebp)
+ invpcid (%ebp), %edx
+ invlpg (%ebp)
+ clflush (%ebp)
+ clflushopt (%ebp)
+ clwb (%ebp)
+ cldemote (%ebp)
+ bndmk (%ebp), %bnd1
+ bndcl (%ebp), %bnd1
+ bndcu (%ebp), %bnd1
+ bndcn (%ebp), %bnd1
+ bndstx %bnd1, (%ebp)
+ bndldx (%ebp), %bnd1
+ prefetcht0 (%ebp)
+ prefetcht1 (%ebp)
+ prefetcht2 (%ebp)
+ prefetchw (%ebp)
+ pop %ds
+ popf
+ popa
+ xlatb (%ebx)
  fsts (%ebp)
  flds (%ebp)
  fistl (%ebp)
diff --git a/gas/testsuite/gas/i386/lfence-ret-a.d b/gas/testsuite/gas/i386/lfence-ret-a.d
index 719cf1b472..aa35857664 100644
--- a/gas/testsuite/gas/i386/lfence-ret-a.d
+++ b/gas/testsuite/gas/i386/lfence-ret-a.d
@@ -9,10 +9,28 @@
 Disassembly of section .text:

 0+ <_start>:
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c3                retw
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c2 14 00          retw   \$0x14
  +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%esp\)
  +[a-f0-9]+: 0f ae e8              lfence
  +[a-f0-9]+: c3                    ret
  +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%esp\)
  +[a-f0-9]+: 0f ae e8              lfence
  +[a-f0-9]+: c2 1e 00              ret    \$0x1e
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 cb                lretw
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 ca 28 00          lretw  \$0x28
+ +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: cb                    lret
+ +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ca 28 00              lret   \$0x28
 #pass
diff --git a/gas/testsuite/gas/i386/lfence-ret-b.d b/gas/testsuite/gas/i386/lfence-ret-b.d
index e3914b9c28..77001c425e 100644
--- a/gas/testsuite/gas/i386/lfence-ret-b.d
+++ b/gas/testsuite/gas/i386/lfence-ret-b.d
@@ -9,6 +9,14 @@
 Disassembly of section .text:

 0+ <_start>:
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%esp\)
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c3                retw
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%esp\)
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c2 14 00          retw   \$0x14
  +[a-f0-9]+: f7 14 24              notl   \(%esp\)
  +[a-f0-9]+: f7 14 24              notl   \(%esp\)
  +[a-f0-9]+: 0f ae e8              lfence
@@ -17,4 +25,20 @@ Disassembly of section .text:
  +[a-f0-9]+: f7 14 24              notl   \(%esp\)
  +[a-f0-9]+: 0f ae e8              lfence
  +[a-f0-9]+: c2 1e 00              ret    \$0x1e
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%esp\)
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 cb                lretw
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%esp\)
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 ca 28 00          lretw  \$0x28
+ +[a-f0-9]+: f7 14 24              notl   \(%esp\)
+ +[a-f0-9]+: f7 14 24              notl   \(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: cb                    lret
+ +[a-f0-9]+: f7 14 24              notl   \(%esp\)
+ +[a-f0-9]+: f7 14 24              notl   \(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ca 28 00              lret   \$0x28
 #pass
diff --git a/gas/testsuite/gas/i386/lfence-ret-c.d b/gas/testsuite/gas/i386/lfence-ret-c.d
new file mode 100644
index 0000000000..fceb0eb182
--- /dev/null
+++ b/gas/testsuite/gas/i386/lfence-ret-c.d
@@ -0,0 +1,35 @@
+#source: lfence-ret.s
+#as: -mlfence-before-ret=or -mlfence-before-indirect-branch=all
+#objdump: -dw
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c3                retw
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c2 14 00          retw   \$0x14
+ +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: c3                    ret
+ +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: c2 1e 00              ret    \$0x1e
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 cb                lretw
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 ca 28 00          lretw  \$0x28
+ +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: cb                    lret
+ +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ca 28 00              lret   \$0x28
+#pass
diff --git a/gas/testsuite/gas/i386/lfence-ret-d.d b/gas/testsuite/gas/i386/lfence-ret-d.d
new file mode 100644
index 0000000000..03f8f88fd7
--- /dev/null
+++ b/gas/testsuite/gas/i386/lfence-ret-d.d
@@ -0,0 +1,36 @@
+#source: lfence-ret.s
+#as: -mlfence-before-ret=shl
+#objdump: -dw
+#name: -mlfence-before-ret=shl
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c3                retw
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c2 14 00          retw   \$0x14
+ +[a-f0-9]+: c1 24 24 00          shll   \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: c3                    ret
+ +[a-f0-9]+: c1 24 24 00          shll   \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: c2 1e 00              ret    \$0x1e
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 cb                lretw
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 ca 28 00          lretw  \$0x28
+ +[a-f0-9]+: c1 24 24 00          shll   \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: cb                    lret
+ +[a-f0-9]+: c1 24 24 00          shll   \$0x0,\(%esp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ca 28 00              lret   \$0x28
+#pass
diff --git a/gas/testsuite/gas/i386/lfence-ret.s b/gas/testsuite/gas/i386/lfence-ret.s
index 35c4e6eeaa..f27fa5839e 100644
--- a/gas/testsuite/gas/i386/lfence-ret.s
+++ b/gas/testsuite/gas/i386/lfence-ret.s
@@ -1,4 +1,10 @@
  .text
 _start:
+ retw
+ retw $20
  ret
  ret $30
+ lretw
+ lretw $40
+ lret
+ lret $40
diff --git a/gas/testsuite/gas/i386/x86-64-lfence-load-b.d b/gas/testsuite/gas/i386/x86-64-lfence-load-b.d
new file mode 100644
index 0000000000..b1fd3cad42
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-lfence-load-b.d
@@ -0,0 +1,137 @@
+#source: x86-64-lfence-load.s
+#as: -mlfence-after-load=general
+#objdump: -dw
+#warning_output: lfence-load-b.e
+#name: x86-64 lfence-load-b
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+: c5 f8 ae 55 00        vldmxcsr 0x0\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f 01 55 00          lgdt   0x0\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f c7 75 00          vmptrld 0x0\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 0f c7 75 00        vmclear 0x0\(%rbp\)
+ +[a-f0-9]+: 66 0f 38 82 55 00    invpcid 0x0\(%rbp\),%rdx
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 67 0f 01 38          invlpg \(%eax\)
+ +[a-f0-9]+: 0f ae 7d 00          clflush 0x0\(%rbp\)
+ +[a-f0-9]+: 66 0f ae 7d 00        clflushopt 0x0\(%rbp\)
+ +[a-f0-9]+: 66 0f ae 75 00        clwb   0x0\(%rbp\)
+ +[a-f0-9]+: 0f 1c 45 00          cldemote 0x0\(%rbp\)
+ +[a-f0-9]+: f3 0f 1b 4d 00        bndmk  0x0\(%rbp\),%bnd1
+ +[a-f0-9]+: f3 0f 1a 4d 00        bndcl  0x0\(%rbp\),%bnd1
+ +[a-f0-9]+: f2 0f 1a 4d 00        bndcu  0x0\(%rbp\),%bnd1
+ +[a-f0-9]+: f2 0f 1b 4d 00        bndcn  0x0\(%rbp\),%bnd1
+ +[a-f0-9]+: 0f 1b 4d 00          bndstx %bnd1,0x0\(%rbp\)
+ +[a-f0-9]+: 0f 1a 4d 00          bndldx 0x0\(%rbp\),%bnd1
+ +[a-f0-9]+: 0f 18 4d 00          prefetcht0 0x0\(%rbp\)
+ +[a-f0-9]+: 0f 18 55 00          prefetcht1 0x0\(%rbp\)
+ +[a-f0-9]+: 0f 18 5d 00          prefetcht2 0x0\(%rbp\)
+ +[a-f0-9]+: 0f 0d 4d 00          prefetchw 0x0\(%rbp\)
+ +[a-f0-9]+: 0f a1                popq   %fs
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 9d                    popfq
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: d7                    xlat   %ds:\(%rbx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: d9 55 00              fsts   0x0\(%rbp\)
+ +[a-f0-9]+: d9 45 00              flds   0x0\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: db 55 00              fistl  0x0\(%rbp\)
+ +[a-f0-9]+: df 55 00              fists  0x0\(%rbp\)
+ +[a-f0-9]+: db 45 00              fildl  0x0\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: df 45 00              filds  0x0\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 9b dd 75 00          fsave  0x0\(%rbp\)
+ +[a-f0-9]+: dd 65 00              frstor 0x0\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: df 45 00              filds  0x0\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: df 4d 00              fisttps 0x0\(%rbp\)
+ +[a-f0-9]+: d9 65 00              fldenv 0x0\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 9b d9 75 00          fstenv 0x0\(%rbp\)
+ +[a-f0-9]+: d8 45 00              fadds  0x0\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: d8 04 24              fadds  \(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: d8 c3                fadd   %st\(3\),%st
+ +[a-f0-9]+: d8 01                fadds  \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: df 01                filds  \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: df 11                fists  \(%rcx\)
+ +[a-f0-9]+: 0f ae 29              xrstor \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f 18 01              prefetchnta \(%rcx\)
+ +[a-f0-9]+: 0f c7 09              cmpxchg8b \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 0f c7 09          cmpxchg16b \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ff c1                inc    %ecx
+ +[a-f0-9]+: 0f 01 10              lgdt   \(%rax\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 0f 0f 66 02 b0        pfcmpeq 0x2\(%rsi\),%mm4
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 8f 00                popq   \(%rax\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 58                    pop    %rax
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 d1 11              rclw   \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 01 01 00 00 00    testl  \$0x1,\(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ff 01                incl   \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 11                notl   \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 31                divl   \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 21                mull   \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 39                idivl  \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f7 29                imull  \(%rcx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 8d 04 40          lea    \(%rax,%rax,2\),%rax
+ +[a-f0-9]+: c9                    leaveq
+ +[a-f0-9]+: 6e                    outsb  %ds:\(%rsi\),\(%dx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ac                    lods   %ds:\(%rsi\),%al
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f3 a5                rep movsl %ds:\(%rsi\),%es:\(%rdi\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: f3 af                repz scas %es:\(%rdi\),%eax
+ +[a-f0-9]+: f3 a7                repz cmpsl %es:\(%rdi\),%ds:\(%rsi\)
+ +[a-f0-9]+: f3 ad                rep lods %ds:\(%rsi\),%eax
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 41 83 03 01          addl   \$0x1,\(%r11\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 41 0f ba 23 01        btl    \$0x1,\(%r11\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 0f c1 03          xadd   %rax,\(%rbx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 0f c1 c3          xadd   %rax,%rbx
+ +[a-f0-9]+: 48 87 03              xchg   %rax,\(%rbx\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 93                xchg   %rax,%rbx
+ +[a-f0-9]+: 48 39 45 40          cmp    %rax,0x40\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 3b 45 40          cmp    0x40\(%rbp\),%rax
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 01 45 40          add    %rax,0x40\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 03 00              add    \(%rax\),%rax
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 85 45 40          test   %rax,0x40\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 85 45 40          test   %rax,0x40\(%rbp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-lfence-load.d b/gas/testsuite/gas/i386/x86-64-lfence-load.d
index 4f6cd00edf..f21aba85d5 100644
--- a/gas/testsuite/gas/i386/x86-64-lfence-load.d
+++ b/gas/testsuite/gas/i386/x86-64-lfence-load.d
@@ -1,6 +1,7 @@
-#as: -mlfence-after-load=yes
+#as: -mlfence-after-load=all
 #objdump: -dw
-#name: x86-64 -mlfence-after-load=yes
+#warning_output: lfence-load.e
+#name: x86-64 -mlfence-after-load=all

 .*: +file format .*

@@ -15,6 +16,29 @@ Disassembly of section .text:
  +[a-f0-9]+: 0f c7 75 00          vmptrld 0x0\(%rbp\)
  +[a-f0-9]+: 0f ae e8              lfence
  +[a-f0-9]+: 66 0f c7 75 00        vmclear 0x0\(%rbp\)
+ +[a-f0-9]+: 66 0f 38 82 55 00    invpcid 0x0\(%rbp\),%rdx
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 67 0f 01 38          invlpg \(%eax\)
+ +[a-f0-9]+: 0f ae 7d 00          clflush 0x0\(%rbp\)
+ +[a-f0-9]+: 66 0f ae 7d 00        clflushopt 0x0\(%rbp\)
+ +[a-f0-9]+: 66 0f ae 75 00        clwb   0x0\(%rbp\)
+ +[a-f0-9]+: 0f 1c 45 00          cldemote 0x0\(%rbp\)
+ +[a-f0-9]+: f3 0f 1b 4d 00        bndmk  0x0\(%rbp\),%bnd1
+ +[a-f0-9]+: f3 0f 1a 4d 00        bndcl  0x0\(%rbp\),%bnd1
+ +[a-f0-9]+: f2 0f 1a 4d 00        bndcu  0x0\(%rbp\),%bnd1
+ +[a-f0-9]+: f2 0f 1b 4d 00        bndcn  0x0\(%rbp\),%bnd1
+ +[a-f0-9]+: 0f 1b 4d 00          bndstx %bnd1,0x0\(%rbp\)
+ +[a-f0-9]+: 0f 1a 4d 00          bndldx 0x0\(%rbp\),%bnd1
+ +[a-f0-9]+: 0f 18 4d 00          prefetcht0 0x0\(%rbp\)
+ +[a-f0-9]+: 0f 18 55 00          prefetcht1 0x0\(%rbp\)
+ +[a-f0-9]+: 0f 18 5d 00          prefetcht2 0x0\(%rbp\)
+ +[a-f0-9]+: 0f 0d 4d 00          prefetchw 0x0\(%rbp\)
+ +[a-f0-9]+: 0f a1                popq   %fs
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 9d                    popfq
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: d7                    xlat   %ds:\(%rbx\)
+ +[a-f0-9]+: 0f ae e8              lfence
  +[a-f0-9]+: d9 55 00              fsts   0x0\(%rbp\)
  +[a-f0-9]+: d9 45 00              flds   0x0\(%rbp\)
  +[a-f0-9]+: 0f ae e8              lfence
diff --git a/gas/testsuite/gas/i386/x86-64-lfence-load.s b/gas/testsuite/gas/i386/x86-64-lfence-load.s
index 76d0886617..2a3ac6b7d2 100644
--- a/gas/testsuite/gas/i386/x86-64-lfence-load.s
+++ b/gas/testsuite/gas/i386/x86-64-lfence-load.s
@@ -4,6 +4,25 @@ _start:
  lgdt (%rbp)
  vmptrld (%rbp)
  vmclear (%rbp)
+ invpcid (%rbp), %rdx
+ invlpg (%eax)
+ clflush (%rbp)
+ clflushopt (%rbp)
+ clwb (%rbp)
+ cldemote (%rbp)
+ bndmk (%rbp), %bnd1
+ bndcl (%rbp), %bnd1
+ bndcu (%rbp), %bnd1
+ bndcn (%rbp), %bnd1
+ bndstx %bnd1, (%rbp)
+ bndldx (%rbp), %bnd1
+ prefetcht0 (%rbp)
+ prefetcht1 (%rbp)
+ prefetcht2 (%rbp)
+ prefetchw (%rbp)
+ pop %fs
+ popf
+ xlatb (%rbx)
  fsts (%rbp)
  flds (%rbp)
  fistl (%rbp)
diff --git a/gas/testsuite/gas/i386/x86-64-lfence-ret-a.d b/gas/testsuite/gas/i386/x86-64-lfence-ret-a.d
index 26e5b48bec..d8e6fa059d 100644
--- a/gas/testsuite/gas/i386/x86-64-lfence-ret-a.d
+++ b/gas/testsuite/gas/i386/x86-64-lfence-ret-a.d
@@ -1,5 +1,6 @@
-#source: lfence-ret.s
+#source: x86-64-lfence-ret.s
 #as: -mlfence-before-ret=or
+#warning_output: x86-64-lfence-ret.e
 #objdump: -dw
 #name: x86-64 -mlfence-before-ret=or

@@ -9,10 +10,40 @@
 Disassembly of section .text:

 0+ <_start>:
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c3                retw
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c2 14 00          retw   \$0x14
  +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
  +[a-f0-9]+: 0f ae e8              lfence
  +[a-f0-9]+: c3                    retq
  +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
  +[a-f0-9]+: 0f ae e8              lfence
  +[a-f0-9]+: c2 1e 00              retq   \$0x1e
+ +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 48 c3              data16 rex.W retq
+ +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 48 c2 28 00        data16 rex.W retq \$0x28
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 cb                lretw
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 ca 28 00          lretw  \$0x28
+ +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: cb                    lret
+ +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ca 28 00              lret   \$0x28
+ +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 cb                lretq
+ +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 ca 28 00          lretq  \$0x28
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-lfence-ret-b.d b/gas/testsuite/gas/i386/x86-64-lfence-ret-b.d
index 340488831d..e9bb64fe94 100644
--- a/gas/testsuite/gas/i386/x86-64-lfence-ret-b.d
+++ b/gas/testsuite/gas/i386/x86-64-lfence-ret-b.d
@@ -1,5 +1,6 @@
-#source: lfence-ret.s
+#source: x86-64-lfence-ret.s
 #as: -mlfence-before-ret=not
+#warning_output: x86-64-lfence-ret.e
 #objdump: -dw
 #name: x86-64 -mlfence-before-ret=not

@@ -9,6 +10,14 @@
 Disassembly of section .text:

 0+ <_start>:
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%rsp\)
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c3                retw
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%rsp\)
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c2 14 00          retw   \$0x14
  +[a-f0-9]+: 48 f7 14 24          notq   \(%rsp\)
  +[a-f0-9]+: 48 f7 14 24          notq   \(%rsp\)
  +[a-f0-9]+: 0f ae e8              lfence
@@ -17,4 +26,36 @@ Disassembly of section .text:
  +[a-f0-9]+: 48 f7 14 24          notq   \(%rsp\)
  +[a-f0-9]+: 0f ae e8              lfence
  +[a-f0-9]+: c2 1e 00              retq   \$0x1e
+ +[a-f0-9]+: 48 f7 14 24          notq   \(%rsp\)
+ +[a-f0-9]+: 48 f7 14 24          notq   \(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 48 c3              data16 rex.W retq
+ +[a-f0-9]+: 48 f7 14 24          notq   \(%rsp\)
+ +[a-f0-9]+: 48 f7 14 24          notq   \(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 48 c2 28 00        data16 rex.W retq \$0x28
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%rsp\)
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 cb                lretw
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%rsp\)
+ +[a-f0-9]+: 66 f7 14 24          notw   \(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 ca 28 00          lretw  \$0x28
+ +[a-f0-9]+: f7 14 24              notl   \(%rsp\)
+ +[a-f0-9]+: f7 14 24              notl   \(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: cb                    lret
+ +[a-f0-9]+: f7 14 24              notl   \(%rsp\)
+ +[a-f0-9]+: f7 14 24              notl   \(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ca 28 00              lret   \$0x28
+ +[a-f0-9]+: 48 f7 14 24          notq   \(%rsp\)
+ +[a-f0-9]+: 48 f7 14 24          notq   \(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 cb                lretq
+ +[a-f0-9]+: 48 f7 14 24          notq   \(%rsp\)
+ +[a-f0-9]+: 48 f7 14 24          notq   \(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 ca 28 00          lretq  \$0x28
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-lfence-ret-c.d b/gas/testsuite/gas/i386/x86-64-lfence-ret-c.d
new file mode 100644
index 0000000000..d5027d385f
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-lfence-ret-c.d
@@ -0,0 +1,48 @@
+#source: x86-64-lfence-ret.s
+#as: -mlfence-before-ret=or -mlfence-before-indirect-branch=all
+#warning_output: x86-64-lfence-ret.e
+#objdump: -dw
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c3                retw
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c2 14 00          retw   \$0x14
+ +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: c3                    retq
+ +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: c2 1e 00              retq   \$0x1e
+ +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 48 c3              data16 rex.W retq
+ +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 48 c2 28 00        data16 rex.W retq \$0x28
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 cb                lretw
+ +[a-f0-9]+: 66 83 0c 24 00        orw    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 ca 28 00          lretw  \$0x28
+ +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: cb                    lret
+ +[a-f0-9]+: 83 0c 24 00          orl    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ca 28 00              lret   \$0x28
+ +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 cb                lretq
+ +[a-f0-9]+: 48 83 0c 24 00        orq    \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 ca 28 00          lretq  \$0x28
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-lfence-ret-d.d b/gas/testsuite/gas/i386/x86-64-lfence-ret-d.d
new file mode 100644
index 0000000000..533445fee6
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-lfence-ret-d.d
@@ -0,0 +1,49 @@
+#source: x86-64-lfence-ret.s
+#as: -mlfence-before-ret=shl
+#warning_output: x86-64-lfence-ret.e
+#objdump: -dw
+#name: x86-64 -mlfence-before-ret=shl
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c3                retw
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c2 14 00          retw   \$0x14
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: c3                    retq
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: c2 1e 00              retq   \$0x1e
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 48 c3              data16 rex.W retq
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 48 c2 28 00        data16 rex.W retq \$0x28
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 cb                lretw
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 ca 28 00          lretw  \$0x28
+ +[a-f0-9]+: c1 24 24 00          shll   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: cb                    lret
+ +[a-f0-9]+: c1 24 24 00          shll   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ca 28 00              lret   \$0x28
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 cb                lretq
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 ca 28 00          lretq  \$0x28
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-lfence-ret-e.d b/gas/testsuite/gas/i386/x86-64-lfence-ret-e.d
new file mode 100644
index 0000000000..646b352a62
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-lfence-ret-e.d
@@ -0,0 +1,49 @@
+#source: x86-64-lfence-ret.s
+#as: -mlfence-before-ret=shl
+#warning_output: x86-64-lfence-ret.e
+#objdump: -dw
+#name: x86-64 -mlfence-before-ret=yes
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c3                retw
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 c2 14 00          retw   \$0x14
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: c3                    retq
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: c2 1e 00              retq   \$0x1e
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 48 c3              data16 rex.W retq
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 48 c2 28 00        data16 rex.W retq \$0x28
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 cb                lretw
+ +[a-f0-9]+: 66 c1 24 24 00        shlw   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 66 ca 28 00          lretw  \$0x28
+ +[a-f0-9]+: c1 24 24 00          shll   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: cb                    lret
+ +[a-f0-9]+: c1 24 24 00          shll   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: ca 28 00              lret   \$0x28
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 cb                lretq
+ +[a-f0-9]+: 48 c1 24 24 00        shlq   \$0x0,\(%rsp\)
+ +[a-f0-9]+: 0f ae e8              lfence
+ +[a-f0-9]+: 48 ca 28 00          lretq  \$0x28
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-lfence-ret.e b/gas/testsuite/gas/i386/x86-64-lfence-ret.e
new file mode 100644
index 0000000000..13730e50e6
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-lfence-ret.e
@@ -0,0 +1,3 @@
+.*: Assembler messages:
+.*:??: Warning: no instruction mnemonic suffix given and no register operands; using default for `lret'
+.*:??: Warning: no instruction mnemonic suffix given and no register operands; using default for `lret'
diff --git a/gas/testsuite/gas/i386/x86-64-lfence-ret.s b/gas/testsuite/gas/i386/x86-64-lfence-ret.s
new file mode 100644
index 0000000000..986239c222
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-lfence-ret.s
@@ -0,0 +1,14 @@
+ .text
+_start:
+ retw
+ retw $20
+ ret
+ ret $30
+ data16 rex.w ret
+ data16 rex.w ret $40
+ lretw
+ lretw $40
+ lret
+ lret $40
+ lretq
+ lretq $40
-- 
2.18.1
> > +      if (lfence_before_ret == lfence_before_ret_not)
> >   {
> > -   p = frag_more ((flag_code == CODE_64BIT ? 2 : 0) + 6 + 3);
> > -   /* notl: 0xf71424.  */
> > -   if (flag_code == CODE_64BIT)
> > -     *p++ = 0x48;
> > +   /* notl: 0xf71424, may add prefix
> > +      for operand size overwrite or 64-bit code.  */
>
> Despite the comment extension you still say "notl". Please switch
> to either just "not" or something like "not{w,l,q}". Also
> s/overwrite/override/. Note how you ...
>
> > +   p = frag_more ((prefix ? 2 : 0) + 6 + 3);
> > +   if (prefix)
> > +     *p++ = prefix;
> >     *p++ = 0xf7;
> >     *p++ = 0x14;
> >     *p++ = 0x24;
> > -   /* notl: 0xf71424.  */
> > -   if (flag_code == CODE_64BIT)
> > -     *p++ = 0x48;
> > +   if (prefix)
> > +     *p++ = prefix;
> >     *p++ = 0xf7;
> >     *p++ = 0x14;
> >     *p++ = 0x24;
> >   }
> > +      else
> > + {
> > +   p = frag_more ((prefix ? 1 : 0) + 4 + 3);
> > +   if (prefix)
> > +     *p++ = prefix;
> > +   if (lfence_before_ret == lfence_before_ret_or)
> > +     {
> > +       /* orl: 0x830c2400, may add prefix
> > + for operand size overwrite or 64-bit code.  */
>
> ... also have the same (bogus) suffix here, but ...
>
> > +       *p++ = 0x83;
> > +       *p++ = 0x0c;
> > +     }
> > +   else
> > +     {
> > +       /* shl: 0xc1242400, may add prefix
> > + for operand size overwrite or 64-bit code.  */
>
> ... not here.
>
> Jan
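
For readers following the hunk quoted above, here is a minimal standalone
sketch of the byte sequences it produces in front of a ret. It is not the
gas code: it writes into a plain buffer instead of calling frag_more, and
the names ret_fence_kind, RET_FENCE_* and emit_lfence_before_ret are
hypothetical, chosen only for this illustration. The prefix argument is
assumed to be 0 (none), 0x66 (operand-size override) or 0x48 (REX.W), as in
the test expectations of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for this sketch; not the identifiers used in gas.  */
enum ret_fence_kind { RET_FENCE_NOT, RET_FENCE_OR, RET_FENCE_SHL };

/* Write the bytes emitted in front of a ret into BUF and return how many
   were written.  PREFIX is 0 (none), 0x66 or 0x48.  */
static size_t
emit_lfence_before_ret (unsigned char *buf, size_t cap,
                        enum ret_fence_kind kind, unsigned char prefix)
{
  static const unsigned char lfence[] = { 0x0f, 0xae, 0xe8 };
  unsigned char *p = buf;
  int i;

  /* Worst case: two prefixed not insns (2 * 4 bytes) plus lfence.  */
  assert (cap >= 11);

  if (kind == RET_FENCE_NOT)
    {
      /* not{w,l,q} (%esp or %rsp), twice, so the return address on the
         stack ends up unchanged.  Each copy carries its own prefix.  */
      for (i = 0; i < 2; i++)
        {
          if (prefix)
            *p++ = prefix;
          *p++ = 0xf7;
          *p++ = 0x14;
          *p++ = 0x24;
        }
    }
  else
    {
      if (prefix)
        *p++ = prefix;
      if (kind == RET_FENCE_OR)
        {
          /* or{w,l,q} $0x0,(%esp or %rsp): 83 0c 24 00.  */
          *p++ = 0x83;
          *p++ = 0x0c;
        }
      else
        {
          /* shl{w,l,q} $0x0,(%esp or %rsp): c1 24 24 00.  */
          *p++ = 0xc1;
          *p++ = 0x24;
        }
      *p++ = 0x24;   /* SIB byte selecting the stack pointer as base.  */
      *p++ = 0x00;   /* imm8 of zero, so the stored value is unchanged.  */
    }

  memcpy (p, lfence, sizeof lfence);
  p += sizeof lfence;
  return (size_t) (p - buf);
}
```

Called with RET_FENCE_OR and prefix 0x66 this yields 66 83 0c 24 00 0f ae e8,
which is the orw/lfence pair the lfence-ret-a.d expectations show before
retw. With RET_FENCE_NOT the prefix is counted twice, which matches the
(prefix ? 2 : 0) term in the frag_more call of the quoted hunk.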



--
BR,
Hongtao

