Bug 18638

Summary: Wrong code generated for VMOVQ and VGATHER... on x86 CPU
Product: binutils
Reporter: Michael Rolle <m>
Component: gas
Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED WORKSFORME
Severity: critical
CC: hjl.tools
Priority: P2
Version: 2.26
Target Milestone: 2.26
Host:
Target:
Build:
Last reconfirmed:

Description Michael Rolle 2015-07-08 02:32:03 UTC
Two issues here, both with the binary code generated by gas.  These are seen in the testsuite files.

(1) vmovq rcx, xmm4

Generates C4 E1 FD 7E E1.
The VEX.L bit (the 4's bit of the third byte) is 1.  However, the AMD spec says
you must have VEX.L = 0, and in fact, VEX.L = 1 causes a #UD exception.
The Intel documentation says the same thing.
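
The bit in question can be checked by splitting the 3-byte VEX prefix into its fields. The following is a minimal sketch, not part of binutils; the function name decode_vex3 is made up for illustration. It decodes the bytes reported above (C4 E1 FD) next to the bytes that objdump prints in comment 1 (C4 E1 F9).

```python
def decode_vex3(encoding):
    """Split a 3-byte VEX prefix (C4 b1 b2) into its named fields.

    decode_vex3 is a hypothetical helper for this report, not a binutils API.
    """
    if len(encoding) < 3 or encoding[0] != 0xC4:
        raise ValueError("not a 3-byte VEX prefix")
    b1, b2 = encoding[1], encoding[2]
    return {
        "R": (b1 >> 7) & 1,        # stored inverted
        "X": (b1 >> 6) & 1,        # stored inverted
        "B": (b1 >> 5) & 1,        # stored inverted
        "mmmmm": b1 & 0x1F,        # opcode map select
        "W": (b2 >> 7) & 1,
        "vvvv": (b2 >> 3) & 0xF,   # stored inverted
        "L": (b2 >> 2) & 1,        # vector length bit, mask 0x04
        "pp": b2 & 0x3,            # implied legacy prefix
    }


reported = decode_vex3(bytes.fromhex("c4e1fd7ee1"))  # bytes from this report
current = decode_vex3(bytes.fromhex("c4e1f97ee1"))   # bytes shown in comment 1

print("reported L =", reported["L"])
print("current  L =", current["L"])
print("differing bits: 0x%02x" % (0xFD ^ 0xF9))
```

The two third bytes differ only in the bit with mask 0x04, which is where VEX.L sits; W, vvvv and pp are the same in both encodings.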

(2) vgather...

Generates a ModR/M byte with the r/m field = 000b.  However, the Intel spec says
the instruction will "cause a #UD if the memory operand is encoded without the SIB byte".  I interpret this to mean that the r/m field must be 100b.  The AMD spec says specifically that there is a #UD if "MODRM.rm != 100b".

I've also filed an issue suggesting that the assembled code from the testsuite be actually executed on the appropriate CPU.
Comment 1 H.J. Lu 2015-07-09 19:20:03 UTC
(In reply to Michael Rolle from comment #0)
> Two issues here, both with the binary code generated by gas.  These are seen
> in the testsuite files.
> 
> (1) vmovq rcx, xmm4
> 
> Generates C4 E1 FD 7E E1.
> The VEX.L bit (the 4's bit of the third byte) is 1.  However, the AMD spec
> says you must have VEX.L = 0, and in fact, VEX.L = 1 causes a #UD exception.
> Intel document says the same thing.

With the binutils master branch, I got:

[hjl@gnu-6 tmp]$ gcc -c a.s    
[hjl@gnu-6 tmp]$ objdump -dw a.o

a.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <.text>:
   0:	c4 e1 f9 7e e1       	vmovq  %xmm4,%rcx
[hjl@gnu-6 tmp]$ 

It is F9, not FD.

> (2) vgather...
> 
> Generates a ModR/M byte with the r/m field = 000b.  However, the Intel spec
> says the instruction will "cause a #UD if the memory operand is encoded without the SIB
> byte".  I interpret this to mean that the r/m field must be 100b.  AMD spec
> says specifically that there is a #UD if "MODRM.rm != 100b".

Do you have a testcase?
Comment 2 H.J. Lu 2015-07-14 16:29:47 UTC
Works for me.
Comment 3 Michael Rolle 2015-07-21 05:20:35 UTC
In the second case, it was my mistake.  I was reading the reg field as the rm field.  The rm field is in fact 100b.  So there's no bug.
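
The mix-up between the two fields is easy to reproduce with a small decoder. This is a minimal sketch with made-up helper names (decode_modrm, uses_sib). The byte 0x04 is an illustrative value only, chosen so that reg = 000b and rm = 100b as described above; the thread does not quote the actual ModR/M byte, and mod = 00b is an assumption.

```python
def decode_modrm(byte):
    """Split a ModR/M byte into mod (bits 7-6), reg (bits 5-3), rm (bits 2-0)."""
    return {"mod": (byte >> 6) & 0x3, "reg": (byte >> 3) & 0x7, "rm": byte & 0x7}


def uses_sib(byte):
    """A SIB byte follows when mod is not 11b and rm is 100b."""
    fields = decode_modrm(byte)
    return fields["mod"] != 0x3 and fields["rm"] == 0x4


# Illustrative byte (assumption): mod=00b, reg=000b, rm=100b.
example = 0x04
# The same byte with reg and rm swapped, i.e. what the misreading implied.
swapped = 0x20

print(decode_modrm(example), uses_sib(example))
print(decode_modrm(swapped), uses_sib(swapped))
```

Reading the middle three bits as rm gives 000b for the illustrative byte, while the low three bits, which are the real rm field, are 100b and do select a SIB byte.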
Comment 4 Michael Rolle 2015-07-21 05:42:12 UTC
In the first case, I got the correct results when I assembled the .s file.  The file was gas/i386/x86-64-avx-scalar.s in the gas/testsuite directory.  objdump of the resulting .o file shows the correct bytes.

Earlier, I was looking at a directory of .o files that I made a while back by running the testsuite and saving the .o files.  I don't recall how I generated these.  However, the regression file gas/i386/x86-64-avx-scalar.d has the lines:

[       ]*[a-f0-9]+:    c4 e1 fd 7e e1          vmovq  %xmm4,%rcx
[       ]*[a-f0-9]+:    c4 e1 fd 6e e1          vmovq  %rcx,%xmm4
[       ]*[a-f0-9]+:    c4 e1 fd 7e e1          vmovq  %xmm4,%rcx
[       ]*[a-f0-9]+:    c4 e1 fd 6e e1          vmovq  %rcx,%xmm4
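
Expected-output lines like these can be checked mechanically. The following is a rough sketch, not part of the testsuite; the helper name flag_vmovq_with_vex_l and the regular expression are assumptions made for illustration. It only handles the 3-byte C4 form of the VEX prefix, and it treats any vmovq line whose third byte has the 0x04 bit set as suspect.

```python
import re

# Matches "<address>:  <hex bytes>  <mnemonic>" in either a .d pattern line
# or plain objdump output. This pattern is an assumption for the sketch.
LINE = re.compile(r":\s+((?:[0-9a-f]{2}\s)+)\s*(\S+)")


def flag_vmovq_with_vex_l(lines):
    """Return the vmovq lines whose 3-byte VEX prefix has VEX.L = 1."""
    flagged = []
    for line in lines:
        match = LINE.search(line)
        if not match:
            continue
        code = bytes.fromhex("".join(match.group(1).split()))
        mnemonic = match.group(2)
        if (mnemonic == "vmovq" and len(code) >= 3
                and code[0] == 0xC4 and code[2] & 0x04):
            flagged.append(line.strip())
    return flagged


# The four lines quoted above from x86-64-avx-scalar.d.
expected_lines = [
    r"[       ]*[a-f0-9]+:    c4 e1 fd 7e e1          vmovq  %xmm4,%rcx",
    r"[       ]*[a-f0-9]+:    c4 e1 fd 6e e1          vmovq  %rcx,%xmm4",
    r"[       ]*[a-f0-9]+:    c4 e1 fd 7e e1          vmovq  %xmm4,%rcx",
    r"[       ]*[a-f0-9]+:    c4 e1 fd 6e e1          vmovq  %rcx,%xmm4",
]
# The objdump line from comment 1, for comparison.
objdump_line = "   0:\tc4 e1 f9 7e e1       \tvmovq  %xmm4,%rcx"

for hit in flag_vmovq_with_vex_l(expected_lines + [objdump_line]):
    print("VEX.L set:", hit)
```

Run over the lines above, the sketch flags the four .d entries and leaves the objdump line from comment 1 alone.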

I must have gotten some older version of the testsuite that runs an older version of gas, which you can see does have the bug I reported.  If it helps you figure out which version it is, the files are all dated 2 July 2014, and I am using Cygwin as the platform.

When I run "as" directly, the version is:
GNU assembler version 2.24.51 (x86_64-pc-cygwin) using BFD version (GNU Binutils) 2.24.51.20140703

If you have the time, maybe you could figure out why the testsuite that I have is wrong and correct the problem.