Bug 4834 - Incorrect bytemode in x86 disassembler
Summary: Incorrect bytemode in x86 disassembler
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: binutils
Version: 2.18
Importance: P2 normal
Target Milestone: ---
Assignee: H.J. Lu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-07-24 01:03 UTC by Takashi Hattori
Modified: 2007-07-30 07:07 UTC
CC List: 2 users

See Also:
Host: i686-apple-darwin
Target:
Build:
Last reconfirmed:


Attachments
A patch (969 bytes, patch)
2007-07-28 17:57 UTC, H.J. Lu
A patch for SSE4 (473 bytes, patch)
2007-07-29 18:51 UTC, H.J. Lu

Description Takashi Hattori 2007-07-24 01:03:37 UTC
The disassembler prints an incorrect bytemode (memory operand size) for SSE insns in Intel syntax.

% objdump -v
GNU objdump (GNU Binutils) 2.17.50.20070723
Copyright 2007 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.

% cat sse.s
.byte  0xF2, 0x0F, 0x5E, 0x00
.byte  0xF3, 0x0F, 0x5E, 0x00
.byte  0xF2, 0x0F, 0x5F, 0x00
.byte  0xF3, 0x0F, 0x5F, 0x00
.byte  0xF2, 0x0F, 0x5D, 0x00
.byte  0xF3, 0x0F, 0x5D, 0x00
.byte  0xF3, 0x0F, 0x10, 0x00
.byte  0xF3, 0x0F, 0x11, 0x00
.byte  0xF2, 0x0F, 0x10, 0x00
.byte  0xF2, 0x0F, 0x11, 0x00
.byte  0xF2, 0x0F, 0x59, 0x00
.byte  0xF3, 0x0F, 0x59, 0x00
.byte  0xF3, 0x0F, 0x53, 0x00
.byte  0xF3, 0x0F, 0x52, 0x00
.byte  0xF2, 0x0F, 0x51, 0x00
.byte  0xF3, 0x0F, 0x51, 0x00
.byte  0xF2, 0x0F, 0x5C, 0x00
.byte  0xF3, 0x0F, 0x5C, 0x00
.byte  0xF3, 0x0F, 0x5A, 0x00
.byte  0xF2, 0x0F, 0x5A, 0x00

% as sse.s -o sse.o
% objdump -dw -Mintel --section="LC_SEGMENT.__TEXT.__text" sse.o

sse.o:     file format mach-o-le

Disassembly of section LC_SEGMENT.__TEXT.__text:

0000000000000000 <LC_SEGMENT.__TEXT.__text>:
   0:   f2 0f 5e 00             divsd  xmm0,XMMWORD PTR [eax]
   4:   f3 0f 5e 00             divss  xmm0,XMMWORD PTR [eax]
   8:   f2 0f 5f 00             maxsd  xmm0,XMMWORD PTR [eax]
   c:   f3 0f 5f 00             maxss  xmm0,XMMWORD PTR [eax]
  10:   f2 0f 5d 00             minsd  xmm0,XMMWORD PTR [eax]
  14:   f3 0f 5d 00             minss  xmm0,XMMWORD PTR [eax]
  18:   f3 0f 10 00             movss  xmm0,XMMWORD PTR [eax]
  1c:   f3 0f 11 00             movss  XMMWORD PTR [eax],xmm0
  20:   f2 0f 10 00             movsd  xmm0,XMMWORD PTR [eax]
  24:   f2 0f 11 00             movsd  XMMWORD PTR [eax],xmm0
  28:   f2 0f 59 00             mulsd  xmm0,XMMWORD PTR [eax]
  2c:   f3 0f 59 00             mulss  xmm0,XMMWORD PTR [eax]
  30:   f3 0f 53 00             rcpss  xmm0,XMMWORD PTR [eax]
  34:   f3 0f 52 00             rsqrtss xmm0,XMMWORD PTR [eax]
  38:   f2 0f 51 00             sqrtsd xmm0,XMMWORD PTR [eax]
  3c:   f3 0f 51 00             sqrtss xmm0,XMMWORD PTR [eax]
  40:   f2 0f 5c 00             subsd  xmm0,XMMWORD PTR [eax]
  44:   f3 0f 5c 00             subss  xmm0,XMMWORD PTR [eax]
  48:   f3 0f 5a 00             cvtss2sd xmm0,XMMWORD PTR [eax]
  4c:   f2 0f 5a 00             cvtsd2ss xmm0,XMMWORD PTR [eax]

All of the memory operands above are printed as XMMWORD, but that is wrong.
The Intel(R) 64 and IA-32 Architectures Software Developer's Manual, Volumes 2A & 2B,
lists them as follows.

Opcode                  Instruction
F2 0F 5E /r             DIVSD  xmm1, xmm2/m64
F3 0F 5E /r             DIVSS  xmm1, xmm2/m32
F2 0F 5F /r             MAXSD  xmm1, xmm2/m64
F3 0F 5F /r             MAXSS  xmm1, xmm2/m32
F2 0F 5D /r             MINSD  xmm1, xmm2/m64
F3 0F 5D /r             MINSS  xmm1, xmm2/m32
F3 0F 10 /r             MOVSS  xmm1, xmm2/m32
F3 0F 11 /r             MOVSS  xmm2/m32, xmm
F2 0F 10 /r             MOVSD  xmm1, xmm2/m64
F2 0F 11 /r             MOVSD  xmm2/m64, xmm1
F2 0F 59 /r             MULSD  xmm1, xmm2/m64
F3 0F 59 /r             MULSS  xmm1, xmm2/m32
F3 0F 53 /r             RCPSS  xmm1, xmm2/m32
F3 0F 52 /r             RSQRTSS xmm1, xmm2/m32
F2 0F 51 /r             SQRTSD xmm1, xmm2/m64
F3 0F 51 /r             SQRTSS xmm1, xmm2/m32
F2 0F 5C /r             SUBSD  xmm1, xmm2/m64
F3 0F 5C /r             SUBSS  xmm1, xmm2/m32
F3 0F 5A /r             CVTSS2SD xmm1, xmm2/m32
F2 0F 5A /r             CVTSD2SS xmm1, xmm2/m64
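
For the rows quoted above, the expected size follows the mandatory prefix: every F2 form takes m64 and every F3 form takes m32. A minimal Python sketch of that rule follows; the helper name expected_ptr is hypothetical, and the rule is only claimed for the opcodes in this table.

```python
# Expected Intel-syntax size keyword, keyed by the mandatory prefix byte.
# Assumption: only valid for the scalar SSE/SSE2 opcodes quoted in the table
# above (F2 forms take m64, F3 forms take m32).
PREFIX_TO_PTR = {0xF2: "QWORD PTR", 0xF3: "DWORD PTR"}


def expected_ptr(encoding: bytes) -> str:
    """Return the size keyword the disassembler should print for `encoding`."""
    return PREFIX_TO_PTR[encoding[0]]
```

For example, expected_ptr(bytes([0xF3, 0x0F, 0x5E, 0x00])) gives "DWORD PTR" for divss, where the listing above shows XMMWORD PTR.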
Comment 1 Takashi Hattori 2007-07-24 04:37:41 UTC
Some more bytemode issues.

% cat cvt.s
.text
.byte  0xF2, 0x0F, 0xC2, 0x00, 0x00
.byte  0xF3, 0x0F, 0xC2, 0x00, 0x00
.byte  0x66, 0x0F, 0x2A, 0x00
.byte  0x0F, 0x2D, 0x00
.byte  0xF2, 0x0F, 0x2D, 0x00
.byte  0x0F, 0x2C, 0x00
.byte  0xF2, 0x0F, 0x2C, 0x00
.byte  0xF3, 0x0F, 0x2C, 0x00

% as cvt.s -o cvt.o
% objdump -dw -Mintel --section="LC_SEGMENT.__TEXT.__text" cvt.o
cvt.o:     file format mach-o-le

Disassembly of section LC_SEGMENT.__TEXT.__text:

0000000000000000 <LC_SEGMENT.__TEXT.__text>:
   0:   f2 0f c2 00 00          cmpeqsd xmm0,XMMWORD PTR [eax]
   5:   f3 0f c2 00 00          cmpeqss xmm0,XMMWORD PTR [eax]
   a:   66 0f 2a 00             cvtpi2pd xmm0,XMMWORD PTR [eax]
   e:   0f 2d 00                cvtps2pi mm0,XMMWORD PTR [eax]
  11:   f2 0f 2d 00             cvtsd2si eax,XMMWORD PTR [eax]
  15:   0f 2c 00                cvttps2pi mm0,XMMWORD PTR [eax]
  18:   f2 0f 2c 00             cvttsd2si eax,XMMWORD PTR [eax]
  1c:   f3 0f 2c 00             cvttss2si eax,XMMWORD PTR [eax]

Intel manual says:

Opcode                  Instruction
F2 0F C2 /r ib          CMPSD xmm1, xmm2/m64, imm8
F3 0F C2 /r ib          CMPSS xmm1, xmm2/m32, imm8
66 0F 2A /r             CVTPI2PD xmm, mm/m64
0F 2D /r                CVTPS2PI mm, xmm/m64
F2 0F 2D /r             CVTSD2SI r32, xmm/m64
0F 2C /r                CVTTPS2PI mm, xmm/m64
F2 0F 2C /r             CVTTSD2SI r32, xmm/m64
F3 0F 2C /r             CVTTSS2SI r32, xmm/m32
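
A comparison like the one above can be automated. The following Python sketch checks an objdump -Mintel listing against the operand sizes in the manual rows quoted here; the function name mismatches and the line pattern are assumptions based on the listing format shown in this report, not part of binutils.

```python
import re

# Memory operand width in bits, taken from the manual rows quoted above.
MANUAL_BITS = {
    "cmpeqsd": 64, "cmpeqss": 32, "cvtpi2pd": 64, "cvtps2pi": 64,
    "cvtsd2si": 64, "cvttps2pi": 64, "cvttsd2si": 64, "cvttss2si": 32,
}
# Width in bits of the Intel-syntax size keywords relevant here.
PTR_BITS = {"WORD": 16, "DWORD": 32, "QWORD": 64, "XMMWORD": 128}

# Assumed line layout: "<offset>: <hex bytes> <mnemonic> <operands>".
LINE_RE = re.compile(r"^\s*[0-9a-f]+:\s+(?:[0-9a-f]{2}\s)+\s*(\S+)\s+(.*)$")


def mismatches(listing):
    """Return (mnemonic, printed_bits, manual_bits) for each wrong size."""
    bad = []
    for line in listing.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        mnemonic, operands = m.groups()
        ptr = re.search(r"\b(\w+) PTR\b", operands)
        if mnemonic in MANUAL_BITS and ptr and ptr.group(1) in PTR_BITS:
            printed = PTR_BITS[ptr.group(1)]
            if printed != MANUAL_BITS[mnemonic]:
                bad.append((mnemonic, printed, MANUAL_BITS[mnemonic]))
    return bad
```

Feeding it the cvt.o listing above would report every line, since each one prints XMMWORD where the manual gives m32 or m64.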
Comment 2 H.J. Lu 2007-07-28 17:57:07 UTC
Created attachment 1936
A patch

Could you please try this patch?
Comment 3 Takashi Hattori 2007-07-28 23:21:18 UTC
(In reply to comment #2)
> Created an attachment (id=1936)
> A patch
> 
> Could you please try this patch?

I applied your patch and tested it. All the cases I reported are now correct.

However, I found some more insns with the wrong mode.
It seems that EX was replaced with EXx.
EX depended on the prefix to determine the bytemode, whereas EXx forces XMMWORD.
I checked the insns that take EXx.
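
The difference between the two behaviours described here can be illustrated with a toy model in Python. The function names and the exact prefix mapping are assumptions made for illustration only; this is not the actual binutils code.

```python
# Toy model only: names and mapping are illustrative assumptions,
# not the real disassembler tables.
def ex_bytemode(prefix):
    """EX-like behaviour: the size depends on the mandatory prefix."""
    return {0xF2: "QWORD", 0xF3: "DWORD"}.get(prefix, "XMMWORD")


def exx_bytemode(prefix):
    """EXx-like behaviour: the prefix is ignored and XMMWORD is forced."""
    return "XMMWORD"
```

With an F3 prefix the first function returns DWORD while the second returns XMMWORD, which is the kind of discrepancy seen in the listings.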

% cat sse4.s
.text
.byte  0x66, 0x0f, 0x38, 0x20, 0x00
.byte  0x66, 0x0f, 0x38, 0x21, 0x00
.byte  0x66, 0x0f, 0x38, 0x22, 0x00
.byte  0x66, 0x0f, 0x38, 0x23, 0x00
.byte  0x66, 0x0f, 0x38, 0x24, 0x00
.byte  0x66, 0x0f, 0x38, 0x25, 0x00
.byte  0x66, 0x0f, 0x38, 0x30, 0x00
.byte  0x66, 0x0f, 0x38, 0x31, 0x00
.byte  0x66, 0x0f, 0x38, 0x32, 0x00
.byte  0x66, 0x0f, 0x38, 0x33, 0x00
.byte  0x66, 0x0f, 0x38, 0x34, 0x00
.byte  0x66, 0x0f, 0x38, 0x35, 0x00
.byte  0x66, 0x0F, 0x3A, 0x21, 0x00, 0x00


% as sse4.s -o sse4.o
% objdump -dw -Mintel --section="LC_SEGMENT.__TEXT.__text" sse4.o

sse4.o:     file format mach-o-le

Disassembly of section LC_SEGMENT.__TEXT.__text:

0000000000000000 <LC_SEGMENT.__TEXT.__text>:
   0:   66 0f 38 20 00          pmovsxbw xmm0,XMMWORD PTR [eax]
   5:   66 0f 38 21 00          pmovsxbd xmm0,XMMWORD PTR [eax]
   a:   66 0f 38 22 00          pmovsxbq xmm0,XMMWORD PTR [eax]
   f:   66 0f 38 23 00          pmovsxwd xmm0,XMMWORD PTR [eax]
  14:   66 0f 38 24 00          pmovsxwq xmm0,XMMWORD PTR [eax]
  19:   66 0f 38 25 00          pmovsxdq xmm0,XMMWORD PTR [eax]
  1e:   66 0f 38 30 00          pmovzxbw xmm0,XMMWORD PTR [eax]
  23:   66 0f 38 31 00          pmovzxbd xmm0,XMMWORD PTR [eax]
  28:   66 0f 38 32 00          pmovzxbq xmm0,XMMWORD PTR [eax]
  2d:   66 0f 38 33 00          pmovzxwd xmm0,XMMWORD PTR [eax]
  32:   66 0f 38 34 00          pmovzxwq xmm0,XMMWORD PTR [eax]
  37:   66 0f 38 35 00          pmovzxdq xmm0,XMMWORD PTR [eax]
  3c:   66 0f 3a 21 00 00       insertps xmm0,XMMWORD PTR [eax],0x0


The Intel SSE4 Programming Reference says:

Opcode                  Instruction
66 0f 38 20 /r          PMOVSXBW xmm1, xmm2/m64
66 0f 38 21 /r          PMOVSXBD xmm1, xmm2/m32
66 0f 38 22 /r          PMOVSXBQ xmm1, xmm2/m16
66 0f 38 23 /r          PMOVSXWD xmm1, xmm2/m64
66 0f 38 24 /r          PMOVSXWQ xmm1, xmm2/m32
66 0f 38 25 /r          PMOVSXDQ xmm1, xmm2/m64
66 0f 38 30 /r          PMOVZXBW xmm1, xmm2/m64
66 0f 38 31 /r          PMOVZXBD xmm1, xmm2/m32
66 0f 38 32 /r          PMOVZXBQ xmm1, xmm2/m16
66 0f 38 33 /r          PMOVZXWD xmm1, xmm2/m64
66 0f 38 34 /r          PMOVZXWQ xmm1, xmm2/m32
66 0f 38 35 /r          PMOVZXDQ xmm1, xmm2/m64
66 0F 3A 21 /r ib       INSERTPS xmm1, xmm2/m32, imm8
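
The PMOVSX/PMOVZX sizes in this table follow from simple arithmetic: the 128-bit destination holds 128 / dst_bits elements, and each one is read from a src_bits element in memory. A short Python sketch of that calculation follows; the helper name pmov_mem_bits is hypothetical, and INSERTPS (m32) is not covered by it.

```python
# Element widths in bits for the suffix letters used by pmovsx*/pmovzx*.
ELEM_BITS = {"b": 8, "w": 16, "d": 32, "q": 64}


def pmov_mem_bits(mnemonic):
    """Memory operand width in bits for a pmovsx??/pmovzx?? mnemonic."""
    src, dst = mnemonic[-2], mnemonic[-1]
    lanes = 128 // ELEM_BITS[dst]   # destination elements in an XMM register
    return lanes * ELEM_BITS[src]   # bits actually read from memory
```

For pmovsxbq this gives 2 lanes of 8 bits, i.e. 16 bits, matching the m16 in the table rather than the XMMWORD in the listing.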
Comment 4 H.J. Lu 2007-07-29 18:51:51 UTC
Created attachment 1937
A patch for SSE4

Can you try this patch?
Comment 6 Takashi Hattori 2007-07-30 07:07:31 UTC
(In reply to comment #4)
> Created an attachment (id=1937)
> A patch for SSE4
> 
> Can you try this patch?

It is already checked in, but I tried it and confirmed that everything is OK.
Thanks.