Bug 6957 - i386 NOPs must be derived from march not mtune
Summary: i386 NOPs must be derived from march not mtune
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: gas (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: unassigned
URL:
Keywords:
: 14066 (view as bug list)
Depends on:
Blocks:
 
Reported: 2008-10-11 14:25 IST by Alan Swanson
Modified: 2012-05-16 02:42 IST (History)
5 users (show)

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Last reconfirmed: 2010-06-21 14:48:07


Attachments
binutils-cvs-fix-i386-nop.patch (1.03 KB, patch)
2008-10-11 14:28 IST, Alan Swanson
Details | Diff
binutils-cvs-fix2-i386-nop.patch (1.06 KB, patch)
2008-10-12 10:20 IST, Alan Swanson
Details | Diff
testcase.s (3.37 KB, text/plain)
2010-06-08 21:49 IST, Daniel Drake
Details
testcase2.s (588 bytes, text/plain)
2010-06-08 22:04 IST, Daniel Drake
Details
testcase3.s (53 bytes, text/plain)
2010-06-08 22:07 IST, Daniel Drake
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Alan Swanson 2008-10-11 14:25:04 IST
As recently discussed on the Linux kernel mailing list, in gas the semantics for
choosing the i386 NOP sequence are incorrect and will cause unexpected failures.
They should chosen depending on march where new instructions can be used and not
mtune where new instructions must not be used.

http://thread.gmane.org/gmane.linux.kernel/730834/focus=731161

There is also another related issue with i686 class CPUs. Some do not actually
support the NOPL instruction. Examples include Via C3, Via Eden, AMD Geode LX
(as used in OLPC), Transmeta Crusoe and it appears broken on Virtual PC.
Changing i686 to use f32_patt instead alt_long_patt allows binaries created with
-march=i686 to run on all i686 class CPUs with only a minor optimization loss
versus the gain from CMOV optimisation.

The attached patch corrects both these issues.
Comment 1 Alan Swanson 2008-10-11 14:28:27 IST
Created attachment 2995 [details]
binutils-cvs-fix-i386-nop.patch
Comment 2 H.J. Lu 2008-10-11 23:47:18 IST
Arch may be changed by the .arch directive. However, when we
generate NOPs, we lost the .arch information. We need to know
the arch for NOP, which may not be necessarily the same as the
one specified by -march=.
Comment 3 Alan Swanson 2008-10-12 09:37:12 IST
But currently the only arch flags checked in i386_align_code was cpui686 which
is where we should disable use of NOPL for mentioned reasons.
Comment 4 H.J. Lu 2008-10-12 09:49:29 IST
(In reply to comment #3)
> But currently the only arch flags checked in i386_align_code was cpui686 which
> is where we should disable use of NOPL for mentioned reasons.

If you want to run the binary on CPU which doesn't support all i686
instructions, you shouldn't pass -march=i686 to assembler.
Comment 5 H.J. Lu 2008-10-12 09:52:42 IST
By default, assembler assumes CPU takes all instructions supported
by assembler. You can use -march=XXX to limit instructions to only
those supported by XXX.
Comment 6 Alan Swanson 2008-10-12 10:20:07 IST
Created attachment 2997 [details]
binutils-cvs-fix2-i386-nop.patch

Add bitfield checking back to patch for .arch settings.

First point, do you not first agree that choosing instructions based on mtune
is seriously broken?

The other point here is about compatibility. Yes, these CPUs are missing this
one instruction but since most builds these days are i686 by default this will
break on these systems for users. It's only been fortunate that GCC strips out
mtune options to gas. The optimisation lost is minor.

I'm obviously an unknown person but the link given shows Linus Torvalds opinion
on this which may or not mean much to you.
Comment 7 H.J. Lu 2008-10-12 12:44:02 IST
(In reply to comment #6)
> Created an attachment (id=2997)
> binutils-cvs-fix2-i386-nop.patch
> 
> Add bitfield checking back to patch for .arch settings.
> 
> First point, do you not first agree that choosing instructions based on mtune
> is seriously broken?
> 
> The other point here is about compatibility. Yes, these CPUs are missing this
> one instruction but since most builds these days are i686 by default this will
> break on these systems for users. It's only been fortunate that GCC strips out
> mtune options to gas. The optimisation lost is minor.
> 
> I'm obviously an unknown person but the link given shows Linus Torvalds opinion
> on this which may or not mean much to you. 

For assembler, PROCESSOR_UNKNOWN means CPU will take any instructions,
including those new NOPs. If you want to limit NOPs to certain ISA,
XXX, you need to use -match=XXX/-mtune=XXX. I checked in a patch

http://sourceware.org/ml/binutils/2008-10/msg00124.html

Now you can use the .arch directive to limit NOPs to XXX.
Comment 8 H.J. Lu 2008-10-12 12:45:49 IST
(In reply to comment #6)
> Created an attachment (id=2997)
> binutils-cvs-fix2-i386-nop.patch
> 
> Add bitfield checking back to patch for .arch settings.
> 
> First point, do you not first agree that choosing instructions based on mtune
> is seriously broken?
> 

I agree. If it is true, assembler is broken. If you can show me
a small testcase, I will fix it.

Comment 9 Alan Swanson 2008-10-12 13:25:52 IST
Sorry but how can PROCESSOR_UNKNOWN mean it can run any instructions? It's
unknown! Therefore we must use lowest common denominator to expect it to work on
all CPUs.

Tune means scheduling.
Arch means instructions.

Test case. Pass -march=i486 and -mtune=i686 and it'll use alt_long_patt which
won't work on i486. Fail.
Comment 10 H.J. Lu 2008-10-12 21:30:40 IST
(In reply to comment #9)
> Sorry but how can PROCESSOR_UNKNOWN mean it can run any instructions? It's
> unknown! Therefore we must use lowest common denominator to expect it to work on
> all CPUs.

If we do that, x86 assembler won't take any SSE instructions by default.
It will break gcc.

> Tune means scheduling.
> Arch means instructions.

That is true.

> 
> Test case. Pass -march=i486 and -mtune=i686 and it'll use alt_long_patt which
> won't work on i486. Fail.

Where is the testcase?
Comment 11 H.J. Lu 2008-10-12 21:35:10 IST
This testcase works for me:

[hjl@gnu-6 tmp]$ cat nops.s
        .text
nop:
        movsbl %al,%esi
        .p2align 4
[hjl@gnu-6 tmp]$ as --32 -o nops.o nops.s -march=i486 -mtune=i686
[hjl@gnu-6 tmp]$ objdump -dw nops.o

nops.o:     file format elf32-i386


Disassembly of section .text:

00000000 <nop>:
   0:   0f be f0                movsbl %al,%esi
   3:   8d b6 00 00 00 00       lea    0x0(%esi),%esi
   9:   8d bc 27 00 00 00 00    lea    0x0(%edi,%eiz,1),%edi
[hjl@gnu-6 tmp]$ as --32 -o nops.o nops.s -march=i686 -mtune=i686
[hjl@gnu-6 tmp]$ objdump -dw nops.o

nops.o:     file format elf32-i386


Disassembly of section .text:

00000000 <nop>:
   0:   0f be f0                movsbl %al,%esi
   3:   66 66 66 66 2e 0f 1f 84 00 00 00 00 00  nopw   %cs:0x0(%eax,%eax,1)
[hjl@gnu-6 tmp]$
Comment 12 Daniel Drake 2010-06-08 21:49:11 IST
Created attachment 4835 [details]
testcase.s
Comment 13 Daniel Drake 2010-06-08 21:53:18 IST
(In reply to comment #8)
> I agree. If it is true, assembler is broken. If you can show me
> a small testcase, I will fix it.

I can confirm the bug and I'd like to take you up on that offer, but I may have
fallen short of the *small* test case :)

I attached testcase.s which is some assembly caught from a gcc invokation inside
glibc's compile process. It's 1300 lines (sorry!) but does confirm the usage of
the nopl instruction when -mtune=i686 is passed :

$ as -Qy -o foo testcase.s
$ objdump -d foo | grep nopl
$ as -Qy -mtune=i686 -o foo testcase.s
$ objdump -d foo | grep nopl
  3d:	0f 1f 00             	nopl   (%eax)

Reproduced on both:
GNU assembler version 2.19.51.0.14-37.fc12 20090722
GNU assembler version 2.20.51.0.2-17.fc14 20091009

Someone more savvy with assembly code could probably cut down my test case into
something more trivial with a small amount of effort - I'm a bit out of my
comfort zone.
Comment 14 H.J. Lu 2010-06-08 21:58:54 IST
(In reply to comment #12)
> Created an attachment (id=4835)
> testcase.s
> 

Works for me:

[hjl@gnu-6 tmp]$ as -Qy -mtune=i686 -o foo testcase.s -march=i486 --32
[hjl@gnu-6 tmp]$ objdump -d foo | grep nopl
[hjl@gnu-6 tmp]$ 

You need -march=i486 to disable instructions.
Comment 15 Daniel Drake 2010-06-08 22:04:47 IST
Created attachment 4836 [details]
testcase2.s

Duh. The previous testcase.s was generated using "gcc -g" so its naturally
huge. Taking off the -g results in a much smaller (95 line) test case which
shows the issue in an identical manner.
Comment 16 Daniel Drake 2010-06-08 22:07:03 IST
Created attachment 4837 [details]
testcase3.s

One more. By deleting (more or less) random lines from testcase2.s I came up
with a 9-line test case which produces 2 nopl instructions when assembled with
-mtune=i686.

$ as -mtune=i686 -o foo testcase3.s
$ objdump -d foo
Disassembly of section .text:

00000000 <.text>:
   0:	55			push   %ebp
   1:	5d			pop    %ebp
   2:	c3			ret    
   3:	0f 1f 44 00 00		nopl   0x0(%eax,%eax,1)
   8:	c3			ret    
   9:	0f 1f 80 00 00 00 00	nopl   0x0(%eax)
Comment 17 Daniel Drake 2010-06-08 22:17:06 IST
Oops, sorry for the test case spam. I did not see your comment.

I understand now.

"as -mtune=i686" basically works out to be "as -march=i686 -mtune=i686"

Hence generating i686-specific instructions is acceptable/expected.

My confusion stemmed from the fact that this is a minor behavioural difference
from the equivalently named parameters to gcc. (maybe this is why gcc does not
pass those parameters down to the assembler by default)
Comment 18 Andreas Schwab 2010-06-21 11:08:19 IST
That does not work for me.

$ cat nopl.s
 cmove %eax,%edx
 .align 8
$ as --32 -mtune=i686 nopl.s 
$ objdump -dr | grep nopl
   3:   0f 1f 44 00 00          nopl   0x0(%eax,%eax,1)
$ as --32 -mtune=i686 -march=i486 nopl.s 
nopl.s: Assembler messages:
nopl.s:1: Error: `cmove' is not supported on `i486'

There should be a way to set -mtune while letting -march alone.
Comment 19 H.J. Lu 2010-06-21 14:28:42 IST
(In reply to comment #18)
> That does not work for me.
> 
> $ cat nopl.s
>  cmove %eax,%edx
>  .align 8
> $ as --32 -mtune=i686 nopl.s 
> $ objdump -dr | grep nopl
>    3:   0f 1f 44 00 00          nopl   0x0(%eax,%eax,1)
> $ as --32 -mtune=i686 -march=i486 nopl.s 
> nopl.s: Assembler messages:
> nopl.s:1: Error: `cmove' is not supported on `i486'
> 
> There should be a way to set -mtune while letting -march alone.

Please state exactly what you have in mind. From

$ cat nopl.s
 cmove %eax,%edx
 .align 8

you want i686 instructions, but you don't want to
tune for i686. Is that correct?
Comment 20 Andreas Schwab 2010-06-21 14:48:07 IST
I want that the assembler does not sneak in nopl.
Comment 21 H.J. Lu 2010-06-21 15:53:42 IST
(In reply to comment #20)
> I want that the assembler does not sneak in nopl.

Then you should use -mtune=i586 or an option not to
tune for i686.
Comment 22 Andreas Schwab 2010-06-21 16:01:44 IST
But the docs say that -mtune only uses non-generic insns when -march is also
given.
Comment 23 H.J. Lu 2010-06-21 16:03:54 IST
-mtune uses all available instructions. By default, all instructions
are enabled.
Comment 24 Andreas Schwab 2010-06-21 16:37:50 IST
nopl is not always available on i686.
Comment 25 H.J. Lu 2010-06-23 14:36:50 IST
From Intel IA32/Intel64 SDM:

The multi-byte form of NOP is available on processors with model encoding:
• CPUID.01H.EAX[Bytes 11:8] = 0110B or 1111B

Intel i686 has 111B.
Comment 26 Andreas Schwab 2010-07-01 15:27:16 IST
What does that say about other vender's models?
Comment 28 Andreas Schwab 2011-02-08 17:02:41 IST
Still broken.

$ cat nopl.s
 nop
 .align 8
$ gas/as-new --32 -mtune=i686 nopl.s
$ objdump -d

a.out:     file format elf32-i386


Disassembly of section .text:

00000000 <.text>:
   0:   90                      nop
   1:   0f 1f 80 00 00 00 00    nopl   0x0(%eax)
Comment 29 Jan Kratochvil 2011-02-08 17:12:26 IST
Fix:
[rfc] nopl should not be output on -mtune=i686
http://sourceware.org/ml/binutils/2011-02/msg00070.html
Comment 30 cvs-commit@gcc.gnu.org 2011-02-08 20:21:30 IST
CVSROOT:	/cvs/src
Module name:	src
Changes by:	hjl@sourceware.org	2011-02-08 20:21:26

Modified files:
	gas            : ChangeLog 
	gas/config     : tc-i386.c 
	gas/testsuite  : ChangeLog 
	gas/testsuite/gas/i386: nops-1-i686.d nops-3-i686.d 
	                        nops-4-i686.d 

Log message:
	Use f32_patt in i386_align_code  when tuning for i686.
	
	gas/
	
	2011-02-08  H.J. Lu  <hongjiu.lu@intel.com>
	
	PR gas/6957
	* config/tc-i386.c (i386_align_code): Use f32_patt when tuning
	for i686.
	
	gas/testsuite/
	
	2011-02-08  H.J. Lu  <hongjiu.lu@intel.com>
	
	PR gas/6957
	* gas/i386/nops-1-i686.d: Updated.
	* gas/i386/nops-3-i686.d: Likewise.
	* gas/i386/nops-4-i686.d: Likewise.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/ChangeLog.diff?cvsroot=src&r1=1.4399&r2=1.4400
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/config/tc-i386.c.diff?cvsroot=src&r1=1.462&r2=1.463
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/testsuite/ChangeLog.diff?cvsroot=src&r1=1.1846&r2=1.1847
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/testsuite/gas/i386/nops-1-i686.d.diff?cvsroot=src&r1=1.4&r2=1.5
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/testsuite/gas/i386/nops-3-i686.d.diff?cvsroot=src&r1=1.3&r2=1.4
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/testsuite/gas/i386/nops-4-i686.d.diff?cvsroot=src&r1=1.3&r2=1.4
Comment 31 H.J. Lu 2011-02-08 20:24:15 IST
Fixed.
Comment 32 cvs-commit@gcc.gnu.org 2011-03-23 16:00:40 IST
CVSROOT:	/cvs/src
Module name:	src
Branch: 	binutils-2_21-branch
Changes by:	gingold@sourceware.org	2011-03-23 16:00:21

Modified files:
	gas            : ChangeLog 
	gas/config     : tc-i386.c 
	gas/testsuite  : ChangeLog 
	gas/testsuite/gas/i386: nops-1-i686.d nops-3-i686.d 
	                        nops-4-i686.d 

Log message:
	Backport of:
	Changes by:	hjl@sourceware.org	2011-02-08 20:21:26
	
	Modified files:
	gas            : ChangeLog
	gas/config     : tc-i386.c
	gas/testsuite  : ChangeLog
	gas/testsuite/gas/i386: nops-1-i686.d nops-3-i686.d
	nops-4-i686.d
	
	Log message:
	Use f32_patt in i386_align_code  when tuning for i686.
	
	gas/
	
	2011-02-08  H.J. Lu  <hongjiu.lu@intel.com>
	
	PR gas/6957
	* config/tc-i386.c (i386_align_code): Use f32_patt when tuning
	for i686.
	
	gas/testsuite/
	
	2011-02-08  H.J. Lu  <hongjiu.lu@intel.com>
	
	PR gas/6957
	* gas/i386/nops-1-i686.d: Updated.
	* gas/i386/nops-3-i686.d: Likewise.
	* gas/i386/nops-4-i686.d: Likewise.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/ChangeLog.diff?cvsroot=src&only_with_tag=binutils-2_21-branch&r1=1.4320.2.22&r2=1.4320.2.23
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/config/tc-i386.c.diff?cvsroot=src&only_with_tag=binutils-2_21-branch&r1=1.452&r2=1.452.2.1
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/testsuite/ChangeLog.diff?cvsroot=src&only_with_tag=binutils-2_21-branch&r1=1.1802.2.7&r2=1.1802.2.8
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/testsuite/gas/i386/nops-1-i686.d.diff?cvsroot=src&only_with_tag=binutils-2_21-branch&r1=1.4&r2=1.4.2.1
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/testsuite/gas/i386/nops-3-i686.d.diff?cvsroot=src&only_with_tag=binutils-2_21-branch&r1=1.3&r2=1.3.2.1
http://sourceware.org/cgi-bin/cvsweb.cgi/src/gas/testsuite/gas/i386/nops-4-i686.d.diff?cvsroot=src&only_with_tag=binutils-2_21-branch&r1=1.3&r2=1.3.2.1
Comment 33 H.J. Lu 2012-05-16 02:42:54 IST
*** Bug 14066 has been marked as a duplicate of this bug. ***