This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: possible compiler optimization error

From: Brian Dessent <brian at dessent dot net>
To: cygwin at cygwin dot com
Date: Thu, 28 Jun 2007 13:36:11 -0700
Subject: Re: possible compiler optimization error
References: <BAY108-F1181298FD58A847ABB9088BE090@phx.gbl> <46840F0E.9EA8612B@dessent.net> <C6EEDB0EB45A56439F73B1D23E39694A35C856@USORL02P702.ww007.siemens.net>
Reply-to: cygwin at cygwin dot com

"Frederich, Eric P21322" wrote:

> You said that combining -march=i686 and -msse2 didn't make too much
> sense.

What I meant is that specifying -msse2 sets the bar a lot higher than
-march=i686 does: the resulting code won't run on a number of i686
machines, so you might as well use a more specific -march that
includes sse2 anyway.

> So without setting -march, what all should I be setting?
> On my laptop with CPU-Z I see MMX, SSE, and SSE2.
> On my Opteron Linux box I obviously see a lot more when I cat
> /proc/cpuinfo.

If you're compiling code for yourself then just use the appropriate arch
for each machine, -march=pentium-m and -march=opteron respectively.

If you're going to distribute binaries to others then I guess it gets a
little more complicated.  If you're comfortable requiring sse2 then I
suppose -march=i686 -msse2 is reasonable.  You might also test
-march=i686 -mtune=pentium4 -msse2.  What this means is choose the
instruction set of generic i686 plus sse2, but choose the scheduler for
p4.  Due to its huge pipeline the p4 is more sensitive to scheduling
than the other sse2-class machines like k8, so in theory this means a
small performance win on p4 machines without much (if any) cost on
k8/core2/whatever.  But I might be wrong here, so if performance is of
any concern you should test it.
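
As a rough way to see the -march / -mtune split described above, the
following sketch reports which CPU gcc is generating instructions for
and which one it is scheduling for.  It relies on gcc's predefined
macros (__i686__, __k8__, __pentium4__ and the __tune_*__ variants).
The function names arch_target and tune_target are made up for this
illustration, and only a few CPUs are checked, so anything else is
reported as "other".

```c
#include <stdio.h>

/* Instruction-set target, as selected by -march.
   Hypothetical helper; only a handful of CPUs are recognised. */
const char *arch_target(void)
{
#if defined(__pentium4__)
    return "pentium4";
#elif defined(__k8__)
    return "k8";
#elif defined(__i686__)
    return "i686";
#else
    return "other";
#endif
}

/* Scheduling target, as selected by -mtune (defaults to follow -march
   when -mtune is not given). */
const char *tune_target(void)
{
#if defined(__tune_pentium4__)
    return "pentium4";
#elif defined(__tune_k8__)
    return "k8";
#elif defined(__tune_i686__)
    return "i686";
#else
    return "other";
#endif
}

/* Print both, so the two settings can be compared side by side. */
void print_targets(void)
{
    printf("march: %s\n", arch_target());
    printf("mtune: %s\n", tune_target());
}
```

Built with -march=i686 -mtune=pentium4 -msse2, one would expect
arch_target() to give "i686" and tune_target() to give "pentium4".
Running gcc -dM -E on an empty input with the same flags lists every
predefined macro if you want to check what your gcc version defines.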

> If I just use what is common between them, -mmmx, -msse, and -msse2 I
> should be free of floating point errors and hopefully get some
> performance increase.  Should I be using -mmmx if I'm also using -msse
> and -msse2?

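Well, first of all, yes, any machine capable of sse2 will also include
sse and mmx, so it's redundant to specify all three.  But sse and mmx
aren't really relevant at all if you're using 'double' types: mmx is
integer-only and sse only handles single-precision floats, so sse2 is
the one that matters for doubles.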
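
To check the redundancy point on a given compiler, this sketch reports
which of the three instruction sets gcc has enabled, based on the
predefined macros __MMX__, __SSE__ and __SSE2__.  The function names
are hypothetical, chosen only for this example.

```c
#include <stdio.h>

/* Each helper returns 1 if the compiler is allowed to emit that
   instruction set for this build, 0 otherwise. */
int has_mmx(void)
{
#ifdef __MMX__
    return 1;
#else
    return 0;
#endif
}

int has_sse(void)
{
#ifdef __SSE__
    return 1;
#else
    return 0;
#endif
}

int has_sse2(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

/* Print a one-line summary of what the build flags enabled. */
void print_isa_summary(void)
{
    printf("mmx=%d sse=%d sse2=%d\n", has_mmx(), has_sse(), has_sse2());
}
```

Compiled with only -msse2, all three would be expected to report 1,
which is why spelling out -mmmx and -msse as well adds nothing.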

Brian

