This is the mail archive of the
mailing list for the Cygwin project.
Re: possible compiler optimization error
"Frederich, Eric P21322" wrote:
> You said that combining -march=i686 and -msse2 didn't make too much
What I meant is that by specifying -msse2 you are setting the bar a lot
higher than -march=i686 does, generating code that won't run on a
number of i686 machines, so you might as well use a more specific -march
that includes sse2 anyway.
> So without setting -march, what all should I be setting?
> On my laptop with CPU-Z I see MMX, SSE, and SSE2.
> On my Opteron Linux box I obviously see a lot more when I cat
If you're compiling code for yourself then just use the appropriate arch
for each machine, -march=pentium-m and -march=opteron respectively.
If you're going to distribute binaries to others then I guess it gets a
little more complicated. If you're comfortable requiring sse2 then I
suppose -march=i686 -msse2 is reasonable. You might also test
-march=i686 -mtune=pentium4 -msse2. What this means is choose the
instruction set of generic i686 plus sse2, but choose the scheduler for
p4. Due to its huge pipeline the p4 is more sensitive to scheduling
than the other sse2-class machines like k8, so in theory this means a
small performance win on p4 machines without much (if any) cost on
k8/core2/whatever. But I might be wrong here, so if performance is of
any concern you should test it.
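
When comparing flag combinations like these, it can help to confirm what
the compiler actually thinks it was asked for. The following is only a
rough sketch: target_summary() is a made-up helper name, and the macros
it checks (__i686__, __tune_pentium4__, __SSE2__ and so on) are the ones
I believe gcc predefines for these options. You can verify them on your
own gcc with "gcc <flags> -dM -E - < /dev/null".

```c
#include <stdio.h>
#include <string.h>

/* Append s to buf without overflowing it (len is the size of buf). */
static void append(char *buf, size_t len, const char *s)
{
    size_t used = strlen(buf);
    if (used + 1 < len)
        snprintf(buf + used, len - used, "%s", s);
}

/* Hypothetical helper: summarize the target macros the compiler
   predefined for this translation unit. Which macros appear depends
   on the -march/-mtune/-msse2 flags used, e.g.
       gcc -march=i686 -msse2 ...
       gcc -march=i686 -mtune=pentium4 -msse2 ...
   The macro names are assumed from gcc's x86 back end. */
const char *target_summary(char *buf, size_t len)
{
    buf[0] = '\0';
#if defined(__i686__)
    append(buf, len, "arch=i686 ");
#endif
#if defined(__tune_pentium4__)
    append(buf, len, "tune=pentium4 ");
#endif
#if defined(__MMX__)
    append(buf, len, "mmx ");
#endif
#if defined(__SSE__)
    append(buf, len, "sse ");
#endif
#if defined(__SSE2__)
    append(buf, len, "sse2 ");
#endif
    if (buf[0] == '\0')
        append(buf, len, "no-x86-macros-detected");
    return buf;
}
```

Calling target_summary() with a buffer of 128 bytes or so and printing
the result from each build shows whether the flags took effect. It says
nothing about speed, so the timing comparison still has to be done
separately.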
> If I just use what is common between them, -mmmx, -msse, and -msse2 I
> should be free of floating point errors and hopefully get some
> performance increase. Should I be using -mmmx if I'm also using -msse
> and -msse2?
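Well, first of all, yes, any machine capable of sse2 will also include
sse and mmx, so it's redundant to specify all three. But sse and mmx
aren't really relevant at all if you're using 'double' types: mmx only
handles integers and sse only single-precision floats, and it is sse2
that adds double-precision support.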