Are you optimising properly?

Eric Nelson Eric.Nelson@ticketmaster.com
Sun Jan 14 08:35:00 GMT 2001


This probably has to do with cache hits: code optimised for space is
smaller, so more of it stays in the instruction cache. Modern processors
are sufficiently fast that, in many algorithms, cache misses dictate
performance more than clocks per instruction.


> -----Original Message-----
> From:	Robert de Bath [SMTP:robert$@home-box.demon.co.uk]
> Sent:	Sunday, January 14, 2001 4:47 AM
> To:	crossgcc@sources.redhat.com
> Subject:	Are you optimising properly?
> 
> I had a thought today ... These modern processors, Athlons, PIIIs, even
> the now venerable K6, try very hard to optimise the x86 instructions they
> collect into jobs for all their internal units.  It seemed to me that
> compiler optimisation wouldn't make any difference 'cause the processor
> does so much itself.
> 
> But it does ... in an unexpected way.
> 
> I dusted off a copy of the old dhrystone 1.1 benchmark and compiled it up
> ...
> The numbers were all in the supercomputer range of course :-)
> 
> Riddle me this ... why does it run about 10% faster when I optimise for
> space than it does when I go all out for speed?
> 
> $ gcc -v
> Reading specs from /usr/lib/gcc-lib/i386-linux/2.95.2/specs
> gcc version 2.95.2 20000220 (Debian GNU/Linux)
> 
> PIII - 750
> 
> $ cc dhrystone.c ; ./a.out
> This machine benchmarks at 633218 dhrystones/second
> $ cc -O2 dhrystone.c ; ./a.out
> This machine benchmarks at 866454 dhrystones/second
> $ cc -O6 dhrystone.c ; ./a.out
> This machine benchmarks at 859864 dhrystones/second
> $ cc -Os dhrystone.c ; ./a.out
> This machine benchmarks at 982045 dhrystones/second
> $
> 
> K6-II 450
> $ cc dhrystone.c ; ./a.out
> This machine benchmarks at 332348 dhrystones/second
> $ cc -O2 dhrystone.c ; ./a.out
> This machine benchmarks at 487460 dhrystones/second
> $ cc -O6 dhrystone.c ; ./a.out
> This machine benchmarks at 496377 dhrystones/second
> $ cc -Os dhrystone.c ; ./a.out
> This machine benchmarks at 539380 dhrystones/second
> $ cc -mcpu=k6 -Os dhrystone.c ; ./a.out
> This machine benchmarks at 541724 dhrystones/second
> $ cc -march=k6 -Os dhrystone.c ; ./a.out
> This machine benchmarks at 542061 dhrystones/second
> $
> 
> It seems to apply to other programs too, not just synthetic benchmarks.
> It makes me wonder if everybody else is wrong. :-)
> 
> The -mcpu= options have a small effect, -march has a tiny bit more.
> 
> The K6-II runs hotter too with -Os, implying that it _is_ doing more work.
> 
> -- 
> Rob.                          (Robert de Bath
> < http://poboxes.com/rdebath >)
>                     <rdebath @ poboxes.com> < http://www.cix.co.uk/~mayday >
>
> ------
> Want more information?  See the CrossGCC FAQ,
> http://www.objsw.com/CrossGCC/
> Want to unsubscribe? Send a note to
> crossgcc-unsubscribe@sourceware.cygnus.com
