Performance problem with gcc 4.9.2-3 on 64 bit
Bengt Larsson
lists.cygwin4@bengtl.net
Fri Feb 27 19:05:00 GMT 2015
Below are two benchmarks that explore maximum floating-point
performance. loopm6 is double-precision floating point and loopm6fp is
parallel single-precision. Both are manually unrolled multiply-add
loops.
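
For readers without the attachments, the following is a minimal sketch of what a manually unrolled multiply-add loop can look like. It is not the code from loopm6.c; the function names madd_unrolled6 and madd_flops, the count of six chains, and the starting values are assumptions made for illustration. The chains are independent of each other so the CPU can overlap them, and each chain starts from a different value so the compiler cannot merge them into one.

```c
#include <assert.h>

/* Hypothetical sketch, not the original loopm6.c.
 * Runs six independent multiply-add chains for `iters` iterations
 * and returns their sum so the work cannot be optimized away. */
double madd_unrolled6(long iters, double mul, double add)
{
    assert(iters >= 0);

    /* Distinct starting values keep the six chains distinct. */
    double a0 = 1.0, a1 = 2.0, a2 = 3.0;
    double a3 = 4.0, a4 = 5.0, a5 = 6.0;

    for (long i = 0; i < iters; i++) {
        /* One multiply and one add per chain: 12 flops per iteration. */
        a0 = a0 * mul + add;
        a1 = a1 * mul + add;
        a2 = a2 * mul + add;
        a3 = a3 * mul + add;
        a4 = a4 * mul + add;
        a5 = a5 * mul + add;
    }
    return a0 + a1 + a2 + a3 + a4 + a5;
}

/* Floating-point operations performed by madd_unrolled6(iters, ...). */
double madd_flops(long iters)
{
    return 12.0 * (double)iters;
}
```

With six chains the inner loop needs at least six live floating-point registers plus the two constants, which is the kind of thing to look for in the gcc -O2 -S output.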
I used to reach 2.8 and 11 GFlops on these, respectively. Now I only
get 2 and 6.
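
As a rough illustration of how a GFlops figure like these is derived, here is a small sketch that divides an operation count by elapsed wall-clock time. It does not reproduce the attached timers.h; the helper names now_seconds and gflops are assumptions, and clock_gettime with CLOCK_MONOTONIC is used only as one commonly available POSIX timer.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime under strict -std modes */
#endif
#include <time.h>

/* Hypothetical helper: current monotonic time in seconds. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical helper: flops per second, expressed in units of 1e9.
 * Returns 0.0 for a non-positive interval to avoid dividing by zero. */
double gflops(double flops, double seconds)
{
    if (seconds <= 0.0)
        return 0.0;
    return flops / seconds / 1e9;
}
```

A benchmark would call now_seconds() before and after the loop, then pass the total operation count and the difference to gflops().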
If you explore the inner loop with gcc -O2 -S you can see that it seems
to use few registers.
Both programs expect one parameter when run. I use 30000 - 50000.
This is with gcc 4.9.2-3 on 64-bit, compiled with gcc -O2.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: loopm6.c
Type: application/octet-stream
Size: 1075 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin/attachments/20150227/d0a78694/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: loopm6fp.c
Type: application/octet-stream
Size: 1604 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin/attachments/20150227/d0a78694/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: timers.h
Type: application/octet-stream
Size: 840 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin/attachments/20150227/d0a78694/attachment-0002.obj>