Bug 14496 - Bytemark FOURIER 54% slower
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: math
Version: 2.16
Importance: P2 normal
Target Milestone: ---
Assignee: Siddhesh Poyarekar
Reported: 2012-08-19 08:17 UTC by wbrana
Modified: 2014-06-17 18:33 UTC
fweimer: security-


Description wbrana 2012-08-19 08:17:23 UTC
http://www.tux.org/~mayer/linux/nbench-byte-2.2.3.tar.gz

glibc 2.16
FOURIER             :           16696  :      18.99  :      10.66

glibc 2.14.1
FOURIER             :           36552  :      41.57  :      23.35

CFLAGS = -ggdb -Wall -O3 -funroll-loops -g0 -march=core2 -fomit-frame-pointer -ffast-math -mssse3 -fno-PIE -fno-exceptions -fno-stack-protector -static
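
For reference, the 54% in the title can be reproduced from the two result lines above. This is only a sketch of the arithmetic; it assumes the first numeric column of the nbench output is iterations per second, and the function name is made up for illustration.

```python
# Iterations/sec for FOURIER, copied from the two nbench result lines above.
# Assumption: the first numeric column is iterations per second.
ITER_PER_SEC_2_16 = 16696
ITER_PER_SEC_2_14_1 = 36552


def slowdown_percent(new_rate, old_rate):
    """Return how much lower new_rate is than old_rate, in percent."""
    return (1.0 - new_rate / old_rate) * 100.0


pct = slowdown_percent(ITER_PER_SEC_2_16, ITER_PER_SEC_2_14_1)
print(f"glibc 2.16 is {pct:.1f}% slower than 2.14.1 on FOURIER")
```

The two index columns (18.99 vs 41.57 and 10.66 vs 23.35) give nearly the same ratio, so any of the three columns could be used.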
Comment 1 Markus Trippelsdorf 2012-08-19 09:31:02 UTC
Dup of Bug 14412

*** This bug has been marked as a duplicate of bug 14412 ***
Comment 2 wbrana 2012-08-19 16:16:43 UTC
I restored the file sysdeps/x86_64/fpu/s_sincos.S, but it didn't help.
Comment 3 wbrana 2012-08-19 18:41:44 UTC
FOURIER with sysdeps/x86_64/fpu/s_sincos.S restored

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 27.57      6.89     6.89                             __ieee754_pow_sse2
 25.47     13.25     6.36                             feraiseexcept
 15.26     17.06     3.81                             __sin_sse2
 14.46     20.67     3.61                             __cos_sse2
 10.21     23.22     2.55                             __exp1
  4.02     24.22     1.01                             __dubsin
  2.40     24.82     0.60     2111     0.28     0.28  DoFPUTransIteration
  0.20     24.87     0.05                             csloww
  0.08     24.89     0.02                             __ieee754_exp_sse2
  0.08     24.91     0.02                             __mpexp_avx
  0.08     24.93     0.02                             csloww1
  0.04     24.94     0.01                             __write_nocancel
  0.04     24.95     0.01                             bsloww2
  0.04     24.96     0.01                             sincos
  0.02     24.97     0.01                             __branred
  0.02     24.97     0.01                             checkint
  0.00     24.97     0.00     2111     0.00     0.00  StartStopwatch
  0.00     24.97     0.00     2111     0.00     0.00  StopStopwatch
  0.00     24.97     0.00     2110     0.00     0.00  TicksToSecs
  0.00     24.97     0.00       10     0.00     0.00  AllocateMemory
  0.00     24.97     0.00       10     0.00     0.00  FreeMemory
  0.00     24.97     0.00        5     0.00   120.00  DoFourier
  0.00     24.97     0.00        5     0.00     0.00  TicksToFracSecs
  0.00     24.97     0.00        1     0.00     0.00  main
Comment 4 wbrana 2012-08-19 18:52:29 UTC
FOURIER with glibc 2.14.1

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 27.47      6.87     6.87                             __ieee754_pow
 20.43     11.98     5.11                             __exp1
 18.59     16.63     4.65                             sin
 15.87     20.60     3.97                             cos
  7.96     22.59     1.99                             __dubsin
  3.72     23.52     0.93     3030     0.31     0.31  DoFPUTransIteration
  2.60     24.17     0.65                             pow
  1.60     24.57     0.40                             isnan
  0.80     24.77     0.20                             finite
  0.64     24.93     0.16                             csloww
  0.28     25.00     0.07                             csloww1
  0.04     25.01     0.01                             clock
  0.00     25.01     0.00     3030     0.00     0.00  StartStopwatch
  0.00     25.01     0.00     3030     0.00     0.00  StopStopwatch
  0.00     25.01     0.00     3028     0.00     0.00  TicksToSecs
  0.00     25.01     0.00       12     0.00     0.00  AllocateMemory
  0.00     25.01     0.00       12     0.00     0.00  FreeMemory
  0.00     25.01     0.00        5     0.00   186.00  DoFourier
  0.00     25.01     0.00        5     0.00     0.00  TicksToFracSecs
  0.00     25.01     0.00        1     0.00     0.00  main
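
A rough way to compare the two flat profiles is sketched below. It only uses the top rows of each listing, and the name normalization (stripping leading underscores, the `ieee754_` prefix and the `_sse2`-style suffix) is an assumption made here so that the 2.16 multiarch names line up with the 2.14.1 names; `normalize` and `parse` are hypothetical helpers, not part of gprof.

```python
import re

# Top rows copied from the two gprof flat profiles in this report.
# Columns kept: % time, cumulative seconds, self seconds, name.
PROFILE_2_16 = """\
27.57 6.89 6.89 __ieee754_pow_sse2
25.47 13.25 6.36 feraiseexcept
15.26 17.06 3.81 __sin_sse2
14.46 20.67 3.61 __cos_sse2
10.21 23.22 2.55 __exp1
4.02 24.22 1.01 __dubsin
"""

PROFILE_2_14_1 = """\
27.47 6.87 6.87 __ieee754_pow
20.43 11.98 5.11 __exp1
18.59 16.63 4.65 sin
15.87 20.60 3.97 cos
7.96 22.59 1.99 __dubsin
"""


def normalize(name):
    """Assumed mapping of internal/multiarch symbol names to a common key."""
    name = name.lstrip("_")
    name = re.sub(r"^ieee754_", "", name)
    name = re.sub(r"_(sse2|avx|fma4)$", "", name)
    return name


def parse(text):
    """Return {normalized name: % of self time} for a block of profile rows."""
    shares = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed rows
        key = normalize(fields[-1])
        shares[key] = shares.get(key, 0.0) + float(fields[0])
    return shares


new = parse(PROFILE_2_16)
old = parse(PROFILE_2_14_1)
only_in_new = sorted(set(new) - set(old))

for key in sorted(set(new) | set(old), key=lambda k: -new.get(k, 0.0)):
    print(f"{key:15s} 2.14.1={old.get(key, 0.0):6.2f}%  2.16={new.get(key, 0.0):6.2f}%")
print("only in 2.16:", only_in_new)
```

With these rows, the only symbol that shows up in the 2.16 listing and not in the 2.14.1 one is feraiseexcept, at about a quarter of the samples.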
Comment 5 wbrana 2012-08-19 20:57:01 UTC
glibc 2.15 isn't broken.
Comment 6 Siddhesh Poyarekar 2013-01-09 09:39:52 UTC
The cause is most likely commit b7cd39e8f8c5cf2844f20eb03f545d19c4c25987. I've seen reports of performance degradation due to this patch in RHEL, and that is consistent with the additional feraiseexcept calls.