This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



PATCH x86_64 only: optimize sinf and cosf with SSE2


Hello,

In addition to the similar patch for x86_32, this patch proposes
high-performance sinf and cosf implementations for x86_64.

The table below shows the speedup we have achieved (the ratio of
execution times in clocks, measured for random values in the given interval).

                    Ist.    Bulld.  Atom    Neh.    AVX
cosf    |x|<0.78    1.90    2.45    1.53    1.78    1.55   times
cosf    |x|<1.57    1.41    1.58    1.49    1.57    1.42   times
cosf    |x|<2.35    1.50    1.82    1.51    1.51    1.27   times
cosf    |x|<3.14    1.81    2.03    1.67    1.58    1.73   times
cosf    |x|<3.92    1.90    2.11    1.78    1.68    1.96   times
cosf    |x|<4.71    1.89    2.03    1.83    1.80    1.98   times
cosf    |x|<5.49    1.97    2.41    1.89    1.84    2.00   times
cosf    |x|<6.28    2.00    2.25    1.92    1.85    2.06   times
cosf    |x|<7.06    1.98    2.08    1.95    1.93    2.19   times
cosf    |x|<7.85    1.87    1.95    1.93    1.89    2.13   times
cosf    |x|<8.63    1.73    1.80    1.91    1.84    2.12   times
cosf    |x|<9.42    1.65    1.71    1.87    1.75    2.09   times
cosf    |x|<100     1.87    2.04    1.83    2.16    1.90   times
cosf    |x|<1000    17.81   18.62   20.58   18.44   20.72  times
cosf    |x|<10000   22.69   23.88   25.52   22.53   25.58  times
cosf    |x|<1e10    15.89   17.81   20.07   13.20   16.62  times
sinf    |x|<0.78    1.04    1.15    1.10    1.00    1.13   times
sinf    |x|<1.57    1.33    1.53    1.39    1.59    1.39   times
sinf    |x|<2.35    1.57    2.03    1.50    1.72    1.48   times
sinf    |x|<3.14    1.79    1.97    1.66    1.65    1.66   times
sinf    |x|<3.92    1.83    1.95    1.73    1.62    1.76   times
sinf    |x|<4.71    1.92    2.08    1.83    1.82    1.90   times
sinf    |x|<5.49    2.02    2.36    1.92    1.96    2.02   times
sinf    |x|<6.28    2.01    2.23    1.92    1.96    2.04   times
sinf    |x|<7.06    1.93    2.03    1.89    1.87    1.99   times
sinf    |x|<7.85    1.87    1.87    1.89    1.86    2.02   times
sinf    |x|<8.63    1.74    1.83    1.87    1.82    2.01   times
sinf    |x|<9.42    1.68    1.72    1.82    1.73    1.98   times
sinf    |x|<100     1.88    2.00    1.82    2.13    1.84   times
sinf    |x|<1000    17.77   18.57   20.66   18.42   20.43  times
sinf    |x|<10000   22.84   23.68   25.63   22.51   25.57  times
sinf    |x|<1e10    15.91   17.73   20.04   13.00   16.79  times


With the fix from <http://sourceware.org/ml/libc-alpha/2012-06/msg00650.html>
applied, our test system observes a maximum error of 3.54 ulp for sinf
and 3.9 ulp for cosf in the current versions.
The new asm versions provided here have a maximum error of 0.500121 ulp
for sinf and 0.500573 ulp for cosf.

The new sinf/cosf passed testing with our proprietary test system, which
tests many intervals with different step sizes and checks the special
values (from ISO C) and corner cases.
Testing with "make check" from GLIBC also passed after updating the
libm-test-ulps file.
Without that update, the GLIBC tests on x86-64 report errors, but the
results are actually correct: the new functions return the correct result,
while some of the expected values are incorrect. You can take a look at
the attached file (test-float.out).
In those cases the expected value is rounded to nearest instead of in the
required direction.

ChangeLog:

2012-06-25  Liubov Dmitrieva  <liubov.dmitrieva@gmail.com>

       * sysdeps/x86_64/fpu/s_sinf.S: New file.
       * sysdeps/x86_64/fpu/s_cosf.S: New file.
       * sysdeps/x86_64/fpu/libm-test-ulps: Update.

--
Liubov Dmitrieva
Software Engineer
Intel Corporation

Attachment: sinf_cosf_x86_64.patch
Description: Binary data

Attachment: test-float.out
Description: Binary data

