The following double-precision math functions are optimized with FMA: e_asin.c e_atan2.c e_exp.c e_log.c e_pow.c s_atan.c s_sin.c s_tan.c. Should the corresponding float versions be optimized with FMA as well?
Those double functions are probably optimized for FMA because they use DLA_FMS, or other macros whose definitions depend on DLA_FMS. That is, they can use fused operations directly, not just via contraction. Since the float functions are completely separate implementations, none of which use fused operations, the fact that the corresponding double versions use FMA gives no reason to expect that FMA versions of the float functions would be helpful. (It is possible that there are smaller optimization opportunities arising from contraction, though not necessarily for the same set of functions.)
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, hjl/expf/master has been created
        at  a696cda418879c4d0f03d7cfe19e81655a5815e5 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a696cda418879c4d0f03d7cfe19e81655a5815e5

commit a696cda418879c4d0f03d7cfe19e81655a5815e5
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 08:45:34 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

        [BZ #21912]
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf-sse2.S: Likewise.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

-----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, hjl/expf/master has been deleted was a696cda418879c4d0f03d7cfe19e81655a5815e5 - Log ----------------------------------------------------------------- a696cda418879c4d0f03d7cfe19e81655a5815e5 x86-64: Optimize e_expf with FMA [BZ #21912] -----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, hjl/expf/master has been created
        at  a13f5e6e34a6160607c8ce9448c618b9ae024364 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a13f5e6e34a6160607c8ce9448c618b9ae024364

commit a13f5e6e34a6160607c8ce9448c618b9ae024364
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 08:45:34 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

        [BZ #21912]
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf-sse2.S: Likewise.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5c18dfae535d8dd308a034280176c771b4065664

commit 5c18dfae535d8dd308a034280176c771b4065664
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 10:34:22 2017 -0700

    x86-64: Put L(SP_INF_0) in .rodata.cst4 section [BZ #21955]

    sysdeps/x86_64/fpu/e_expf.S has

            /* Here if |x| is Inf */
            lea     L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */
            movss   (%rdx,%rax,4), %xmm0    /* return zero or Inf */
            ret
            ...
            .section .rodata.cst8,"aM",@progbits,8
            ...
            .p2align 2
    L(SP_INF_0):
            .long   0x7f800000      /* single precision Inf */
            .long   0               /* single precision zero */
            .type L(SP_INF_0), @object
            ASM_SIZE_DIRECTIVE(L(SP_INF_0))

    Since L(SP_INF_0) is accessed as an array of 4-byte elements, it
    should be placed in .section .rodata.cst4,"aM",@progbits,4

        [BZ #21955]
        * sysdeps/x86_64/fpu/e_expf.S (L(SP_INF_0)): Place it in
        .rodata.cst4 section.

-----------------------------------------------------------------------
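For readers less used to the assembly, the L(SP_INF_0) lookup quoted in the commit message above can be read roughly as the C below. This is only a sketch: it assumes %rax holds 0 for positive and 1 for negative x at that point, which the quoted fragment does not show, and the names sp_inf_0 and expf_of_infinity are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Two 4-byte entries, mirroring the two .long values at L(SP_INF_0):
   index 0 is +Inf, index 1 is zero.  */
static const uint32_t sp_inf_0[2] = { 0x7f800000u, 0u };

/* Result of expf for an infinite argument: +Inf for +Inf, 0 for -Inf.
   The table is indexed with a 4-byte stride, like (%rdx,%rax,4).  */
static float expf_of_infinity(float x)
{
    uint32_t bits;
    float result;

    memcpy(&bits, &x, sizeof bits);       /* reinterpret the float's bits */
    uint32_t sign = bits >> 31;           /* assumed index: 0 or 1 */
    memcpy(&result, &sp_inf_0[sign], sizeof result);
    return result;
}
```

Each element is read as a separate 4-byte object, which is the access pattern the commit message cites as the reason for the 4-byte constant section.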
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, hjl/expf/master has been created
        at  c7925193663829d4bc770cf33c5d76aeeb7d25cd (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c7925193663829d4bc770cf33c5d76aeeb7d25cd

commit c7925193663829d4bc770cf33c5d76aeeb7d25cd
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 08:45:34 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

        [BZ #21912]
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf-sse2.S: Likewise.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

-----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, hjl/expf/master has been deleted was c7925193663829d4bc770cf33c5d76aeeb7d25cd - Log ----------------------------------------------------------------- c7925193663829d4bc770cf33c5d76aeeb7d25cd x86-64: Optimize e_expf with FMA [BZ #21912] -----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, hjl/expf/master has been created
        at  bff7a64f05d2e32a472019948b6ff1fc95a088c0 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=bff7a64f05d2e32a472019948b6ff1fc95a088c0

commit bff7a64f05d2e32a472019948b6ff1fc95a088c0
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 08:45:34 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

        [BZ #21912]
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf-sse2.S: Likewise.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

-----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, hjl/expf/master has been deleted was bff7a64f05d2e32a472019948b6ff1fc95a088c0 - Log ----------------------------------------------------------------- bff7a64f05d2e32a472019948b6ff1fc95a088c0 x86-64: Optimize e_expf with FMA [BZ #21912] -----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, hjl/expf/master has been created
        at  110255b8afd79a0d82a08f87df0619fef318cb48 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=110255b8afd79a0d82a08f87df0619fef318cb48

commit 110255b8afd79a0d82a08f87df0619fef318cb48
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 08:45:34 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

        [BZ #21912]
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf-sse2.S: Likewise.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

-----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, hjl/expf/master has been deleted was 110255b8afd79a0d82a08f87df0619fef318cb48 - Log ----------------------------------------------------------------- 110255b8afd79a0d82a08f87df0619fef318cb48 x86-64: Optimize e_expf with FMA [BZ #21912] -----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, hjl/expf/master has been created
        at  ea131621a68ccfa7247b8ef596df3cee1fb3c528 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ea131621a68ccfa7247b8ef596df3cee1fb3c528

commit ea131621a68ccfa7247b8ef596df3cee1fb3c528
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 08:45:34 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

        [BZ #21912]
        * sysdeps/x86_64/fpu/e_expf.S (L(DP_T)): Renamed to ...
        (__ieee754_expf_dp_table): This. Mark it hidden and global.
        * ysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
        Add e_expf-sse2 and e_expf-fma.
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf-sse2.S: Likewise.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

-----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, hjl/expf/master has been deleted was ea131621a68ccfa7247b8ef596df3cee1fb3c528 - Log ----------------------------------------------------------------- ea131621a68ccfa7247b8ef596df3cee1fb3c528 x86-64: Optimize e_expf with FMA [BZ #21912] -----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, hjl/expf/master has been created
        at  1754ccc618796f240aa2852cb56eebd5bbd87581 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1754ccc618796f240aa2852cb56eebd5bbd87581

commit 1754ccc618796f240aa2852cb56eebd5bbd87581
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 08:45:34 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

        [BZ #21912]
        * sysdeps/x86_64/fpu/e_expf.S (L(DP_T)): Renamed to ...
        (__ieee754_expf_dp_table): This. Mark it hidden and global.
        * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
        Add e_expf-sse2 and e_expf-fma.
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf-sse2.S: Likewise.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

-----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, hjl/expf/master has been deleted was 1754ccc618796f240aa2852cb56eebd5bbd87581 - Log ----------------------------------------------------------------- 1754ccc618796f240aa2852cb56eebd5bbd87581 x86-64: Optimize e_expf with FMA [BZ #21912] -----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, hjl/expf/master has been created
        at  0eb9d03ad3c70631e281d37e95543ae7bc29c552 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=0eb9d03ad3c70631e281d37e95543ae7bc29c552

commit 0eb9d03ad3c70631e281d37e95543ae7bc29c552
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 08:45:34 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

        [BZ #21912]
        * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
        Add and e_expf-fma.
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

-----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, hjl/expf/master has been deleted was 0eb9d03ad3c70631e281d37e95543ae7bc29c552 - Log ----------------------------------------------------------------- 0eb9d03ad3c70631e281d37e95543ae7bc29c552 x86-64: Optimize e_expf with FMA [BZ #21912] -----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, hjl/expf/master has been created
        at  e0e4259030c1c025bd443e49e6095e7dc4367db5 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e0e4259030c1c025bd443e49e6095e7dc4367db5

commit e0e4259030c1c025bd443e49e6095e7dc4367db5
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 08:45:34 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

    FMA optimized e_expf improves performance by more than 50% on Skylake.

        [BZ #21912]
        * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
        Add and e_expf-fma.
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

-----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, hjl/expf/master has been deleted was e0e4259030c1c025bd443e49e6095e7dc4367db5 - Log ----------------------------------------------------------------- e0e4259030c1c025bd443e49e6095e7dc4367db5 x86-64: Optimize e_expf with FMA [BZ #21912] -----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, hjl/expf/master has been created
        at  0575d372670fd8b06c7366a29c133581f0c696f6 (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=0575d372670fd8b06c7366a29c133581f0c696f6

commit 0575d372670fd8b06c7366a29c133581f0c696f6
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 08:45:34 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

    FMA optimized e_expf improves performance by more than 50% on Skylake.

        [BZ #21912]
        * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
        Add and e_expf-fma.
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

-----------------------------------------------------------------------
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, master has been updated
       via  24a2e6588d2e0c91b4003878b0625d4a9360e8f3 (commit)
      from  403143e1df85dadd374f304bd891be0cd7573e3b (commit)

Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=24a2e6588d2e0c91b4003878b0625d4a9360e8f3

commit 24a2e6588d2e0c91b4003878b0625d4a9360e8f3
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Aug 16 08:43:35 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

    FMA optimized e_expf improves performance by more than 50% on Skylake.

        [BZ #21912]
        * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
        Add e_expf-fma.
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                                 |    9 ++
 sysdeps/x86_64/fpu/multiarch/Makefile     |    3 +
 sysdeps/x86_64/fpu/multiarch/e_expf-fma.S |  182 +++++++++++++++++++++++++++++
 sysdeps/x86_64/fpu/multiarch/e_expf.c     |   26 ++++
 sysdeps/x86_64/fpu/multiarch/ifunc-fma.h  |   34 ++++++
 5 files changed, 254 insertions(+), 0 deletions(-)
 create mode 100644 sysdeps/x86_64/fpu/multiarch/e_expf-fma.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/e_expf.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/ifunc-fma.h
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources".

The branch, hjl/fma/2.26 has been updated
       via  6d5f5b16bc4bd3945e138509d7986a5231ab5ee6 (commit)
       via  ce3e7f4136a9f5943328c74511542834ca05811b (commit)
      from  7e7b5de8ffc9ac8fda45b988cde5650004bdbca7 (commit)

Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6d5f5b16bc4bd3945e138509d7986a5231ab5ee6

commit 6d5f5b16bc4bd3945e138509d7986a5231ab5ee6
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Aug 16 08:43:35 2017 -0700

    x86-64: Optimize e_expf with FMA [BZ #21912]

    FMA optimized e_expf improves performance by more than 50% on Skylake.

        [BZ #21912]
        * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
        Add e_expf-fma.
        * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
        * sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
        * sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.

    (cherry picked from commit 24a2e6588d2e0c91b4003878b0625d4a9360e8f3)

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ce3e7f4136a9f5943328c74511542834ca05811b

commit ce3e7f4136a9f5943328c74511542834ca05811b
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Aug 15 14:04:59 2017 -0700

    x86-64: Align L(SP_RANGE)/L(SP_INF_0) to 8 bytes [BZ #21955]

    sysdeps/x86_64/fpu/e_expf.S has

            lea     L(SP_RANGE)(%rip), %rdx /* load over/underflow bound */
            cmpl    (%rdx,%rax,4), %ecx     /* |x|<under/overflow bound ? */
            ...
            /* Here if |x| is Inf */
            lea     L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */
            movss   (%rdx,%rax,4), %xmm0    /* return zero or Inf */
            ret
            ...
            .section .rodata.cst8,"aM",@progbits,8
            ...
            .p2align 2
    L(SP_RANGE): /* single precision overflow/underflow bounds */
            .long   0x42b17217      /* if x>this bound, then result overflows */
            .long   0x42cff1b4      /* if x<this bound, then result underflows */
            .type L(SP_RANGE), @object
            ASM_SIZE_DIRECTIVE(L(SP_RANGE))

            .p2align 2
    L(SP_INF_0):
            .long   0x7f800000      /* single precision Inf */
            .long   0               /* single precision zero */
            .type L(SP_INF_0), @object
            ASM_SIZE_DIRECTIVE(L(SP_INF_0))

    Since L(SP_RANGE) and L(SP_INF_0) are in .rodata.cst8 section, they
    must be aligned to 8 bytes.

        [BZ #21955]
        * sysdeps/x86_64/fpu/e_expf.S (L(SP_RANGE)): Aligned to 8 bytes.
        (L(SP_INF_0)): Likewise.

    (cherry picked from commit f59f7adb4a00b7784cab1becdf257366104587b7)

-----------------------------------------------------------------------

Summary of changes:
 sysdeps/x86_64/fpu/e_expf.S               |    4 +-
 sysdeps/x86_64/fpu/multiarch/Makefile     |    3 +
 sysdeps/x86_64/fpu/multiarch/e_expf-fma.S |  182 +++++++++++++++++++++++++++++
 sysdeps/x86_64/fpu/multiarch/e_expf.c     |   26 ++++
 sysdeps/x86_64/fpu/multiarch/ifunc-fma.h  |   34 ++++++
 5 files changed, 247 insertions(+), 2 deletions(-)
 create mode 100644 sysdeps/x86_64/fpu/multiarch/e_expf-fma.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/e_expf.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/ifunc-fma.h
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, hjl/expf/master has been deleted was 0575d372670fd8b06c7366a29c133581f0c696f6 - Log ----------------------------------------------------------------- 0575d372670fd8b06c7366a29c133581f0c696f6 x86-64: Optimize e_expf with FMA [BZ #21912] -----------------------------------------------------------------------
Since you've added FMA versions of several float functions, do you believe any issue remains here? That is, are there any more float functions for which such optimized versions would be appropriate?
I will open a new bug if needed.