This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] x86-64: Add sinf with FMA
On Mon, 4 Dec 2017, H.J. Lu wrote:
> Is
>
> (*__errno_location ()) = xxx
>
> faster?

If anything I'd expect a direct TLS initial-exec access to errno to be
faster.

> On x86-64, there should be no call to __floor.
The x86_64 __floor inline in math_private.h is only used when compiling
glibc for SSE4.1 or later.
The case of inlining floor / __floor and related functions for x86_64
without SSE4.1 is tricky. Supposing we had appropriate asm redirects to
allow libm to call floor / ceil / trunc etc. directly so the compiler
could inline them but __* are still called if not inlined, the default
SSE2 inlines would come into play. But those inlines are slower on SSE4.1
hardware than an out-of-line call to the floor / ceil / trunc IFUNC, so if
you're building a generic SSE2 glibc that may well be used on SSE4.1
hardware, you may wish either to avoid those inlines or, if there is a
significant performance difference in benchmarks, have an SSE4.1 IFUNC of
the calling function using floor (or __floor, with the present inline).
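To illustrate why an inline floor without SSE4.1 costs more than a single
rounding instruction, here is a minimal sketch of a floor built only from a
truncating float-to-int conversion plus a fix-up. The name floor_via_trunc
is hypothetical, and this is not the actual GCC or glibc expansion, only
the general shape of the work such an inline has to do.

```c
#include <math.h>

/* Hypothetical helper: floor for float using only a truncating
   conversion and a compare, i.e. without an SSE4.1 rounding instruction.  */
static inline float
floor_via_trunc (float x)
{
  /* Values with magnitude >= 2^23 are already integral; the negated
     comparison also passes infinities and NaNs through unchanged.  */
  if (!(fabsf (x) < 8388608.0f))
    return x;

  /* Truncate toward zero, then convert back to float.  */
  float t = (float) (int) x;

  /* Truncation rounds negative non-integers up, so step down by one.  */
  if (t > x)
    t -= 1.0f;

  /* Preserve the sign of zero, so that an input of -0.0 yields -0.0.  */
  return copysignf (t, x);
}
```

With SSE4.1 available, the same operation is a single rounding
instruction, which is the gap the paragraph above is describing.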
The expf etc. set of optimized float functions have several different
choices of how conversions to integer are handled, which may be configured
by an architecture. That may make sense in other cases as well.
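As a rough sketch of what such a per-architecture choice can look like,
the following selects between a library rounding call and a "shift"
trick at compile time. The macro USE_TOINT_INTRINSICS and the function
round_to_int32 are hypothetical names for illustration, not glibc's. The
shift path assumes round-to-nearest mode, double evaluated without excess
precision, and |x| below 2^31.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical configuration macro: an architecture with a cheap
   round-to-integer instruction would define this to 1.  */
#ifndef USE_TOINT_INTRINSICS
# define USE_TOINT_INTRINSICS 0
#endif

/* Round x to the nearest integer (ties to even in the default mode).  */
static inline int32_t
round_to_int32 (double x)
{
#if USE_TOINT_INTRINSICS
  /* Rely on the compiler turning lrint into one instruction.  */
  return (int32_t) lrint (x);
#else
  /* Adding 1.5 * 2^52 pushes the fraction bits out of the mantissa,
     so the rounded integer ends up in the low bits of the encoding.  */
  const double shift = 0x1.8p52;
  double kd = x + shift;
  uint64_t ki;
  memcpy (&ki, &kd, sizeof ki);
  return (int32_t) ki;
#endif
}
```

Which path is faster depends on the target, which is why leaving the
choice to the architecture can make sense.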
--
Joseph S. Myers
joseph@codesourcery.com