Re: [PATCH] x86-64: Add sinf with FMA

On Mon, 4 Dec 2017, H.J. Lu wrote:

> Is
> (*__errno_location ()) = xxx
> faster?

If anything I'd expect a direct TLS initial-exec access to errno to be
faster.

> On x86-64, there should be no call to __floor.

The x86_64 __floor inline in math_private.h is only used when compiling 
glibc for SSE4.1 or later.

The case of inlining floor / __floor and related functions for x86_64 
without SSE4.1 is tricky.  Supposing we had appropriate asm redirects to 
allow libm to call floor / ceil / trunc etc. directly so the compiler 
could inline them but __* are still called if not inlined, the default 
SSE2 inlines would come into play.  But those inlines are slower on SSE4.1 
hardware than an out-of-line call to the floor / ceil / trunc IFUNC, so if 
you're building a generic SSE2 glibc that may well be used on SSE4.1 
hardware, you may wish either to avoid those inlines or, if there is a 
significant performance difference in benchmarks, have an SSE4.1 IFUNC of 
the calling function using floor (or __floor, with the present inline).
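
As a rough illustration of that last option (not glibc code), here is a
minimal sketch of a caller with two variants, one built on a plain C floor
and one built on an inlined SSE4.1 roundsd.  The names frac, frac_baseline,
frac_sse41 and select_frac are made up for this example, and a function
pointer chosen at first call stands in for a real IFUNC resolver.  It
assumes GCC or Clang; the SSE4.1 variant is only compiled on x86-64.

```c
#include <stddef.h>

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_SSE41_VARIANT 1
#else
#define HAVE_SSE41_VARIANT 0
#endif

/* Baseline floor in plain C, usable without SSE4.1.  */
static inline double floor_baseline(double x)
{
    /* NaN, infinities and |x| >= 2^52 are returned unchanged
       (finite values that large are already integers).  */
    if (!(x > -0x1p52 && x < 0x1p52))
        return x;
    double t = (double)(long long)x;   /* truncates toward zero */
    if (t == x)
        return x;                      /* already integral; keeps -0.0 */
    return t > x ? t - 1.0 : t;
}

/* Hypothetical caller, baseline variant: fractional part x - floor(x).  */
static double frac_baseline(double x)
{
    return x - floor_baseline(x);
}

#if HAVE_SSE41_VARIANT
/* Same caller, compiled for SSE4.1 so floor becomes one roundsd inline.  */
__attribute__((target("sse4.1")))
static double frac_sse41(double x)
{
    __m128d v = _mm_set_sd(x);
    return x - _mm_cvtsd_f64(_mm_floor_sd(v, v));
}
#endif

typedef double (*frac_fn)(double);

/* Stand-in for an IFUNC resolver: pick a variant from CPU features.  */
static frac_fn select_frac(void)
{
#if HAVE_SSE41_VARIANT
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.1"))
        return frac_sse41;
#endif
    return frac_baseline;
}

/* Entry point: resolves once on first use, then calls the chosen variant.  */
static double frac(double x)
{
    static frac_fn impl;   /* NULL until the first call */
    if (impl == NULL)
        impl = select_frac();
    return impl(x);
}
```

Whether this kind of per-caller dispatch is worth having would still depend
on benchmarks showing a significant difference, as noted above.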

The set of optimized float functions (expf etc.) has several different 
choices of how conversions to integer are handled, which may be configured 
by an architecture.  That may make sense in other cases as well.
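
To show what such a configurable choice can look like, here is a small
sketch with two ways of rounding a double to the nearest integer, selected
by a macro.  It is only loosely modeled on the idea and is not the glibc
code: USE_CVT_INSN, to_int_shift, to_int_cvt and round_to_int are
hypothetical names.  It assumes round-to-nearest mode, double arithmetic
evaluated in double precision, and |z| < 2^51.

```c
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && defined(__GNUC__)
#include <emmintrin.h>   /* SSE2, always available on x86-64 */
#define HAVE_CVT_INSN 1
#else
#define HAVE_CVT_INSN 0
#endif

/* Hypothetical per-architecture switch; override with -DUSE_CVT_INSN=0.  */
#ifndef USE_CVT_INSN
#define USE_CVT_INSN HAVE_CVT_INSN
#endif

/* Reinterpret the bits of a double as a 64-bit integer.  */
static inline uint64_t bits_of(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Choice 1: add and subtract 1.5 * 2^52.  The addition rounds z to an
   integer, which then sits in the low mantissa bits of the sum.  */
static inline int64_t to_int_shift(double z, double *kd)
{
    const double shift = 0x1.8p52;
    double t = z + shift;
    int64_t ki = (int64_t)(bits_of(t) - bits_of(shift));
    *kd = t - shift;   /* the rounded value as a double */
    return ki;
}

#if HAVE_CVT_INSN
/* Choice 2: a direct conversion instruction (cvtsd2si), which rounds
   according to the current rounding mode.  */
static inline int64_t to_int_cvt(double z, double *kd)
{
    int64_t ki = _mm_cvtsd_si64(_mm_set_sd(z));
    *kd = (double)ki;
    return ki;
}
#endif

/* The conversion the rest of the code would call.  */
static inline int64_t round_to_int(double z, double *kd)
{
#if USE_CVT_INSN && HAVE_CVT_INSN
    return to_int_cvt(z, kd);
#else
    return to_int_shift(z, kd);
#endif
}
```

Both choices return the rounded value twice, once as an integer and once as
a double, since callers of this kind typically want both.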

Joseph S. Myers

