PATCH: optimized libm single precision routines: erfcf, erff, expf for x86_64.

Thu Feb 16 22:26:00 GMT 2012

On 02/16/2012 12:11 PM, Dmitrieva Liubov wrote:
> +	movss	%xmm0, -16(%rsp)	/* save SP x*K/log(2)+RS */
> +	movss	-16(%rsp), %xmm1	/* load SP x*K/log(2)+RS */

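What's up with these sorts of obviously compiler-generated bits of silliness?
A store to the stack followed immediately by a reload from the same slot does
nothing here that a register-to-register move would not.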

You stated that you do not plan to provide the C source because you "believe
that the assembly should be faster."  Given turds like the above, I do not
accept this assertion without proof.

Given that this routine is entirely scalar code, I don't see why a C version
might not be faster for all of the other targets as well.
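
For illustration only, here is a minimal C sketch of the kind of step the
quoted comment describes (scale, then add a "right shifter" to round). The
names SCALE, RS and round_scaled are hypothetical, and the constant values are
assumptions (K = 1, RS = 1.5 * 2^23); none of them are taken from the patch.
It also assumes round-to-nearest mode and float arithmetic done in float
precision, as with SSE on x86_64, and compilation without -ffast-math.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical constants, NOT taken from the patch:
   SCALE stands in for K/log(2) with K = 1; RS is the usual
   single-precision right shifter 1.5 * 2^23. */
static const float SCALE = 1.44269504f;  /* 1/log(2) */
static const float RS    = 12582912.0f;  /* 1.5 * 2^23 */

/* Round x*SCALE to the nearest integer by adding and then subtracting RS.
   Floats in [2^23, 2^24) are spaced exactly 1 apart, so the addition
   itself performs the rounding. */
static float round_scaled(float x)
{
    float t = x * SCALE;
    assert(fabsf(t) < 4194304.0f);  /* only valid for |t| < 2^22 */
    float shifted = t + RS;         /* x*K/log(2)+RS, with K = 1 assumed */
    return shifted - RS;            /* nothing here requires a trip through memory */
}
```

With these assumed constants, round_scaled(2.0f) scales 2 to roughly 2.885 and
returns 3.0f. A compiler is free to keep both intermediate values in
registers, which is the point of the complaint above.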

r~