This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 1/2] ieee754: Remove slow paths from asin and acos


On 27.1.2020 14.51, Szabolcs Nagy wrote:
> On 27/01/2020 10:45, Anssi Hannula wrote:
>> asin and acos have slow paths for rounding the last bit that cause some
>> calls to be 500-1500x slower than average calls.
>>
>> These slow paths are rare, a test of a trillion (1.000.000.000.000)
>> random inputs between -1 and 1 showed 32870 slow calls for acos and 4473
>> for asin, with most occurrences between -1.0 .. -0.9 and 0.9 .. 1.0.
>>
>> The slow paths claim correct rounding and use __sin32() and __cos32()
>> (which compare two result candidates and return the closest one) as the
>> final step, with the second result candidate (res1) having a small offset
>> applied from res. This suggests that res and res1 are intended to be 1
>> ULP apart (which makes sense for rounding), barring bugs, allowing us to
>> pick either one and still remain within 1 ULP of the exact result.
>>
>> Remove the slow paths as the accuracy is better than 1 ULP even without
>> them, which is enough for glibc.
>>
>> Also remove code comments claiming correctly rounded results.
> the patch looks good to me.
>
>> After slow path removal, checking the accuracy of 14.400.000.000 random
>> asin() and acos() inputs showed only three incorrectly rounded
>> (error > 0.5 ULP) results:
>> - asin(-0x1.ee2b43286db75p-1) (0.500002 ULP, same as before)
>> - asin(-0x1.f692ba202abcp-4)  (0.500003 ULP, same as before)
>> - asin(-0x1.9915e876fc062p-1) (0.50000000001 ULP, previously exact)
>> The first two had the same error even before this commit, and they did
>> not use the slow path at all.
>>
>> Checking 4934 known randomly found previously-slow-path asin inputs
>> shows 25 calls with incorrectly rounded results, with a maximum error of
>> 0.500000002 ULP (for 0x1.fcd5742999ab8p-1). The previous slow-path code
>> rounded all these inputs correctly (error < 0.5 ULP).
>> The observed average speed increase was 130x.
>>
>> Checking 36240 known randomly found previously-slow-path acos inputs
>> shows 42 calls with incorrectly rounded results, with a maximum error of
>> 0.500000008 ULP (for 0x1.f63845056f35ep-1). The previous "exact"
>> slow-path code showed 34 calls with incorrectly rounded results, with the
>> same maximum error of 0.500000008 ULP (for 0x1.f63845056f35ep-1).
>> The observed average speed increase was 130x.
>>
>> The functions could likely be trimmed more while keeping acceptable
>> accuracy, but this at least gets rid of the egregiously slow cases.
>>
>> Tested on x86_64.
>> ---
> ...
>>  double
>>  SECTION
>>  __ieee754_asin(double x){
>> @@ -100,13 +88,7 @@ __ieee754_asin(double x){
>>        if (res == res+1.00014*cor) return res;
>>        else {
>>  	__doasin(x,0,w);
>> -	if (w[0]==(w[0]+1.00000001*w[1])) return w[0];
>> -	else {
>> -	  y=fabs(x);
>> -	  res=fabs(w[0]);
>> -	  res1=fabs(w[0]+1.1*w[1]);
>> -	  return (m>0)?__sin32(y,res,res1):-__sin32(y,res,res1);
>> -	}
>> +	return w[0];
> if we assume the code is correctly rounded (though your tests
> show it isnt), then we can derive error bounds from the
> checks, e.g.
>
>   if (res == res+1.00014*cor) return res;
>
> here means that the computed result is res + cor, where
> |cor| < 0.5 ulp (in nearest rounding mode), correct rounding
> would require the exact result res + xcor with
> |xcor| < 1.00014*|cor| < .50007 ulp.
>
> so in this case i think that earlier if can be removed too
> with a comment saying that worst case error is expected to
> be .50007 ulp in nearest rounding mode.

Indeed, I suspected as much.
I wasn't really comfortable trying to derive error bounds (and/or remove
more code) as the original results were not 100% accurate and I'm not
too much of an expert on this.

> it would be nice to do similar analysis in other cases
> and have some comments in the code about expected error
> bounds. (but if that seems too much work i'm not against
> the proposed patch.)


I'd prefer to keep the patch as-is as long as that is acceptable, as
I've spent a bit too much time on this already :)

Thanks for taking a look.

-- 
Anssi Hannula / Bitwise Oy
+358 503803997


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]