This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Optimized generic expf and exp2f


On 9/6/2017 7:14 AM, Szabolcs Nagy wrote:
interesting; it takes 2 independent FP adds and a compare (in C) to detect whether
round-to-nearest is in effect (and these can overlap in time with the float->double
conversion), so if there's an option to shorten the algorithm by more than that for
a fast path...
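
A minimal sketch of the "2 adds and a compare" check described above. The function name and the constant 0x1p-100 are illustrative assumptions, not taken from the patch; the idea is only that 1 + tiny and 1 - tiny both round to 1 under round-to-nearest, while every directed mode moves at least one of them.

```c
/* Hypothetical helper (not from the glibc patch): returns 1 when
   round-to-nearest appears to be the current rounding mode. */
int rounding_is_nearest(void)
{
    /* volatile stops the compiler from folding the adds at compile time,
       where it would assume the default rounding mode. */
    volatile double one = 1.0;
    volatile double tiny = 0x1p-100;   /* far below half an ulp of 1.0 */

    /* Two independent adds: neither depends on the other's result. */
    double up = one + tiny;     /* > 1 only when rounding upward */
    double down = one - tiny;   /* < 1 when rounding downward or toward zero */

    /* Under round-to-nearest both results are exactly 1.0. */
    return up == down;
}
```

Calling fesetround(FE_UPWARD) from <fenv.h> before the check should make it return 0, since only the first add is then rounded away from 1.0.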

(also, some CPUs (like newer Intel) support an instruction prefix encoding to force
the rounding mode of an FP instruction independent of the global rounding mode,
which at some point maybe should be a gcc pragma or attribute or something,
and then used in such C code)


i don't think reducing the polynomial (from order 3 to order 2)
is possible without a bigger lookup table; if less accuracy is
enough, then reducing the table size is possible though:

poly order / table len / ulp error / non-nearest ulp error (rounded)
2          / 64        / 0.61      /
2          / 128       / 0.51      /
2          / 256       / 0.502     /
3          / 8         / 0.91      / > 10
3          / 16        / 0.526     / 2
3          / 32        / 0.502     / 1
3          / 64        / 0.5001    / 1
4          / 8         / 0.54      /
4          / 16        / 0.501     /
4          / 32        / 0.50004   /
4          / 64        / 0.5       /

the C code uses order=3/table=32; the x86_64 asm uses order=4/table=64
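
To make the order=3/table=32 trade-off concrete, here is a rough sketch of a table-driven exp2f. It is not the code from the patch: all names are hypothetical, the polynomial uses plain Taylor coefficients of 2^r (an assumption; a tuned implementation would use fitted coefficients), the table is built at run time by repeated multiplication, and there is no overflow, underflow, NaN or rounding-mode handling, so it is only meaningful for moderate inputs (roughly |x| < 126).

```c
#include <stdint.h>
#include <string.h>

#define N 32                      /* table length; must be a power of two */

static double tab[N];             /* tab[i] approximates 2^(i/N) */
static int tab_ready;

static void init_table(void)
{
    const double step = 1.0218971486541166;   /* 2^(1/32) */
    tab[0] = 1.0;
    for (int i = 1; i < N; i++)
        tab[i] = tab[i - 1] * step;
    tab_ready = 1;
}

/* Hypothetical sketch: 2^x = 2^e * 2^(i/N) * 2^r with |r| <= 1/(2N). */
float sketch_exp2f(float x)
{
    /* Taylor coefficients of 2^r: ln2, ln2^2/2, ln2^3/6 (assumed) */
    const double c1 = 0.6931471805599453;
    const double c2 = 0.2402265069591007;
    const double c3 = 0.0555041086648216;

    if (!tab_ready)
        init_table();

    double z = (double)x * N;                           /* exact: N is 2^5 */
    int64_t k = (int64_t)(z < 0 ? z - 0.5 : z + 0.5);   /* nearest integer */
    double r = (z - (double)k) / N;                     /* reduced argument */

    int i = (int)(k & (N - 1));       /* table index, 0..N-1 */
    int64_t e = (k - i) / N;          /* integer exponent; k - i divides by N */

    /* order-3 polynomial for 2^r in Horner form */
    double p = 1.0 + r * (c1 + r * (c2 + r * c3));

    /* build 2^e directly from the IEEE-754 double exponent field */
    uint64_t bits = (uint64_t)(e + 1023) << 52;
    double scale;
    memcpy(&scale, &bits, sizeof scale);

    return (float)(scale * tab[i] * p);
}
```

Halving N doubles the range of r, so the polynomial has to absorb the extra error, which is the trade-off the table above quantifies.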


yeah I don't think it'll work out in terms of saving cycles; on Intel at least,
an FMA has 4 cycles of latency, but an ADD has 4 cycles as well, so there's no
net saving from doing the 2xADD+compare to save an FMA.
(since the ADDs execute in parallel it's also not likely to be more expensive)

being able to force rounding might still be interesting, since it avoids the whole
rightmost column of your table

