This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
On 9/6/2017 6:16 AM, Wilco Dijkstra wrote:
> Arjan van de Ven wrote:
>> I'm seeing a 16% throughput increase (not 1.5x) but still impressive.
>
> Was that using the expf trace input or something else? And with wrapper?
>
>> I do see different numerical answers between the two (I had to disable the code in my bench that detects differences) and sampling a few it seems that the C code is a little bit less accurate in places, likely a simpler polynomial. (for example for 20.636783599853515625 as input)
>
> It's still way more accurate than necessary. The only reason is to minimize ULP error for non-nearest rounding modes. If you don't care about worst-case ULP for non-standard rounding modes, the polynomial can be further simplified within 1ULP max error in round to nearest.
Interesting. In C it takes two independent FP adds and a compare to detect that round-to-nearest is in effect, and that work can overlap in time with the float->double conversion. So if there is an option to reduce the algorithm by more than that, a fast path for round-to-nearest could pay off...

Also, some CPUs (newer Intel ones, for example) support an instruction prefix encoding that forces the rounding mode of a single FP instruction, independent of the global rounding mode. At some point that should maybe become a gcc pragma or attribute or something, which C code like this could then use.
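One way to read the "two independent FP adds and a compare" check is sketched below. This is an illustration, not code from the glibc patch under discussion: the helper name `rounding_is_nearest` and the constant `0x1p-30f` are assumptions chosen for the example. The `volatile` qualifiers are there only to stop the compiler from folding the sums at compile time and to force each result to float precision.

```c
/* Hypothetical helper (not part of glibc): returns 1 when the current
   rounding mode is round-to-nearest, 0 for upward, downward, or
   toward-zero. */
static int rounding_is_nearest(void)
{
    /* volatile prevents constant folding and forces each sum to be
       rounded to float at run time, under the active rounding mode. */
    volatile float one = 1.0f;
    volatile float tiny = 0x1p-30f;  /* far below half an ulp of 1.0f */

    /* Two independent operations: neither result feeds the other, so
       the CPU can issue them back to back. */
    volatile float hi = one + tiny;  /* 1.0f unless rounding upward */
    volatile float lo = one - tiny;  /* 1.0f unless rounding downward
                                        or toward zero */

    /* Both sums collapse to exactly 1.0f only in round-to-nearest. */
    return hi == lo;
}
```

In the default environment this should return 1. After `fesetround(FE_UPWARD)`, `fesetround(FE_DOWNWARD)`, or `fesetround(FE_TOWARDZERO)` from `<fenv.h>`, one of the two sums moves to the neighbouring float and the compare fails, so the helper returns 0. Whether the check is cheap enough to justify a separate simplified-polynomial path would still have to be measured.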