This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH 00/10] Optimized math routines
On 06/07/18 18:17, Szabolcs Nagy wrote:
On 06/07/18 17:27, Carlos O'Donell wrote:
On 07/06/2018 11:46 AM, Szabolcs Nagy wrote:
On 06/07/18 13:43, Carlos O'Donell wrote:
On 07/06/2018 04:47 AM, Szabolcs Nagy wrote:
Optimized exp, exp2, log, log2, pow, sinf, cosf and sincosf
Is it your intent to have these included in 2.28?
(resending as my previous mail seems to be lost)
yes, i'd like to add it to the 'desirable in 2.28' list
if Joseph is ok with the code, but i see he is not available
right now for review.
i don't know how other maintainers feel about such a change;
there needs to be an ulp update (i'm willing to do that for
targets where i have access to hw for testing).
Were there any unanswered questions in your v4 review?
Do you think v4 is basically as good as it will get?
Who were the people who signed off on the review?
Joseph Myers started the review of both the sinf, cosf, sincosf
changes and the exp, exp2, log, log2, pow changes.
I think I addressed all of his comments in an acceptable way,
but i don't know if he has other concerns or if there are parts of
the code he has not reviewed yet.
Since the glibc tests pass on 3 different targets (and
build-many-glibcs.py) i think there is no danger of the
patch being completely broken. Wilco and I tested the patches
in detail outside of glibc so it is the glibc integration where
I expected most of the issues.
I don't expect a performance regression on any target, but it
was only measured on aarch64 and x86_64, not e.g. on powerpc,
which might behave differently (the previous sincosf was
optimized for that target, so it might make sense to retest
the new code to be sure).
I think the patches are in a good quality state now.
(The ABI changing part needs further work so i didn't post that.)
built and tested on a power8 machine now: glibc math
tests pass (except for an unrelated fmal failure) and
benchmark improvements are consistent with aarch64/x86_64,
but it was a shared access machine so i won't post exact
numbers. sincosf improved a bit too; sinf/cosf didn't
(apparently powerpc has its own implementation).