Re: [PATCH] Inline C99 math functions
- From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
- To: libc-alpha@sourceware.org
- Date: Tue, 16 Jun 2015 09:31:02 -0300
- Subject: Re: [PATCH] Inline C99 math functions
- References: <001201d0a75b$921d9860$b658c920$@com> <alpine.DEB.2.10.1506151431490.26683@digraph.polyomino.org.uk> <001701d0a789$f2ab86f0$d80294d0$@com> <20150615185201.GA3023@domone> <alpine.DEB.2.10.1506152127340.9772@digraph.polyomino.org.uk> <20150616050045.GA8021@domone>
On 16-06-2015 02:00, Ondřej Bílka wrote:
> On Mon, Jun 15, 2015 at 09:35:22PM +0000, Joseph Myers wrote:
>> On Mon, 15 Jun 2015, Ondřej Bílka wrote:
>>
>>> As I wrote in another thread, gcc builtins have poor performance, so a
>>> benchmark is tricky. The main problem is that these tests sit inside
>>> branches and gcc will simplify them. As the builtins are branchless, it
>>> obviously couldn't simplify those.
>>
>> Even a poor benchmark, checked into the benchtests directory, would be a
>> starting point for improved benchmarks as well as for benchmarking any
>> future improvements to these functions. Having a benchmark that everyone
>> can readily use with glibc is better than having a performance improvement
>> asserted in a patch submission without the benchmark being available at
>> all.
>>
> No, a poor benchmark is dangerous and much worse than none at all. With a
> poor benchmark you could easily check in a performance regression that
> looks like an improvement on the benchmark, and you wouldn't notice until
> some developer measures poor performance in his application and finds that
> the problem is on our side.
>
> I could almost "show" that the fpclassify gcc builtin is slower than a
> library call; in the benchmark below I exploit branch misprediction to get
> close. If I could use the "benchtest" below, I could "improve" fpclassify
> by making the zero check branchless, which would improve the benchmark
> numbers enough to actually beat the call overhead. Or I could play with the
> probability of subnormals to increase the running time of the gcc builtin
> and decrease that of the library call. The moral is that with a poor
> benchmark your implementation will be poor, as it tries to minimize the
> benchmark.
So to move this proposal forward, how exactly do you propose to create a
benchtest for such a scenario? I get that this is tricky and that a lot of
variables may apply, but I do agree with Joseph that we shouldn't aim
strictly for optimal performance; imho, using compiler builtins with
reasonable performance is a gain in code maintainability.
So from the various code pieces you have thrown at the mailing list, I see
that we may focus on a benchmark that uses a random sample with different
probability scenarios for FP number types:
1. high probability of normal
2. high probability of NaN
3. high probability of inf
4. high probability of subnormal
And 2, 3, and 4 should not be the optimization focus (since they are not the
usual case for most computations and algorithms). Do you propose something
different?
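Something along these lines is what I have in mind, purely as a sketch (the
fill helper, the scenario struct, and the percentages are made up for
illustration):

/* Fill the input set according to per-class probabilities, then time the
   classification loop over it.  Scenarios 2-4 would just move the large
   percentage to another field.  */
#include <float.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 20)

/* Per-class probabilities in percent; the remainder is subnormal.  */
struct scenario { int normal, nan, inf; };

static void
fill (double *in, size_t n, struct scenario s)
{
  for (size_t i = 0; i < n; i++)
    {
      int r = rand () % 100;
      if (r < s.normal)
        in[i] = 1.0 + (double) rand () / RAND_MAX;  /* Normal.  */
      else if (r < s.normal + s.nan)
        in[i] = NAN;
      else if (r < s.normal + s.nan + s.inf)
        in[i] = INFINITY;
      else
        in[i] = DBL_MIN / 4.0;  /* Subnormal.  */
    }
}

static double in[N];

int
main (void)
{
  /* Scenario 1: high probability of normal numbers.  */
  struct scenario s1 = { 97, 1, 1 };
  fill (in, N, s1);

  /* The loop a benchtest would time.  */
  unsigned long normals = 0;
  for (size_t i = 0; i < N; i++)
    if (fpclassify (in[i]) == FP_NORMAL)
      normals++;
  printf ("%lu normal inputs\n", normals);
  return 0;
}

Building one input set per scenario and timing the same loop over each would
then give directly comparable numbers for the builtin and the library call.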