This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Add math-inline benchmark


On Mon, Jul 13, 2015 at 12:02:51PM +0100, Wilco Dijkstra wrote:
> > Ondřej Bílka wrote:
> > On Fri, Jul 10, 2015 at 05:09:16PM +0100, Wilco Dijkstra wrote:
> > > > Ondřej Bílka wrote:
> > > > On Mon, Jul 06, 2015 at 03:50:11PM +0100, Wilco Dijkstra wrote:
> > > > >
> > > > >
> > > > > > Ondřej Bílka wrote:
> > > > > > But with latency hiding by using the argument first, suddenly even
> > > > > > isnan and isnormal become a regression.
> > > > > >
> > >
> > > That doesn't look correct - it looks like this didn't use the built-ins at all.
> > > Did you forget to apply that patch?
> > >
> > No, from what you wrote I expected that the patch already tested the builtins,
> > which it doesn't. I applied the patch and got different results; with the
> > patch added the results are similar.
> 
> OK, I extended the benchmark to add the built-ins explicitly so that
> you don't need to apply the math.h inline patch first.
>
That's also good general policy: including all strategies lets us see which
one is optimal on which architecture.
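
To make that concrete, here is a minimal sketch (the wrapper names are
invented, this is not the benchtests harness): wrap each strategy in its own
inline function and instantiate the same loop body once per wrapper, so every
strategy gets measured on every architecture.

  #include <math.h>

  /* Sketch only: one wrapper per strategy, all with the same signature,
     so the identical benchmark loop can be timed once per variant.
     check_macro picks up whatever math.h currently provides (the libc
     inline, or the builtin once the math.h patch is applied).  */
  static inline int check_builtin (double x) { return __builtin_isnan (x); }
  static inline int check_macro (double x) { return isnan (x); }

The loop body then takes the wrapper as its func parameter, the same way the
macros quoted below do.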
 
> > Which still doesn't have to mean anything; only if you test an application
> > that frequently uses these will you get a result beyond doubt.
> 
> We don't have applications that use these, but we can say without any
> doubt that they will show huge speedups if they do use these functions
> frequently or any math functions that use them a lot. For example,
> remainder() shows ~7% gain with the new inlines.
>
No, this is just a microbenchmark, as Joseph and I have told you ten
times. There are a lot of factors that could change performance, such as the
need for constants, which you don't measure when you inline into remainder.
So your speedup could be purely illusory.
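
To illustrate the constants point, here is the kind of bit-test inline I
mean (an illustration only, not necessarily the code the patch produces):
it needs a 64-bit constant, and whether that constant stays in a register
or has to be reloaded depends entirely on the surrounding code, which a
tight microbenchmark loop never shows.

  #include <stdint.h>
  #include <string.h>

  /* Illustration: a bit-test isnan needs the 0x7ff0... constant.  In a
     small benchmark loop it sits in a register the whole time; inlined
     into a larger function such as remainder it competes for registers
     and may have to be reloaded, a cost the microbenchmark never sees.  */
  static inline int
  isnan_bits (double x)
  {
    uint64_t u;
    memcpy (&u, &x, sizeof u);                   /* type-pun safely   */
    return (u & UINT64_C (0x7fffffffffffffff))   /* drop the sign bit */
           > UINT64_C (0x7ff0000000000000);      /* above +Inf: NaN   */
  }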
 
> > Here a simple modification produces different results. One of many
> > objections is that when the branch just adds a constant, gcc will try to
> > generate branchless code, e.g. converting it to res += 5 * (isnan(tmp)).
> > So use a more difficult branch; with the following two variants I get
> > __builtin_isinf a lot slower.
> > 
> >     { double tmp = p[i] * 2.0;    \
> >        res += 3 * sin (tmp); if (func (tmp)) res += 3* sin (2 * tmp) ;} \
> > 
> >     { double tmp = p[i] * 2.0;    \
> >        if (func (tmp)) res += 3 * sin (2 * tmp) ;} \
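
For reference, the second variant above can be dropped into a self-contained
test like this (a sketch only; the input setup, iteration counts and timing
here are made up, not taken from bench-math):

  #include <math.h>
  #include <stdio.h>
  #include <time.h>

  #define N 4096

  static double p[N];

  int
  main (void)
  {
    double res = 0.0;
    struct timespec t0, t1;

    for (int i = 0; i < N; i++)
      p[i] = 1.0 + i * 0.001;          /* all finite, so the branch is not taken */

    clock_gettime (CLOCK_MONOTONIC, &t0);
    for (int j = 0; j < 1000; j++)
      for (int i = 0; i < N; i++)
        {
          double tmp = p[i] * 2.0;     /* use the argument first (latency hiding) */
          if (__builtin_isinf (tmp))   /* the func under test                     */
            res += 3 * sin (2 * tmp);
        }
    clock_gettime (CLOCK_MONOTONIC, &t1);

    /* Print res so the loop cannot be optimized away.  */
    printf ("%lld ns, res = %g\n",
            (long long) (t1.tv_sec - t0.tv_sec) * 1000000000LL
              + (t1.tv_nsec - t0.tv_nsec),
            res);
    return 0;
  }

Compile with -lm. Swapping __builtin_isinf for another variant, or making
some of the inputs infinite so the branch is sometimes taken, is enough to
move the numbers around.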
> 
> So here are the results again for the original test and your 2 tests above:

Also, which gcc are you using? There is a problem that recent gcc has started
optimizing the libc inlines better, but not the builtins.

> > > From this it seems that __isinf_inl is slightly better than the builtin, but
> > > it does not show up as a regression when combined with sin or in the remainder
> > > test.
> > >
> > That doesn't hold generally; in the remainder test it could just be caused
> > by isnan being slower than isinf.
> 
> No, the new isinf/isnan are both faster than the previous versions (some isinf
> calls were inlined as __isinf_ns, but even that one is clearly slower than the
> builtin in all the results). Remember once again that this patch creates new
> inlines that didn't exist before, as well as replacing existing inlines in GLIBC
> with even faster ones. The combination of these means it is simply impossible
> for anything to become slower.
> 
I measured that the isinf libc inline is faster, so how do you explain that?

> > > Well I just confirmed the same gains apply to x64.
> > >
> > No, that doesn't confirm anything yet. You need to do more extensive testing
> > to get a somewhat reliable answer, and even then you won't be sure.
> 
> No, this benchmark does give a very clear and reliable answer: everything
> speeds up by a huge factor.
> 
While that is true, it isn't exactly what we asked. We asked which inline is
the best one. So you need an accurate benchmark that can also measure that,
not a rough one that can only tell you that inlining helps.

> > I asked you to run my benchmark on arm to measure the results of inlining.
> > I attached the version again. You should run it to see how the results differ.
> 
> I did run it but I don't understand what it's supposed to mean, and I can't share
> the results. So do you have something simpler that shows what point you're trying
> to make? Or maybe you could add your own benchmark to GLIBC?
> 
It was previously explained in the thread. Please read it again; I will reply
separately about what's wrong with the v2 benchmark.

