This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

RE: [PATCH v2] Add math-inline benchmark

From: "Wilco Dijkstra" <wdijkstr at arm dot com>
To: 'Ondřej Bílka' <neleai at seznam dot cz>
Cc: "Carlos O'Donell" <carlos at redhat dot com>, "'Siddhesh Poyarekar'" <siddhesh at redhat dot com>, "'GNU C Library'" <libc-alpha at sourceware dot org>
Date: Tue, 21 Jul 2015 18:46:38 +0100
Subject: RE: [PATCH v2] Add math-inline benchmark
Authentication-results: sourceware.org; auth=none
References: <002001d0bfb8$b36fa330$1a4ee990$ at com> <55A907F6 dot 8000504 at redhat dot com> <20150717143406 dot GG19592 at spoyarek dot pnq dot redhat dot com> <002c01d0c30a$8f23c600$ad6b5200$ at com> <55AD4AC6 dot 5070401 at redhat dot com> <20150721113502 dot GA3415 at domone> <003101d0c3d0$b9fb4350$2df1c9f0$ at com> <20150721170416 dot GA2294 at domone>

> Ondřej Bílka wrote:
> On Tue, Jul 21, 2015 at 05:17:23PM +0100, Wilco Dijkstra wrote:

> > Fpclassify uses the builtin, but I think it's better to remove these
> > tests as they don't add much value. It's pretty obvious it is better
> > to use separate tests instead of fpclassify when you only need 2 of them.
> >
> I agree that it's better to remove them.
> 
> However, it isn't obvious that separate tests are better; in fact that
> is wrong. Both should compile to the same code, since gcc should
> optimize fpclassify to something as good or better, but it often
> doesn't. When I looked at the assembly of my benchmark of the noinline
> tests, the performance difference was caused just by a gcc bug. In the
> fpclassify case it incremented and decremented the stack pointer
> without writing anything to the stack.

Sure, optimization makes a difference to these benchmarks; that's unavoidable,
which is why I always carefully check the generated assembly to ensure I
really measure what I wanted to measure.

> And it's easy to find cases where the opposite is true, as this depends on
> how you write the benchmark. For example, if I change my benchmark to use
> the following:
> 
> extern __always_inline int
> __fpclassify_test1 (double d)
> {
>   int cl = fpclassify (d);
>   return (cl == FP_INFINITE) ? 3 * sin (d) :
>          ((cl == FP_NAN) ? 2 * cos (d) : 0) ;
> }
> 
> extern __always_inline int
> __fpclassify_test2 (double d)
> {
>   return (__builtin_isinf (d)) ? 3 * sin (d) :
>          ((__builtin_isnan (d)) ? 2 * cos (d) : 0);
> }
> 
> then fpclassify becomes a lot faster.
> 
>    "__fpclassify_test1_t": {
>     "normal": {
>      "duration": 5.91344e+06,
>      "iterations": 500,
>      "mean": 11826
>     }
>    },
>    "__fpclassify_test2_t": {
>     "normal": {
>      "duration": 7.45868e+06,
>      "iterations": 500,
>      "mean": 14917
>     }
>    },

Well, that doesn't seem correct. Here are my results for the above,
using the latest GCC on modern x64 hardware:

   "__fpclassify_test1_t": {
    "normal": {
     "duration": 1.99627e+06,
     "iterations": 500,
     "mean": 3992
    }
   },
   "__fpclassify_test2_t": {
    "normal": {
     "duration": 1.9893e+06,
     "iterations": 500,
     "mean": 3978
    }
   },
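To put the two sets of figures side by side, here is a minimal sketch that
turns them into a relative difference. The struct and function names are
made up for illustration, and it assumes "mean" is simply duration divided
by iterations, which is consistent with the numbers quoted above.

```c
#include <assert.h>

/* One "normal" entry from the benchmark output quoted above.
   Hypothetical type, for illustration only.  */
struct bench_result
{
  double duration;
  int iterations;
};

/* Mean per iteration, assuming mean = duration / iterations.  */
static double
bench_mean (struct bench_result r)
{
  assert (r.iterations > 0);
  return r.duration / r.iterations;
}

/* How much slower B is than A, in percent (negative if B is faster).  */
static double
slowdown_percent (struct bench_result a, struct bench_result b)
{
  return (bench_mean (b) / bench_mean (a) - 1.0) * 100.0;
}

/* Figures copied from the two result listings in this message.  */
static const struct bench_result quoted_test1 = { 5.91344e+06, 500 };
static const struct bench_result quoted_test2 = { 7.45868e+06, 500 };
static const struct bench_result mine_test1 = { 1.99627e+06, 500 };
static const struct bench_result mine_test2 = { 1.9893e+06, 500 };
```

Calling slowdown_percent (quoted_test1, quoted_test2) gives roughly 26%,
while slowdown_percent (mine_test1, mine_test2) is a fraction of a percent
in the other direction, so in my run the two variants are effectively equal.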

> > > Also you don't test inlines in isnormal:
> > >
> > > > > +/* Explicit inline similar to existing math.h implementation.  */
> > > > > +
> > > > > +#define __isnormal_inl(X) (__fpclassify (X) == FP_NORMAL)
> >
> > This is an inline and exactly what the current math.h does. The goal of my
> > benchmark is comparing the existing implementation with the new inlines.
> >
> Then you should compare the existing implementation with the new inlines.
> Right now, without your patch, both would just call fpclassify.

OK, I'll add a builtin version of fpclassify as well (and use it in
__isnormal_inl2). Then every test has the non-inline version, an internal
GLIBC inline if one exists, the builtin, and the math.h version.
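
As a rough sketch of what the builtin-based variants could look like (the
function names below are hypothetical and not the identifiers used in the
patch; only __builtin_fpclassify, __builtin_isnormal and the math.h isnormal
macro are real):

```c
#include <assert.h>
#include <math.h>

/* math.h version: whatever the installed headers expand isnormal to.  */
static inline int
isnormal_mathh (double x)
{
  return isnormal (x) != 0;
}

/* Builtin fpclassify, mirroring (fpclassify (X) == FP_NORMAL).
   The five leading arguments are the values to return for NaN,
   infinite, normal, subnormal and zero inputs, in that order.  */
static inline int
isnormal_via_builtin_fpclassify (double x)
{
  return __builtin_fpclassify (FP_NAN, FP_INFINITE, FP_NORMAL,
                               FP_SUBNORMAL, FP_ZERO, x) == FP_NORMAL;
}

/* Direct builtin.  */
static inline int
isnormal_builtin (double x)
{
  return __builtin_isnormal (x) != 0;
}

/* Sanity check: all variants must give the same answer for X.  */
static void
check_variants_agree (double x)
{
  int expected = isnormal_mathh (x);
  assert (isnormal_via_builtin_fpclassify (x) == expected);
  assert (isnormal_builtin (x) == expected);
}
```

Running check_variants_agree over a normal value, zero, a subnormal,
infinity and NaN is a cheap way to confirm the variants are equivalent
before comparing their timings.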

Wilco
