This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [v3][PATCH] Framework for performance benchmarking of functions
- From: Siddhesh Poyarekar <siddhesh at redhat dot com>
- To: Richard Henderson <rth at twiddle dot net>
- Cc: Florian Weimer <fweimer at redhat dot com>, libc-alpha at sourceware dot org
- Date: Thu, 21 Feb 2013 10:52:56 +0530
- Subject: Re: [v3][PATCH] Framework for performance benchmarking of functions
- References: <20130108093115.GA27464@spoyarek.pnq.redhat.com><20130111065846.GC16859@spoyarek.pnq.redhat.com><511CA91C.6000306@redhat.com><20130220101719.GA26842@spoyarek.pnq.redhat.com><51257685.8000707@twiddle.net>
On Wed, Feb 20, 2013 at 05:21:09PM -0800, Richard Henderson wrote:
> The 1e9 constant needs to be cast to int64_t, lest this expression
> simply overflow on 32-bit hosts.
Right, thanks for catching that.
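For illustration, a minimal sketch of the kind of fix being discussed. The original expression is not quoted in this message, so the helper name timespec_diff_ns and its exact shape are assumptions; the only point is that the seconds difference is widened to int64_t before it is multiplied by 1e9.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper (not taken from the patch): elapsed nanoseconds
   between two timespec values.  */
static int64_t
timespec_diff_ns (const struct timespec *start, const struct timespec *end)
{
  /* Widen the seconds difference before multiplying, so the product is
     computed in 64 bits even where time_t and long are 32 bits wide.  */
  return ((int64_t) (end->tv_sec - start->tv_sec) * INT64_C (1000000000)
          + (end->tv_nsec - start->tv_nsec));
}
```

Without the cast, an interval of only a few seconds already exceeds what a 32-bit integer can hold once it is expressed in nanoseconds.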
> There are quite a few hosts for which the resolution of this clock
> isn't good enough to make measuring in the inside loop work. Can we
> have a look at what clock_getres returns and perhaps measure the
> outside loop? At least then we'll get *some* sort of answer...
>
> E.g. ARM Cortex A9 can only measure at 1kHz, but Cortex A15-mp can
> measure at the cpu frequency (1.6GHz, reported by getres as 1ns).
OK, I'll do per-call measurements only for CPUs that report a
resolution of 1ns, since with anything worse the faster math functions
won't get meaningful benchmarks.
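A rough sketch of that check, under some assumptions: the function names are hypothetical, and the caller passes in whichever clock the benchmark actually uses (CLOCK_MONOTONIC would be one possible choice, not necessarily the one in the patch).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: true if a reported resolution is 1ns or better.  */
static bool
resolution_allows_per_call (const struct timespec *res)
{
  return res->tv_sec == 0 && res->tv_nsec <= 1;
}

/* Hypothetical helper: ask the clock for its resolution and decide whether
   timing each call individually is worthwhile.  If clock_getres fails,
   fall back to timing the outer loop.  */
static bool
use_per_call_timing (clockid_t clock)
{
  struct timespec res;

  if (clock_getres (clock, &res) != 0)
    return false;
  return resolution_allows_per_call (&res);
}
```

A clock that ticks at 1kHz would report a resolution of 1000000ns here and so fall back to measuring the outer loop.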
I wonder if it makes sense to calibrate the benchmarks based on an
estimate of how long a clock_gettime call takes, perhaps by making
consecutive clock_gettime calls (multiple times) and comparing the
results.
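One possible shape for such a calibration, purely as a sketch: the function name, the sample count and the choice of taking the minimum difference are all assumptions, not something that has been settled.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: estimate the cost of one clock_gettime call, in
   nanoseconds, as the smallest difference seen between two back-to-back
   readings.  Returns -1 on error or if samples is not positive.  */
static int64_t
estimate_clock_overhead_ns (clockid_t clock, int samples)
{
  int64_t best = INT64_MAX;

  if (samples <= 0)
    return -1;

  for (int i = 0; i < samples; i++)
    {
      struct timespec a, b;

      if (clock_gettime (clock, &a) != 0 || clock_gettime (clock, &b) != 0)
        return -1;

      /* Widen before multiplying to avoid overflow on 32-bit hosts.  */
      int64_t delta = ((int64_t) (b.tv_sec - a.tv_sec) * INT64_C (1000000000)
                       + (b.tv_nsec - a.tv_nsec));
      if (delta < best)
        best = delta;
    }
  return best;
}
```

On a clock with coarse resolution the smallest difference may well come out as 0, which would itself be a hint that per-call timing is not going to be meaningful there.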
Thanks,
Siddhesh