This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] benchtests/Makefile: Run the string benchmarks four times by default.
- From: Will Newton <will dot newton at linaro dot org>
- To: Ondřej Bílka <neleai at seznam dot cz>
- Cc: libc-alpha <libc-alpha at sourceware dot org>, Patch Tracking <patches at linaro dot org>
- Date: Thu, 5 Sep 2013 08:51:53 +0100
- Subject: Re: [PATCH] benchtests/Makefile: Run the string benchmarks four times by default.
- References: <52274838 dot 7010902 at linaro dot org> <20130904161743 dot GA10358 at domone dot kolej dot mff dot cuni dot cz> <CANu=DmiVWFijri_iMjFGJEWdTWheHbBFOH8XULURRE8pLMkuLA at mail dot gmail dot com> <20130904165211 dot GA14906 at domone dot kolej dot mff dot cuni dot cz>
On 4 September 2013 17:52, Ondřej Bílka <neleai@seznam.cz> wrote:
> On Wed, Sep 04, 2013 at 05:20:23PM +0100, Will Newton wrote:
>> On 4 September 2013 17:17, Ondřej Bílka <neleai@seznam.cz> wrote:
>> > On Wed, Sep 04, 2013 at 03:48:24PM +0100, Will Newton wrote:
>> >>
>> >> The string benchmarks can be affected by physical page placement, so
>> >> running them multiple times is required to account for this. Also
>> >> back up the results of the previous run, as is done for the other
>> >> benchmarks.
>> >>
>> > You do not need to do this. We should instead randomize addresses used
>> > which handles this problem.
>>
>> That seems like it would be considerably more complicated. Do you have
>> a reason why your approach is better?
>>
> It does not matter if it is complicated; it is required to get
> reasonable results.

How do you define "reasonable results"?

The intention of my patch - which I may not have made completely clear
in the commit message - is to improve test stability. With a physically
indexed cache, the physical pages allocated to the test can
significantly affect performance at large buffer sizes (e.g. cache
size / ways and above), so the same test varies from run to run. My aim
is to average out these differences, as they are hard to control for
without understanding the details of the cache subsystem of the system
you are running on.

Your test appears to address test validity by running a wider range of
buffer alignments, which is an important but separate concern IMO.

--
Will Newton
Toolchain Working Group, Linaro