This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 1/25] Remove nested functions: crypt/md5-crypt.c
- From: OndÅej BÃlka <neleai at seznam dot cz>
- To: "Joseph S. Myers" <joseph at codesourcery dot com>
- Cc: Siddhesh Poyarekar <siddhesh at redhat dot com>, Konstantin Serebryany <konstantin dot s dot serebryany at gmail dot com>, GNU C Library <libc-alpha at sourceware dot org>
- Date: Tue, 20 May 2014 17:20:31 +0200
- Subject: Re: [PATCH 1/25] Remove nested functions: crypt/md5-crypt.c
- Authentication-results: sourceware.org; auth=none
- References: <CAGQ9bdzqT1EyXYMwACrHpPU=vPjM_b72LJRjb7BW_OzJRXG3bw at mail dot gmail dot com> <20140520131314 dot GB14500 at spoyarek dot pnq dot redhat dot com> <CAGQ9bdw=-pzbi00gn5t_W8pXBjU0edFvLGMNUc+o=g9og=f9+Q at mail dot gmail dot com> <20140520144605 dot GE14500 at spoyarek dot pnq dot redhat dot com> <Pine dot LNX dot 4 dot 64 dot 1405201453390 dot 29858 at digraph dot polyomino dot org dot uk>
On Tue, May 20, 2014 at 02:54:41PM +0000, Joseph S. Myers wrote:
> On Tue, 20 May 2014, Siddhesh Poyarekar wrote:
>
> > On Tue, May 20, 2014 at 06:23:40PM +0400, Konstantin Serebryany wrote:
> > > Do you think such a benchmarks is possible at all?
> > > I've made a simple test (attached) and profiled it with "perf":
> > > 85.26% a.out libcrypt-2.15.so [.] _ufc_doit_r
> > > 6.13% a.out libcrypt-2.15.so [.] _ufc_mk_keytab_r
> > > 2.40% a.out libcrypt-2.15.so [.] _ufc_setup_salt_r
> > > 1.56% a.out libcrypt-2.15.so [.] __crypt_r
> > > 1.51% a.out libcrypt-2.15.so [.] _ufc_output_conversion_r
> > > 1.35% a.out libcrypt-2.15.so [.] crypt
> > >
> > > As you can see, crypt_r itself takes a tiny fraction of time,
> > > most of it is spent in _ufc_* which are defined in another module.
> > > Any changes in crypt itself (unless you do something insane) will not
> > > be observable in profile.
> >
> > You'll need to choose inputs so that __md5_crypt_r is called -
> > crypt/crypt-entry.c should help you with that. if __md5_crypt_r takes
> > a tiny fraction of time again, then this change should be safe, but I
> > would prefer that the inputs get fed into the benchtests anyway so
> > that we can automatically track performance of crypt for those inputs
> > in future.
>
> Another thing that can be done is to look at the code generated for the
> relevant file before and after the patch - I'd expect very little change.
>
Yes I wanted to write that benchmarking this change is probably
pointless as gcc should optimize them in same way (if there is
consistently difference in one way we could report that as gcc regression.)
Also perf data show that its a cold function so you should improve
readability not performance (measuring difference is possible but you
need to run that for few days and wait until standard deviation drops
sufficiently to get statistically significant result).
About only way this could make difference is if when it gets inlined in
one way and does not in other.
That is out of scope of microbenchmarks, instruction cache misses easily
could make a function slower in wild but you would measure that inlined
function is faster when you would call it in tight loop.
As actual change is concerned