This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] New module to import and process bench.out


On Tue, Jun 24, 2014 at 10:22:22AM +0100, Will Newton wrote:
> > +def mean(lst):
> > +    """Compute and return mean of numbers in a list
> > +
> > +    The pypy average function has horrible performance, so implement our
> > +    own mean function.
> 
> I'm not sure I really understand this comment. What is the relevance
> of pypy? How bad can the performance actually be? :-/

Sorry, that should be numpy, not pypy.  I guess the comment is not
necessary; I think I just vented my anger there ;) Reimplementing the
mean reduced the computation time from minutes to seconds, so yes, the
difference is significant.  Averaging a simple list gives an idea of
how bad it is:

[15:30][siddhesh@spoyarek tmp ]$ time python <<EOF
import numpy
print(numpy.average(range(10000000)))
EOF
4999999.5

real    0m1.387s
user    0m1.248s
sys     0m0.131s
[15:30][siddhesh@spoyarek tmp ]$ time python <<EOF
l = range(10000000)
EOF

real    0m0.501s
user    0m0.358s
sys     0m0.140s
[15:30][siddhesh@spoyarek tmp ]$ time python <<EOF
l = range(10000000)
print(float(sum(l)) / len(l))
EOF
4999999.5

real    0m0.510s
user    0m0.376s
sys     0m0.129s

That's rather simplistic, but it should give you the idea: the custom
function takes hardly any time beyond what is needed to build and
allocate the list and do other sundry operations within the
interpreter (0.51s against a 0.50s baseline).  numpy, on the other
hand, takes almost 1.4s for the same job.  A random list shows a lot
of variance, presumably because it takes variable time to build the
list, but the custom function comes out on top every time.
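
For anyone who wants to repeat the comparison without the shell
timings, here is a minimal sketch.  The mean() body is the same
sum/len expression used in the run above; the time_call() helper and
the list size are made up for illustration and are not part of the
patch.  The numpy part only runs if numpy is installed, and the
ndarray case is included on the assumption that converting a plain
list to an array accounts for much of numpy's extra cost.

```python
import timeit


def mean(lst):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return float(sum(lst)) / len(lst)


def time_call(func, data, repeat=3):
    """Hypothetical helper: best wall-clock time in seconds of func(data)."""
    return min(timeit.repeat(lambda: func(data), number=1, repeat=repeat))


if __name__ == "__main__":
    # Build the list up front so that only the averaging is timed.
    data = list(range(1000000))
    print("custom mean:              %.4fs" % time_call(mean, data))

    try:
        import numpy
    except ImportError:
        numpy = None

    if numpy is not None:
        # Passing a plain list makes numpy convert it to an array first.
        print("numpy.average on list:    %.4fs"
              % time_call(numpy.average, data))
        # With the conversion done ahead of time, only the reduction is timed.
        arr = numpy.asarray(data)
        print("numpy.average on ndarray: %.4fs"
              % time_call(numpy.average, arr))
```

The absolute numbers will differ from machine to machine, so treat
the output as a rough indication only.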

> > +def compress_timings(points):
> > +    """Club points with close enough values into a single mean value
> > +
> > +    See split_list for details on how the clubbing is done.
> > +
> > +    Args:
> > +        points: The set of points.
> > +    """
> > +    do_for_all_timings(points, split_list)
> > +
> 
> Does it make sense to add these functions without a user? At the
> moment I find it quite difficult to understand what they are doing and
> an example would certainly help.

I thought it might demonstrate a use case, but I guess you're right:
it won't until there is a script in place that actually uses it.
There is already such a script (see the siddhesh/benchmarks branch,
although it may be outdated right now), so I'll just remove this
function from this patch and include it in the patch where I post
that script.

Thanks,
Siddhesh


