On Mon, Jun 09, 2014 at 04:14:35PM +0100, Will Newton wrote:
> > A maximum of 32K only tests arena allocation performance. This is
> > fine for now since malloc+mmap performance is as interesting. What is

<snip>

> There are at least two axes we are interested in - how performance
> scales with the number of threads and how performance scales with
> the allocation size. For thread performance (which this benchmark
> is about) the larger allocations are not so interesting - typically
> their locking overhead is in the kernel rather than userland, and in
> terms of real world application performance it's just not as likely
> to be a bottleneck as small allocations. We have to be pragmatic in
> which choices we make, as the full matrix of threads versus
> allocation sizes would be pretty huge.

Heh, I noticed my typo now - I meant to say that malloc+mmap
performance is *not* as interesting :)

> So I guess I should probably also write a benchmark for allocation
> size for glibc as well...

Yes, it would be a separate benchmark and would probably need some
specific allocation patterns rather than random sizes. Of course,
choosing allocation patterns is not going to be easy.

> > Mark as const.
>
> Ok, although I don't believe it affects code generation.

Right, it's just pedantry.

> > I don't know how useful max_rss would be since we're only doing a
> > malloc and never really writing anything to the allocated memory.
> > Smaller sizes will probably result in actual page allocation since
> > we write to the chunk headers, but probably not so for larger
> > sizes.
>
> Yes, it is slightly problematic. What you probably want to do is
> zero all the memory and measure RSS at that point, but it would slow
> down the benchmark and spend lots of time in memset instead. At the
> moment it tells you how many pages are taken up by book-keeping but
> not how many of those pages your application would touch anyway.

Oh, I didn't mean to imply that we should zero pages and try to get a
more accurate RSS value. My point was that we could probably do away
with it completely because it doesn't really tell us much - I can't
see how the number of pages taken up by book-keeping would be useful.

However, if you do want to show resource usage, then address space
usage (VSZ) might show scary numbers due to the per-thread arenas,
but it would be much more representative. Also, it might be useful
to see how address space usage scales with threads, especially on
32-bit.

> No, I haven't looked into that; so far I have been treating malloc
> as a black box and I'm hoping not to tailor the benchmark too far to
> one implementation or another.

I agree that the benchmark should not be tailored to the current
implementation, but then this behaviour would essentially be another
set of inputs. Simply increasing the maximum size from 32K to about
128K (that's the initial threshold for mmap anyway) might result in
that behaviour being triggered more frequently.

> I'll rework the patches and hopefully get a graphing script to go
> with it...

Thanks! I have marked this patch as Accepted in patchwork, since I
think it could go in as an initial revision of the test with the nits
fixed. So you can push the benchmark and then work on improvements
to it, or you can do your improvements first and post a new version -
your choice.

Siddhesh
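P.S. To make the thread-scaling and 128K points concrete, here is a
rough sketch of the kind of per-thread loop I mean. This is only an
illustration, not the code from your patch, and NUM_THREADS,
ITERATIONS and MAX_SIZE are numbers I made up:

#include <malloc.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_THREADS 4
#define ITERATIONS  1000000
#define MAX_SIZE    (128 * 1024)  /* 128K rather than 32K, as above.  */

static void *
benchmark_thread (void *arg)
{
  unsigned int seed = (unsigned int) (uintptr_t) arg;
  int i;

  for (i = 0; i < ITERATIONS; i++)
    {
      /* Random size in [1, MAX_SIZE].  rand_r keeps the RNG state
         thread-local so the threads do not serialize on it.  */
      size_t size = (size_t) rand_r (&seed) % MAX_SIZE + 1;
      void *ptr = malloc (size);
      if (ptr != NULL)
        free (ptr);
    }
  return NULL;
}

int
main (void)
{
  pthread_t threads[NUM_THREADS];
  int i;

  /* Pin the mmap threshold at its initial 128K default so it does
     not adapt dynamically during the run.  */
  mallopt (M_MMAP_THRESHOLD, 128 * 1024);

  for (i = 0; i < NUM_THREADS; i++)
    pthread_create (&threads[i], NULL, benchmark_thread,
                    (void *) (uintptr_t) (i + 1));
  for (i = 0; i < NUM_THREADS; i++)
    pthread_join (threads[i], NULL);
  return 0;
}

Whether the threshold should be pinned like that or left to glibc's
dynamic adjustment is itself one of the inputs worth thinking about.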
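On the allocation-pattern point, what I mean is replaying a canned
sequence of sizes rather than drawing random ones - something like
the skeleton below, where the sizes are invented and would ideally
come from a trace of a real application:

#include <stdlib.h>

/* A fixed allocation pattern: the benchmark replays this sequence
   instead of drawing random sizes.  These values are made up.  */
static const size_t pattern[] = { 16, 32, 64, 16, 4096, 32, 128, 65536 };
#define PATTERN_LEN (sizeof (pattern) / sizeof (pattern[0]))

int
main (void)
{
  long i;

  for (i = 0; i < 1000000; i++)
    {
      void *ptr = malloc (pattern[i % PATTERN_LEN]);
      if (ptr != NULL)
        free (ptr);
    }
  return 0;
}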
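And on max_rss versus VSZ: sampling both is cheap on Linux. Assuming
the benchmark's max_rss figure comes from getrusage, a sketch like
this would let it report address space usage alongside:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

/* Print peak RSS (via getrusage, in kilobytes on Linux) and current
   address space usage (VmSize from /proc/self/status).  This is
   Linux-specific and the error handling is minimal.  */
static void
print_mem_usage (void)
{
  struct rusage ru;
  FILE *fp;
  char line[256];

  if (getrusage (RUSAGE_SELF, &ru) == 0)
    printf ("max_rss: %ld kB\n", ru.ru_maxrss);

  fp = fopen ("/proc/self/status", "r");
  if (fp == NULL)
    return;
  while (fgets (line, sizeof line, fp) != NULL)
    if (strncmp (line, "VmSize:", 7) == 0)
      fputs (line, stdout);
  fclose (fp);
}

int
main (void)
{
  /* Allocate without touching the memory: VmSize grows while RSS
     should stay mostly flat, which is exactly the distinction that
     matters above.  */
  void *p = malloc (16 * 1024 * 1024);
  print_mem_usage ();
  free (p);
  return 0;
}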
Attachment:
pgpQsCwtUHARB.pgp
Description: PGP signature