This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead

From: Mel Gorman <mgorman at suse dot de>
To: Julian Taylor <jtaylor dot debian at googlemail dot com>
Cc: libc-alpha at sourceware dot org
Date: Fri, 13 Feb 2015 16:29:42 +0000
Subject: Re: [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead
Authentication-results: sourceware.org; auth=none
References: <54DCE9B6 dot 5010701 at googlemail dot com> <20150213131027 dot GA23372 at suse dot de> <54DE1B6A dot 3030608 at googlemail dot com>

On Fri, Feb 13, 2015 at 04:42:34PM +0100, Julian Taylor wrote:
> On 02/13/2015 02:10 PM, Mel Gorman wrote:
> > On Thu, Feb 12, 2015 at 06:58:14PM +0100, Julian Taylor wrote:
> >>> On Mon, Feb 09, 2015 at 03:52:22PM -0500, Carlos O'Donell wrote:
> >>>> On 02/09/2015 09:06 AM, Mel Gorman wrote:
> >>>>> while (data_to_process) {
> >>>>> 	buf = malloc(large_size);
> >>>>> 	do_stuff();
> >>>>> 	free(buf);
> >>>>> }
> >>>>
> >>>> Why isn't the fix to change the application to hoist the
> >>>> malloc out of the loop?
> >>>
> >>> I understand this is impossible for some language idioms (typically
> >>> OOP, and despite my personal belief that this indicates they're bad
> >>> language idioms, I don't want to descend into that type of argument),
> >>> but to me the big question is:
> >>>
> >>> Why, when you have a large buffer -- so large that it can effect
> >>> MADV_DONTNEED or munmap when freed -- are you doing so little with it
> >>> in do_stuff() that the work performed on the buffer doesn't dominate
> >>> the time spent?
> >>>
> >>> This indicates to me that the problem might actually be significant
> >>> over-allocation beyond the size that's actually going to be used. Do
> >>> we have some real-world specific examples of where this is happening?
> >>> If it's poor design in application code and the applications could be
> >>> corrected, I think we should consider whether the right fix is on the
> >>> application side.
> >>>
> >>
> >>
> >> I also ran into this issue numerous times, also filed a bug:
> >> https://sourceware.org/bugzilla/show_bug.cgi?id=17195
> >>
> > 
> > Thanks for pointing that out. I read the report and the original report
> > and do not understand why it was considered a duplicate. They are
> > completely different issues.
> > 
> >> As a real world example I have higher level numerical software.
> >> E.g. in python numpy you write code like this:
> >> a = b+ c +d
> >> where these are large arrays due to limitations of the library and
> >> python this involves allocating multiple large arrays while the
> >> operations on the memory itself is very small.
> > 
> > Is there any chance you could supply a simple test case in python for
> > this? Your description is straight-forward and I suspect the resulting
> > script will be just a few lines long but I want to be sure I see the
> > same problem.
> 
> sure, you easily can construct many cases where you see this problem
> with python numpy, e.g. a particular bad one that caused me to file the bug:
> 
> import numpy as np
> def f():
>     d = np.arange(1000000.) / 2
>     d[::10] = np.nan
>     c2 = ~np.isnan(d)
>     for needle in range(1000):
>         d[c2]
> 
> import threading
> t = [threading.Thread(target=f) for x in range(2)]
> for x in t:
>     x.start()
> for x in t:
>     x.join()
> 

Thanks. I confirmed locally that this calls madvise 2000 times a second
and spends a bit over 50% of the time in the kernel. This is not an
unreasonable code pattern in an application. I don't have a test machine
available to confirm that V2 potentially fixes the problem but based on
strace, I'm expecting that export MALLOC_TRIM_THRESHOLD_=10485760 would
avoid calling madvise repeatedly. The default behaviour will still suck
but at least there would be a tuning option.
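
For anyone who wants to try the tuning from a script rather than a shell,
something along these lines should do. It is only a sketch: the helper
names (env_with_trim_threshold, run_tuned) are made up for illustration,
and the 10485760 value is simply the one quoted above. As far as I know
glibc reads the variable when malloc initialises, so it has to be in the
environment before the process starts; setting os.environ from inside an
already running interpreter would not change that process.

```python
import os
import subprocess
import sys

# Value quoted in the message above: 10 MiB expressed in bytes.
TRIM_THRESHOLD_BYTES = 10 * 1024 * 1024


def env_with_trim_threshold(threshold=TRIM_THRESHOLD_BYTES, base=None):
    """Return a copy of the environment with MALLOC_TRIM_THRESHOLD_ set.

    Hypothetical helper; the caller's own environment is left untouched.
    """
    env = dict(os.environ if base is None else base)
    env["MALLOC_TRIM_THRESHOLD_"] = str(threshold)
    return env


def run_tuned(argv, threshold=TRIM_THRESHOLD_BYTES):
    """Run argv as a child process with the tunable exported.

    Hypothetical helper; raises CalledProcessError if the child fails.
    """
    return subprocess.run(
        argv,
        env=env_with_trim_threshold(threshold),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python run_tuned.py python repro.py
    result = run_tuned(sys.argv[1:])
    sys.stdout.write(result.stdout)
```

Running the reproducer this way and again without the variable, each under
strace -c, should show whether the madvise count actually drops.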

> <SNIP>
> 
> The minimal openmp case in the bug #17195 is reduced from this python
> testcase.
> 

Indeed. The test case in the changelog is also a variant of what's in
bug 17195. In essence, it's the problem that ebizzy also hits. There is
no problem reproducing this.

> > 
> > Alternatively, would you be in the position to test v2 of this patch and
> > see if the performance of your application can be addressed by tuning trim
> > threshold to a high value?
> > 
> 
> I can give it a try. Though the openmp testcase from the bug should be
> the same problem and you can hopefully try that yourself.

I easily can but the first version got hit with the "does anything in
the real world care?" hammer. Any artificial test case is vulnerable to
the same feedback. The python snippet is also artificial but it's a bit
harder to wave away as being either unreasonable code or fixable through
other means.

Thanks.

-- 
Mel Gorman
SUSE Labs

Follow-Ups:
- Re: [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead
  - From: Julian Taylor

References:
- Re: [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead
  - From: Julian Taylor
- Re: [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead
  - From: Mel Gorman
- Re: [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead
  - From: Julian Taylor
