This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: Support for Intel X1000
- From: "Kinsella, Ray" <ray dot kinsella at intel dot com>
- To: "dalias at libc dot org" <dalias at libc dot org>
- Cc: "carlos at redhat dot com" <carlos at redhat dot com>, "fweimer at redhat dot com" <fweimer at redhat dot com>, "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>
- Date: Wed, 20 May 2015 19:28:24 +0000
- Subject: Re: Support for Intel X1000
Ignore this - the fault was being generated by the runtime linker on the
getchar() call.
Ray K
On Wed, 2015-05-20 at 19:14 +0000, Kinsella, Ray wrote:
> On Wed, 2015-05-20 at 10:15 -0400, dalias@libc.org wrote:
> > On Wed, May 20, 2015 at 01:54:13PM +0000, Kinsella, Ray wrote:
> >
> > If this is true, it's a bug in the implementation of mlockall. The
> > whole point of memory locking is to prevent the need to allocate
> > memory at page fault time, which matters for multiple reasons,
> > including at least:
> >
> > 1. Real-time applications that can't accept the possible latency.
> > 2. Reasons related to commit charge/overcommit/OOM-killer.
>
> CoW pages may be a little different, as they are already resident for
> read, with the page marked read-only. The fault is then generated on
> write; on the fault, a copy of the page is made and the process's
> mapping is then marked rw.
>
> I wrote a trivial test with a single variable in the data segment: I
> measure the process's page faults, write to the variable, and measure
> the page faults again.
>
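> #include <stdio.h>      /* getchar */
> #include <sys/mman.h>   /* mlockall, MCL_CURRENT */
>
> static int volatile indicator = 0;
>
> int main(int argc, char **argv) {
>
>     /* Lock all currently mapped pages into memory. */
>     mlockall(MCL_CURRENT);
>
>     getchar();      /* pause: sample the fault counters */
>
>     indicator = 1;  /* write to the variable */
>
>     getchar();      /* pause: sample the fault counters again */
>
>     return 0;
> }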
>
> [rkinsell~]$ ps -o min_flt,maj_flt `pidof test`
> MINFL MAJFL
> 70 0
> [rkinsell~]$ ps -o min_flt,maj_flt `pidof test`
> MINFL MAJFL
> 71 0
>
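As a variation on the test quoted above, the fault counter can be sampled from inside the process, immediately before and after the write. This is only a sketch: the helper names minor_faults() and faults_on_first_write() are made up for illustration, and it reads the counter with getrusage() rather than ps. Because no stdio call sits between the two samples, it should keep faults from lazy symbol resolution out of the measurement.

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

static volatile int indicator = 0;

/* Hypothetical helper: minor fault count of this process, or -1 on error. */
static long minor_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/* Hypothetical helper: lock the current pages, write to the variable once,
 * and return how many minor faults were counted across that write. */
static long faults_on_first_write(void)
{
    /* mlockall may fail (e.g. RLIMIT_MEMLOCK); the measurement still runs. */
    if (mlockall(MCL_CURRENT) != 0)
        perror("mlockall");

    long before = minor_faults();   /* getrusage is resolved by this call */
    indicator = 1;                  /* the write under test */
    long after = minor_faults();

    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Calling faults_on_first_write() from main() and printing the result gives the same before/after comparison as the two ps invocations, without pausing on getchar().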
> > > It clobbers the process state in an unrecoverable way. The
> > > problem doesn't affect the kernel, as the kernel doesn't page fault.
> >
> > In that case I think mlockall (with the above bug fixed) is a viable
> > kernel-side workaround.
>
> I think you have the right idea here. The best approach may be to flag
> the bug in the kernel with set_cpu_bug(c, X86_BUG_LOCK);
>
> and then automatically pre-populate the pages when CoW pages are
> requested (i.e. trigger the fault up-front).
>
> Ray K
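
To illustrate what "trigger the fault up-front" means, here is a user-space sketch, not the kernel-side change discussed above. The helper name prefault_writable() is hypothetical. It writes each page's own first byte back to it, so any copy-on-write fault for that page is taken at that point and the contents are left unchanged.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: touch every page in [addr, addr + len) with a write
 * so that any copy-on-write fault is taken now instead of later.
 * The contents are left unchanged. Returns 0 on success, -1 on bad input. */
static int prefault_writable(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = addr;

    if (addr == NULL || page <= 0)
        return -1;

    /* One read-then-write per page; volatile keeps both accesses. */
    for (size_t off = 0; off < len; off += (size_t)page)
        p[off] = p[off];

    /* addr may not be page aligned, so the last byte can sit on a page
     * the loop did not reach. */
    if (len > 0)
        p[len - 1] = p[len - 1];

    return 0;
}
```

This only moves the faults to a point the program chooses; whether that is enough for the case discussed in this thread is not something the sketch establishes.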