Bug 3458 - posix_madvise(addr, len, POSIX_MADV_DONTNEED) discards data
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Ulrich Drepper
Reported: 2006-11-04 10:28 UTC by Nicholas Miell
Modified: 2016-05-08 14:12 UTC

Last reconfirmed: 2007-02-18 13:56:07
fweimer: security-


Description Nicholas Miell 2006-11-04 10:28:54 UTC
POSIX describes the POSIX_MADV_DONTNEED parameter to posix_madvise as follows:

POSIX_MADV_DONTNEED
   Specifies that the application expects that it will not access the specified
range in the near future.

Linux describes and implements the MADV_DONTNEED parameter to madvise as follows:

MADV_DONTNEED
   Do not expect access in the near future.  (For the time being, the
application is finished with the given range, so the kernel can free resources
associated with it.) Subsequent accesses of pages in this range will succeed,
but will result either in re-loading of the memory contents from the underlying
mapped file (see mmap()) or zero-fill-on-demand pages for mappings without an
underlying file.

glibc transparently forwards calls to posix_madvise to madvise, which means that
POSIX-conformant applications that use posix_madvise(addr, len,
POSIX_MADV_DONTNEED) will corrupt data.
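
The data loss is easy to reproduce. The following is a minimal sketch, assuming
Linux with glibc; it calls madvise directly so that the result does not depend
on how a particular glibc version implements posix_madvise. The helper name
byte_after_dontneed is made up for this illustration.

```c
#define _DEFAULT_SOURCE   /* exposes madvise() and MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: stores 42 in a private anonymous page, applies
   MADV_DONTNEED, and returns the byte read back afterwards.
   Returns -1 if mmap or madvise fails. */
int byte_after_dontneed(void)
{
    size_t len = (size_t) sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 42;                          /* dirty the page */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    int value = p[0];                   /* faults in a fresh page */
    munmap(p, len);
    return value;
}
```

On Linux the function returns 0 rather than 42: the stored byte is gone and the
page has been replaced by a zero-filled one.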

Suggested fix: Implement posix_madvise as a small wrapper around madvise which
silently discards all calls using POSIX_MADV_DONTNEED, fails for values other
than POSIX_MADV_*, and forwards the remainder.
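
A wrapper along the lines suggested above might look like the following sketch.
The name safe_posix_madvise is hypothetical, and this is not the code that was
committed to glibc; it assumes a Linux system whose <sys/mman.h> defines both
the POSIX_MADV_* and MADV_* constants. Like posix_madvise, it returns an error
number instead of setting errno.

```c
#define _DEFAULT_SOURCE   /* exposes madvise() and MADV_* on glibc */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrapper, for illustration only. */
int safe_posix_madvise(void *addr, size_t len, int advice)
{
    int native;

    switch (advice) {
    case POSIX_MADV_DONTNEED:
        /* Linux MADV_DONTNEED discards data, so ignore the hint. */
        return 0;
    case POSIX_MADV_NORMAL:
        native = MADV_NORMAL;
        break;
    case POSIX_MADV_RANDOM:
        native = MADV_RANDOM;
        break;
    case POSIX_MADV_SEQUENTIAL:
        native = MADV_SEQUENTIAL;
        break;
    case POSIX_MADV_WILLNEED:
        native = MADV_WILLNEED;
        break;
    default:
        /* Anything that is not a POSIX_MADV_* value is rejected. */
        return EINVAL;
    }

    /* madvise reports failure through errno; posix_madvise-style
       functions return the error number directly. */
    if (madvise(addr, len, native) != 0)
        return errno;
    return 0;
}
```

With this wrapper, a POSIX_MADV_DONTNEED call on a dirty page succeeds and
leaves the page contents untouched, while Linux-only advice values have to go
through madvise itself.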
Comment 1 Nicholas Miell 2006-11-05 05:45:11 UTC
Alternate suggestion: rename POSIX_MADV_DONTNEED to POSIX_MADV_DISCARD_NP
(keeping the same value), and add a new POSIX_MADV_DONTNEED which is silently ignored.
Comment 2 Ulrich Drepper 2007-02-17 09:06:49 UTC
MADV_DONTNEED is nowadays described like this:

 *  MADV_DONTNEED - the application is finished with the given range,
 *              so the kernel can free resources associated with it.

Where does your second part of the description come from?  There has been
discussion about this on lkml but I didn't follow it.  What is the outcome?  Is
what you say indeed true and will remain true?
Comment 3 Nicholas Miell 2007-02-17 20:35:45 UTC
The second part of the description comes from the man-pages manual of madvise(2).

The kernel comment is as follows (from linux/mm/madvise.c):

/*
 * Application no longer needs these pages.  If the pages are dirty,
 * it's OK to just throw them away.  The app will be more careful about
 * data it wants to keep.  Be sure to free swap resources too.  The
 * zap_page_range call sets things up for refill_inactive to actually free
 * these pages later if no one else has touched them in the meantime,
 * although we could add these pages to a global reuse list for
 * refill_inactive to pick up before reclaiming other pages.
 *
 * NB: This interface discards data rather than pushes it out to swap,
 * as some implementations do.  This has performance implications for
 * applications like large transactional databases which want to discard
 * pages in anonymous maps after committing to backing store the data
 * that was kept in them.  There is no reason to write this data out to
 * the swap area if the application is discarding it.
 *
 * An interface that causes the system to free clean pages and flush
 * dirty pages is already available as msync(MS_INVALIDATE).
 */
static long madvise_dontneed(struct vm_area_struct * vma,
                             struct vm_area_struct ** prev,
                             unsigned long start, unsigned long end)


and my reading of the implementation of madvise_dontneed() is that the comment
is accurate.
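
The same discarding applies to dirty pages of a private file mapping, which is
the "re-loading from the underlying mapped file" case in the man page text
quoted earlier. A small sketch, assuming Linux and a usable temporary
directory (the helper name byte_after_dontneed_file is made up for this
example):

```c
#define _DEFAULT_SOURCE   /* for madvise(), pwrite(), ftruncate(), fileno() */
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: the file holds 'A', the private mapping is
   overwritten with 'B', then MADV_DONTNEED is applied. Returns the byte
   read back through the mapping, or -1 if any setup step fails. */
int byte_after_dontneed_file(void)
{
    int value = -1;
    size_t len = (size_t) sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    int fd = fileno(f);

    if (ftruncate(fd, (off_t) len) == 0 && pwrite(fd, "A", 1, 0) == 1) {
        unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            p[0] = 'B';                 /* dirty, private copy of the page */
            if (madvise(p, len, MADV_DONTNEED) == 0)
                value = p[0];           /* page is re-read from the file */
            munmap(p, len);
        }
    }
    fclose(f);
    return value;
}
```

On Linux this returns 'A': the modified private copy is thrown away and the
next access reloads the page from the file.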

I have no knowledge of any lkml discussions, but my quick search turned up
http://lkml.org/lkml/2006/1/16/105 -- which doesn't seem to have gone anywhere.
Comment 4 Ulrich Drepper 2007-02-21 19:04:12 UTC
I've added code to ignore POSIX_MADV_DONTNEED for now.  I'm not going to add a
new POSIX_MADV_ value.  It's non-standard anyway, so people can use madvise.
Comment 6 Sourceware Commits 2007-07-12 14:56:52 UTC
Subject: Bug 3458

CVSROOT:	/cvs/glibc
Module name:	libc
Branch: 	glibc-2_5-branch
Changes by:	jakub@sourceware.org	2007-07-12 14:56:42

Modified files:
	.              : ChangeLog 
	sysdeps/unix/sysv/linux: syscalls.list 
Added files:
	sysdeps/unix/sysv/linux: posix_madvise.c 

Log message:
	2007-02-21  Ulrich Drepper  <drepper@redhat.com>
	
	[BZ #3458]
	* sysdeps/unix/sysv/linux/posix_madvise.c: New file.
	* sysdeps/unix/sysv/linux/syscalls.list: Remove posix_madvise entry.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/ChangeLog.diff?cvsroot=glibc&only_with_tag=glibc-2_5-branch&r1=1.10362.2.47&r2=1.10362.2.48
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/sysdeps/unix/sysv/linux/posix_madvise.c.diff?cvsroot=glibc&only_with_tag=glibc-2_5-branch&r1=NONE&r2=1.2.6.1
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/sysdeps/unix/sysv/linux/syscalls.list.diff?cvsroot=glibc&only_with_tag=glibc-2_5-branch&r1=1.127&r2=1.127.2.1