This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 4/4] Remove broken posix_fallocate, posix_falllocate64 fallback code [BZ#15661]
- From: Florian Weimer <fweimer at redhat dot com>
- To: Mark Wielaard <mjw at redhat dot com>
- Cc: Roland McGrath <roland at hack dot frob dot com>, Rich Felker <dalias at libc dot org>, Paul Eggert <eggert at cs dot ucla dot edu>, libc-alpha at sourceware dot org, "Carlos O'Donell" <carlos at redhat dot com>
- Date: Tue, 26 May 2015 11:12:15 +0200
- Subject: Re: [PATCH 4/4] Remove broken posix_fallocate, posix_falllocate64 fallback code [BZ#15661]
- References: <20150424134516 dot 6795441F484D0 at oldenburg dot str dot redhat dot com> <554927F9 dot 7080509 at redhat dot com> <5549C097 dot 50505 at redhat dot com> <554A9A46 dot 2050806 at cs dot ucla dot edu> <20150506233055 dot GQ17573 at brightrain dot aerifal dot cx> <20150507181942 dot E71202C3B93 at topped-with-meat dot com> <554BB77D dot 1040805 at redhat dot com> <5559C23A dot 4070406 at redhat dot com> <1432289891 dot 4538 dot 47 dot camel at bordewijk dot wildebeest dot org>
On 05/22/2015 12:18 PM, Mark Wielaard wrote:
> On Mon, 2015-05-18 at 12:43 +0200, Florian Weimer wrote:
>> Another very recent example is here:
>>
>> https://lists.fedorahosted.org/pipermail/elfutils-devel/2015-May/004868.html
>>
>>> This suggests that people actually rely on the current allocation
>>> behavior. Combined with my previous analysis that applications will
>>> start to fail if we remove the fallback and return EINVAL, I now think
>>> we need to keep the allocation loop.
>
> I should point out that the above patch isn't in elfutils yet. It is
> waiting on how this discussion turns out.
>
> At the moment we simply use ftruncate. The problem that is solved by
> using posix_fallocate is that we are about to write to the memory of an
> mmapped file. Since ftruncate doesn't guarantee that the backing store
> is really there we risk getting a SIGBUS if the disk is full and we
> write to a memory area that hasn't been allocated yet. Since this is in
> library code, we cannot simply catch the SIGBUS. And we cannot use
> fallocate since that doesn't guarantee that the backing store is really
> allocated since it depends on whether the underlying file system supports
> fallocate.
posix_fallocate does not guarantee this, either. See my patch with the
documentation update. Compression, COW, and thin provisioning can all
result in ENOSPC. And obviously, there can be other I/O errors.
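To illustrate the point, here is a minimal sketch (not code from elfutils or glibc) of treating posix_fallocate as a best-effort reservation and then writing through the descriptor, so that a later ENOSPC or EIO comes back as an error code rather than as SIGBUS on a mapped page. The helper names reserve_space and write_all are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: ask the file system to reserve len bytes at
   offset 0 of fd.  Returns 0 or an error number.  Success is only a
   hint: with compression, COW or thin provisioning a later write can
   still fail.  */
int reserve_space(int fd, off_t len)
{
    assert(len > 0);
    int err;
    do
        /* posix_fallocate returns the error number directly and
           does not set errno.  */
        err = posix_fallocate(fd, 0, len);
    while (err == EINTR);
    return err;
}

/* Hypothetical helper: write len bytes at offset off through the
   descriptor.  Returns 0 or an error number such as ENOSPC or EIO,
   which the caller can handle without a signal handler.  */
int write_all(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return errno;
        }
        if (n == 0)
            return EIO;   /* no progress; reported as a generic I/O error */
        p += n;
        len -= (size_t) n;
        off += n;
    }
    return 0;
}
```

The trade-off is that this gives up the convenience of mmap; it only shows where the errors surface when the writes go through write-style calls.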
If you absolutely, truly need to use mmap, we either have to
provide a way to intercept SIGBUS (perhaps à la SHE; the technology is
there, it's just not available to C code in a deeply nested library
right now), or another mmap flag that prevents the kernel from sending
SIGBUS, and some way to tell if a mapping had been subject to write
errors (perhaps revive msync(MS_ASYNC)?).
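For reference, a rough sketch of what intercepting SIGBUS looks like in plain C today, and why it is a poor fit for a nested library: the handler and the jump buffer are process-wide. The helper name touch_mapping is hypothetical, and the missing backing store is simulated by mapping past the end of the file, since a real ENOSPC is hard to reproduce on demand.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Process-wide jump target.  This is the core problem for library
   code: it is not thread-safe, and installing the handler overrides
   the application's own SIGBUS disposition while the guard is active.  */
static sigjmp_buf bus_env;

static void on_sigbus(int sig)
{
    (void) sig;
    siglongjmp(bus_env, 1);
}

/* Hypothetical helper: size the file at path to file_len with
   ftruncate, map map_len bytes of it, and store one byte into every
   mapped page.  Returns 0 on success, 1 if a store raised SIGBUS,
   -1 on setup errors.  */
int touch_mapping(const char *path, size_t file_len, size_t map_len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) file_len) != 0) {
        close(fd);
        return -1;
    }
    void *map = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid */
    if (map == MAP_FAILED)
        return -1;

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0) {
        munmap(map, map_len);
        return -1;
    }

    volatile unsigned char *p = map;
    long page = sysconf(_SC_PAGESIZE);
    int rc;
    /* Save the signal mask so SIGBUS is unblocked again after the jump.  */
    if (sigsetjmp(bus_env, 1) == 0) {
        for (size_t off = 0; off < map_len; off += (size_t) page)
            p[off] = 0;             /* may fault if the page has no backing */
        rc = 0;
    } else {
        rc = 1;                     /* arrived here via siglongjmp */
    }

    sigaction(SIGBUS, &old, NULL);  /* restore the previous disposition */
    munmap(map, map_len);
    return rc;
}
```

Even in this small form, the guard only works if no other thread touches SIGBUS handling at the same time, which a library cannot assume.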
> As far as I know posix_fallocate is the only way to guarantee
> that the file is fully allocated without spurious failures depending on
> where the file resides in the file system. And posix_fallocate has been
> available since forever with this functionality.
Thin provisioning etc. has become more common lately, so while
posix_fallocate did provide this functionality in the past, it no longer
seems to do so.
--
Florian Weimer / Red Hat Product Security