Performance optimization in av::fixup - use buffered IO, not mapped file

Eric Blake eblake@redhat.com
Wed Dec 12 14:04:00 GMT 2012


On 12/12/2012 06:22 AM, Corinna Vinschen wrote:
> On Dec 12 06:11, Eric Blake wrote:
>> On 12/11/2012 08:13 PM, Daniel Colascione wrote:
>>> Anyway, the binary is sparse because our linker produces sparse files.
>>>
>>> Would the Cygwin developers accept this patch? With it, applications would need
>>> to explicitly use ftruncate to make files sparse.
>>
>> Eww.  That would be a regression for coreutils, [...]
> 
> 
> Really?  How so?

When using 'cp --sparse=always', coreutils relies on lseek() to create
sparse files.  Removing this code from cygwin would mean that coreutils
would have to be rewritten to explicitly ftruncate() instead of lseek()
when creating sparse files.
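
For reference, a minimal sketch of the lseek() pattern in question, not
the actual coreutils code.  The helper name write_after_hole() and its
parameters are made up for illustration.  It seeks past end-of-file and
then writes, leaving a gap that reads back as zeros; whether that gap is
stored as a hole depends on the file system and on the C library.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from coreutils): create `path`, skip `hole`
 * bytes by seeking past end-of-file, then write `len` bytes of `data`.
 * Returns the resulting file size, or -1 on error. */
off_t write_after_hole(const char *path, off_t hole,
                       const char *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Seeking beyond EOF writes nothing for the skipped range. */
    if (lseek(fd, hole, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    /* The write extends the file; the skipped range becomes a gap. */
    ssize_t n = write(fd, data, len);
    if (n < 0 || (size_t)n != len) {
        close(fd);
        return -1;
    }

    struct stat st;
    int rc = fstat(fd, &st);
    close(fd);
    return rc == 0 ? st.st_size : -1;
}
```

Calling write_after_hole("out.bin", 1 << 20, "tail", 4) yields a file
whose size is 1 MiB plus 4 bytes.  Comparing st_blocks against st_size
afterwards shows whether the gap was actually stored sparsely.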

> 
>>> Considering the horrible and
>>> unexpected performance implications of sparse files, I don't think generating
>>> them automatically from a sequence of seeks and writes is the right thing to do.
>>
>> Why can't we instead use posix_fallocate() as a means of identifying a
>> file that must not be sparse, and then just patch the compiler to use
>> posix_fallocate() to never generate a sparse executable (but let all
>> other sparse files continue to behave as normal)?
> 
> posix_fallocate is not allowed to generate sparse files, due to the
> following restriction:
> 
>   "If posix_fallocate() returns successfully, subsequent writes to the
>   specified file data shall not fail due to the lack of free space on
>   the file system storage media."
> 
> See
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_fallocate.html
> 
> Therefore only ftruncate and lseek potentially generate sparse files.
> 
> On second thought, I don't quite understand what you mean by "use
> posix_fallocate() as a means of identifying a file that must not be
> sparse".  Can you explain, please?

Since we know that an executable must NOT be sparse for the Windows
loader to handle it efficiently, gcc should use posix_fallocate() to
guarantee that the file is NOT sparse, even if it then issues a
sequence of lseek() calls that would otherwise make the file sparse.
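
A rough sketch of what that could look like in a tool that writes an
executable.  The helper name create_preallocated() is hypothetical, and
this is not a patch against gcc or binutils.  It assumes the final
output size is known before the first write.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` and reserve `size` bytes of
 * storage up front, so that later lseek()+write() calls land in
 * space that is already allocated.
 * Returns 0 on success, or an errno-style error number. */
int create_preallocated(const char *path, off_t size)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return errno;

    /* posix_fallocate() returns the error number directly and does
     * not set errno. */
    int err = posix_fallocate(fd, 0, size);

    if (close(fd) != 0 && err == 0)
        err = errno;
    return err;
}
```

After a successful call the file already has its full length, so the
seeks that follow stay inside the allocated range rather than extending
the file past end-of-file.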

In other words, I'm proposing that we delete nothing from cygwin1.dll,
and instead fix the problem apps (gcc, emacs unexec) that actually
create executables, so that the files they create are non-sparse because
we have proven that they should not be sparse for performance reasons.
Meanwhile, all non-executable files (such as virtual machine disk
images, which are typically much bigger than executables, and where
being sparse really does matter) do not have to jump through extra hoops
of using ftruncate() when plain lseek() would do to keep them sparse.

Oh, and while I'm thinking about it, it would be nice to copy Linux's
fallocate(FALLOC_FL_PUNCH_HOLE) for punching holes into already-existing
files.  Right now a hole can only be created while building a file
sequentially, at the point where the file size is extended.
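
For comparison, a small sketch of the Linux call.  This is
Linux-specific; the wrapper name punch_hole() is made up for
illustration, and nothing here describes an existing Cygwin interface.
On Linux, FALLOC_FL_PUNCH_HOLE must be combined with
FALLOC_FL_KEEP_SIZE, so the file size is unchanged and the punched
range reads back as zeros.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for fallocate() in glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>       /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: deallocate `len` bytes starting at `offset`
 * in an existing file without changing its size.
 * Returns 0 on success, or an errno value (for example EOPNOTSUPP
 * when the file system cannot punch holes). */
int punch_hole(const char *path, off_t offset, off_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return errno;

    int err = 0;
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  offset, len) != 0)
        err = errno;

    close(fd);
    return err;
}
```

Not every file system supports this, so callers need to treat
EOPNOTSUPP as an expected outcome.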

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


More information about the Cygwin-developers mailing list