Performance optimization in av::fixup - use buffered IO, not mapped file

Ryan Johnson
Wed Dec 12 20:09:00 GMT 2012

On 12/12/2012 2:21 PM, Christopher Faylor wrote:
> On Wed, Dec 12, 2012 at 12:11:46PM -0500, Ryan Johnson wrote:
>> On 12/12/2012 12:03 PM, Christopher Faylor wrote:
>>> On Tue, Dec 11, 2012 at 07:13:04PM -0800, Daniel Colascione wrote:
>>>> On 12/11/2012 5:06 PM, Daniel Colascione wrote:
>>>>> On 12/10/2012 7:51 PM, Daniel Colascione wrote:
>>>>>> The key to generating a binary that repros the problem is to unexec emacs, then
>>>>>> try to repro with that generated binary, not a copy of it.
>>>>> The real explanation is a lot simpler: the binary is sparse. When you create a
>>>>> file mapping object for a sparse file, Windows discards all cached pages for
>>>>> that file. It makes sense that compilers (and Emacs unexec) would create sparse
>>>>> files as they seek around inside their outputs.
>>>> Anyway, the binary is sparse because our linker produces sparse files.
>>>> Would the Cygwin developers accept this patch? With it, applications would need
>>>> to explicitly use ftruncate to make files sparse. Considering the horrible and
>>>> unexpected performance implications of sparse files, I don't think generating
>>>> them automatically from a sequence of seeks and writes is the right thing to do.
>>> I don't know if this was already done (don't see it in a quick glance at
>>> the archives) but, if this is just a simple case of executable files
>>> being sparse, it seems like an obvious optimization would be just to
>>> do, e.g.,
>>> cp --sparse=never -p foo.exe foo.exe.tmp
>>> mv foo.exe.tmp foo.exe
>>> Wouldn't that remove the sparseness and wouldn't you see astounding
>>> performance improvements as a result?
>> Nope. You'd have to rm foo.exe first.
> "mv" does that automatically, doesn't it?
Oops. Thinko. I had "cp" in my head.

>> Doing so fixes the problem nicely, though, as you suggest.
>>> I don't think we should be considering ripping code out of Cygwin
>>> without some actual data to back up claims.  Testing something like the
>>> above should make it easier to justify.
>>> I'm actually rather surprised that setup.exe's tar code would maintain an
>>> executable's sparseness.
>> Setup is fine. It's home-brew stuff that suffers, unless/until invoking
>> `make install' copies the sparse file to its final destination, losing
>> the sparse property along the way.
>> Personally, I'm still in shock that the loader barfs so badly over
>> sparse files... normal reads via mmap and fread use the fs cache just fine.
> If we're talking about the loader then why are the examples I asked for
> using "cp"?  That's not really apples-to-apples.
I think I'm miscommunicating something here... let me try again.


Attempts to execute a file with the NTFS "sparse" attribute force the 
Windows loader to bypass the fs cache and fetch the file from disk. Long 
start times result, and are a pain if the executable is invoked 
frequently. The problem gets worse if you fill in all the holes, because 
it's still "sparse" but now has more bytes on disk to fetch. This is 
arguably a bug in Windows.

Steps to repro:

1. Arrange for the creation of a sparse file
2. Execute said file, and enjoy the delay while the loader bypasses the 
file cache to get the data directly from disk

"cp" with its "--sparse" option is merely an easy way to accomplish step 
#1. Step #2 is where all the fun happens.
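
For anyone without a sparse binary at hand, step #1 can be roughly sketched with a seek-then-write and a size comparison. This is only an illustration, assuming GNU coreutils (dd, stat) and awk; the helper names make_hole_file and sizes are made up for this example. Note that allocated size only reveals holes. It says nothing about the NTFS "sparse" attribute itself, which on Windows would have to be queried separately (e.g. with "fsutil sparse queryflag").

```shell
#!/bin/sh
# Hypothetical helper: create a 16 MiB file by seeking past the start and
# writing a single byte at the very end, i.e. an lseek/write pair that
# leaves a large hole in front of the data.
make_hole_file() {
    rm -f "$1"
    dd if=/dev/zero of="$1" bs=1 count=1 seek=16777215 2>/dev/null
}

# Hypothetical helper: print "apparent allocated" sizes in bytes.
# GNU stat: %s = size in bytes, %b = blocks allocated, %B = bytes per block.
sizes() {
    stat -c '%s %b %B' "$1" | awk '{ print $1, $2 * $3 }'
}

demo=$(mktemp)
make_hole_file "$demo"
# When the allocated figure is far below the apparent one, the file has holes.
sizes "$demo"
rm -f "$demo"
```

Whether the file system actually leaves the hole unallocated depends on the platform, so treat the second number as a hint rather than proof.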

STC attached. Output on my machine is below.

Workarounds: copy the file to strip the flag, disable the sparse file 
optimization in lseek(), or replace lseek/write pairs that write out 
executable files with calls to pwrite (which apparently lacks the 

$ ./ 2>&1 | grep '\(real\|sparse\)'
non-sparse original runs quickly
real    0m0.078s
sparse copy can be read quickly from fs cache
real    0m0.036s
sparse copy slow-to-run in spite of being cached
real    0m2.969s
sparse copy no longer fully cached
real    0m0.911s
filling all holes makes sparse penalty worse
real    0m5.289s
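
The copy workaround can be sketched as a small helper built on the cp/mv pair suggested earlier in the thread. The function name desparsify and the temporary file name are invented for this example, and it assumes GNU cp. The copy goes to a new name on purpose: overwriting the existing file in place, as the last step of the test case does, fills the holes but leaves the file flagged as sparse.

```shell
#!/bin/sh
# Hypothetical helper: replace FILE with a freshly created, fully
# allocated copy of itself.
desparsify() {
    tmp="$1.tmp.$$"
    # --sparse=never writes every byte out; -p keeps mode and timestamps.
    # mv then replaces the original with the newly created file.
    cp --sparse=never -p "$1" "$tmp" && mv -f "$tmp" "$1"
}

# Usage (assumed file name):
#   desparsify ./foo
```

This has not been timed here; the claim that a fresh copy loses the flag comes from the discussion above.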


-------------- next part --------------
rm -f foo

echo non-sparse original runs quickly
cat $(which emacs-nox) >/dev/null
time $(which emacs-nox) -Q --batch --eval '(kill-emacs)'

echo sparse copy can be read quickly from fs cache
cp --sparse=always $(which emacs-nox) foo
cat foo >/dev/null
time cat foo >/dev/null

echo sparse copy slow-to-run in spite of being cached
time ./foo -Q --batch --eval '(kill-emacs)'

echo sparse copy no longer fully cached
time cat foo >/dev/null

echo filling all holes makes sparse penalty worse
cp --sparse=never $(which emacs-nox) foo
cat foo >/dev/null
time ./foo -Q --batch --eval '(kill-emacs)'
