Cygwin Filesystem Performance degradation 1.7.5 vs 1.7.7, and methods for improving performance

Corinna Vinschen corinna-cygwin@cygwin.com
Sun Oct 3 19:53:00 GMT 2010


On Oct  3 18:15, Yoni Londner wrote:
> Hi,
> 
> Following is a cygwin stat performance testing application.
> 
> It compares Cygwin's stat() with Native NT.
> [...]
> - GFA: GetFileAttributes(): This NT API allows getting information

You meant NtQueryFullAttributesFile, apparently...

> on a file by name without having to open a handle to the file. This
> misses information such as st_ino and st_nlink, but is added here
> for sake of comparison. Theoretically this should have been the
> fastest API, but on Win7 it is actually slower than
> CreateFile(dir)+QueryDirectoryFile+CloseFile. GFA takes 0.01ms on XP
> and 0.16 on Win7 (strange...).

Not really.  If you use Sysinternals' procmon, you'll see what happens.
On XP, the NtQueryFullAttributesFile call is a simple FASTIO_NETWORK_QUERY_OPEN.
On Windows 7, the FASTIO_NETWORK_QUERY_OPEN returns with FAST_IO_DISALLOWED
and the OS has to fall back to NtOpenFile/NtQueryInformationFile/NtClose.
Don't ask me why this occurs on W7.  Maybe FASTIO_NETWORK_QUERY_OPEN
interferes badly with NTFS transactions, I don't know.  I only know that
I never saw a FASTIO_NETWORK_QUERY_OPEN succeed on W7 so far.

> [...]
> And here are the results of my tests on /bin ~3500 files:
>    XP: CYGWIN_NT-5.1 yoni 1.7.5(0.228/5/3) 2010-07-18 14:53 i686 Cygwin
>    lstat(1.7.5 unpatched) 3587 files stat() 490.2ms, per file: 0.1367ms
>    lstat(1.7.5 patched) 3585 files stat() 78.12ms, per file: 0.02179ms
>    lstat(1.7.7 unpatched) 3588 files stat() 3570ms, per file: 0.9951ms
>    lstat(1.7.7 patched) 3588 files stat() 3374ms, per file: 0.9404ms
>    GFA 3585 files stat() 38.09ms, per file: 0.01062ms
>    QDF 3585 files stat() 105.5ms, per file: 0.02942ms
>    QIF 3585 files stat() 3309ms, per file: 0.9229ms
> 
>    Win7: CYGWIN_NT-6.1 bush-PC 1.7.5(0.228/5/3) 2010-07-18 14:53
> i686 Cygwin
>    lstat(1.7.5 unpatched) 3457 files stat() 934.1ms, per file: 0.2702ms
>    lstat(1.7.5 patched) 3455 files stat()  634ms, per file: 0.1835ms
>    lstat(1.7.7 unpatched) 3459 files stat()  777ms, per file: 0.2246ms
>    lstat(1.7.7 patched) 3459 files stat()  631ms, per file: 0.1824ms
>    GFA 3455 files stat()  574ms, per file: 0.1661ms
>    QDF 3455 files stat()  159ms, per file: 0.04602ms
>    QIF 3455 files stat()  599ms, per file: 0.1734ms

I'm missing a comparison using the latest Cygwin from CVS.  That's
much more interesting than 1.7.5.
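
For rerunning the per-file measurement against another build, a minimal
harness could look like the sketch below.  It is not the test application
from the quoted mail: the function names are hypothetical, and it uses
Python's os.lstat() and os.scandir() as stand-ins for a per-file query and
a single directory enumeration.  Whether os.scandir() really avoids a
per-file query depends on the platform, so that is an assumption too.

```python
import os
import time


def time_per_file_lstat(directory):
    """Call os.lstat() once per directory entry and time the loop."""
    names = os.listdir(directory)
    start = time.perf_counter()
    for name in names:
        os.lstat(os.path.join(directory, name))
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return len(names), elapsed_ms


def time_directory_scan(directory):
    """Enumerate the directory once and use what the listing provides."""
    start = time.perf_counter()
    count = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # Assumption: this is answered from the directory listing
            # where the platform supplies the entry type.
            entry.is_dir(follow_symlinks=False)
            count += 1
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return count, elapsed_ms


def report(label, count, elapsed_ms):
    """Format one result line in the style of the quoted output."""
    per_file = elapsed_ms / count if count else 0.0
    return "%s %d files stat() %.4gms, per file: %.4gms" % (
        label, count, elapsed_ms, per_file)


if __name__ == "__main__":
    target = "/bin"  # same directory as in the quoted test
    print(report("lstat", *time_per_file_lstat(target)))
    print(report("scan", *time_directory_scan(target)))
```

A single run mostly measures cache state, so repeated runs on a warm cache
are needed before the numbers can be compared.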

However, it's easy to make a speed comparison which ignores the bunch of
problems Cygwin is fighting against.  What about the problems of various
filesystems, for instance?  
  
You also never replied to my mail describing the suffix problem when
using the NtQueryDirectoryFile function:
http://cygwin.com/ml/cygwin-patches/2010-q3/msg00073.html

> - QDF() is MUCH MUCH MUCH faster than QIF() (35x faster on XP, 4x faster on Win7).

4 times slower only?  That's not too bad.
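
The ratios can be recomputed from the totals in the quoted results.  The
sketch below copies those totals; the dictionary layout and the ratio()
helper are only illustrative.  The file counts differ slightly between
rows (3585 vs. 3588 on XP, for instance), so ratios of totals are
approximate.  From these totals, QIF/QDF comes out at roughly 31x on XP
and roughly 3.8x on Win7.

```python
# Total times in milliseconds, copied from the quoted results.
RESULTS_MS = {
    "XP": {
        "lstat 1.7.5 unpatched": 490.2,
        "lstat 1.7.7 unpatched": 3570.0,
        "QDF": 105.5,
        "QIF": 3309.0,
    },
    "Win7": {
        "lstat 1.7.5 unpatched": 934.1,
        "lstat 1.7.7 unpatched": 777.0,
        "QDF": 159.0,
        "QIF": 599.0,
    },
}


def ratio(system, numerator, denominator):
    """Return how many times larger one total is than another."""
    rows = RESULTS_MS[system]
    return rows[numerator] / rows[denominator]


if __name__ == "__main__":
    for system in RESULTS_MS:
        print("%s: QIF/QDF = %.1f" % (system, ratio(system, "QIF", "QDF")))
        print("%s: lstat 1.7.7/1.7.5 (unpatched) = %.2f" % (
            system,
            ratio(system, "lstat 1.7.7 unpatched", "lstat 1.7.5 unpatched")))
```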

Dunno why XP is so slow.  Are you sure there's no BLODA kind of software
on your system which is to blame for the difference?  I'm only seldom
using XP, given that I test on W7 in the first place, but I don't
remember that XP was so slow.  Most of the time XP was the fastest of
the compared systems.

Well, except for the handling of long pathnames.  There's a bug in NTFS
in all NT 5.x kernels which results in a quadratic access time to files
relative to the length of the path.  I reported this bug upstream in
2008.  Unfortunately it got the "confirmed, won't fix" stamp, since
Vista had already been released, and the NT 6.X kernel didn't have this
problem anymore.

> The only information we 'lose' by using QDF() vs. QIF() is st_nlink.

No, that's incorrect.  You're also losing POSIX permissions since you
neglected to read the ACL.

> Regarding compatibility for loadable filesystems on Windows (MVFS
> etc...): Since cygwin knows the volume information, it can use QDF()
> for NTFS and FAT volumes.

This requires at least one open handle to the volume to be sure to have
the right filesystem type.  So, whatever you do, you need to open a
handle first.  Either to the parent dir or to the file.  This is kind of
a chicken-egg problem.
You're also still neglecting permissions, or stuff like the missing
implementation of FileIdBothDirectoryInformation.

> for NTFS and FAT volumes ('white list'). Since these are implemented
> in NTFS.SYS and FAT.SYS - which we know are implemented correctly,
> there is no problem, and this covers 99% of the FS system calls.

I doubt that.  There are *lots* of people out there using Samba or
a NAS of whatever kind.

Anyway, I'm in the process of creating a symlink_info::xcheck function
which uses NtQueryDirectoryFile with a single filter expression with
wildcards.  The goal is to implement xstat in the first place.  Like cgf
I'm not yet sure it's really such a terribly good idea to allow people
to use that function via a $CYGWIN setting.  But that isn't set in stone.
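
The idea of covering several suffix candidates with one wildcard query can
be sketched as follows.  This is not the planned symlink_info::xcheck code
and does not call NtQueryDirectoryFile; it uses os.scandir() plus fnmatch
as a portable stand-in.  The suffix list and the function name are
assumptions made for the illustration.

```python
import fnmatch
import glob
import os

# Assumption: illustrative candidate suffixes, tried in this order.
SUFFIXES = ("", ".exe", ".lnk", ".exe.lnk")


def find_candidates(directory, basename, suffixes=SUFFIXES):
    """Return the existing basename+suffix names from one directory pass."""
    # One wildcard expression covers every candidate; escape the basename
    # so characters like '[' or '*' in it are matched literally.
    pattern = glob.escape(basename) + "*"
    wanted = [basename + suffix for suffix in suffixes]
    with os.scandir(directory) as entries:
        present = {entry.name for entry in entries
                   if fnmatch.fnmatchcase(entry.name, pattern)}
    # The wildcard also matches unrelated names such as "foobar", so
    # filter down to the exact candidates, keeping the suffix order.
    return [name for name in wanted if name in present]
```

For example, find_candidates("/bin", "ls") scans /bin once and reports
which of ls, ls.exe, ls.lnk and ls.exe.lnk exist.  The match here is
case-sensitive, which differs from the default behaviour on NTFS.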

However, I can work on this only next week and then I'm unavailable
until late November so this may take some time.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


