[PATCH] Check for existence of the path before processing '..'

Christopher Faylor cgf-use-the-mailinglist-please@cygwin.com
Tue Jun 11 15:17:00 GMT 2013

On Tue, Jun 11, 2013 at 05:04:46PM +0200, Corinna Vinschen wrote:
>On Jun 11 10:20, Christopher Faylor wrote:
>> On Tue, Jun 11, 2013 at 05:08:13PM +0400, Fedin Pavel wrote:
>> > Hello!
>> >
>> > Some time ago I reported the ability to access things like
>> >"/usr/nonexistent/../bin". I still had this problem, so I tried my hand at
>> >fixing it.
>> > The patch works by checking the actual existence of the path before
>> >removing the last component from it. For performance reasons, only one check
>> >is done for things like "../..", because, obviously, if "/foo/bar/baz"
>> >exists, then "/foo/bar" exists too. Also, the check is done only after some
>> >components have been added to the path. So, for example, the current
>> >directory (obtained when processing relative paths) will not be checked.
>> > I also tried to add a similar test to the normalize_win32_path() function,
>> >however this broke things like "cd /usr/src/..". For some reason, a POSIX
>> >version of the path (but with reversed slashes) is passed to this routine
>> >when expanding mount points, so, consequently, the test for "\usr\src" using
>> >GetFileType() fails.
>> > I think that's OK; at least POSIX paths now behave in a POSIX way. I have
>> >tested the performance: there is some loss (~0.2 seconds), but only when
>> >referencing '..'.
>> > With this patch I am able to compile the latest version of glibc with no
>> >problems.
>>
>> You introduce a check_parent flag which is set every time a non-slash
>> character is found.  That doesn't seem right.  It seems like it should
>> be set whenever you see a slash.
>Indeed.  I moved setting check_parent before the while expression in
>the else branch instead and it still works.

I'll bet you wouldn't see much of a hit if you just got rid of the
check_parent flag entirely.

>> Also you are calling path_conv recursively.  I assume that is where you
>> are seeing a performance hit.
>I don't see how to do this without calling path_conv, though.  You have to
>perform the full conversion on the parent path, with symlinks and
>everything to get the right result.

Yes, but it is a HUGE stack hit to call path_conv recursively here.

>However, I'm rather impressed by the low impact of this change.  I moved
>the check_parent setting so it's only set when a slash occurs, and then
>I made a couple of runs building coreutils.  As you know, GCC uses ..
>paths a lot.  The performance hit is almost unnoticeable:  72.3 seconds
>without, 73.4 seconds with the patch.

If we are considering doing this, then couldn't we somehow just avoid
eliminating "/.." until after the path is fully parsed and then collapse
all of them in one final loop?  Also, don't we have the same problem for
foo/./bar?  We change that to foo/bar but foo may not exist.
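
To make the "one final loop" idea concrete, here is a rough sketch, not
Cygwin code.  The name collapse_checked and the exists callback are
hypothetical; the callback stands in for whatever lookup would really be
used.  It assumes an absolute POSIX-style path, assumes the root exists,
and ignores symlinks entirely, which the real path_conv cannot do.  The
path is tokenized first with nothing eliminated, then "." and ".." are
collapsed in a second pass that checks the prefix once per run of dot
components.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper, not part of Cygwin.  Collapses "." and ".." in an
// absolute path in one final pass.  `exists` is a caller-supplied predicate
// (an assumption standing in for the real, symlink-aware lookup).
// Returns false if a prefix that a "." or ".." refers to does not exist.
bool collapse_checked(const std::string &path,
                      const std::function<bool(const std::string &)> &exists,
                      std::string &result)
{
    // Pass 1: split on '/' only; "." and ".." are kept as components.
    std::vector<std::string> parts;
    std::string cur;
    for (char c : path) {
        if (c == '/') {
            if (!cur.empty())
                parts.push_back(cur);
            cur.clear();
        } else {
            cur += c;
        }
    }
    if (!cur.empty())
        parts.push_back(cur);

    // Pass 2: collapse, checking the prefix before any dot component.
    std::vector<std::string> out;
    auto join = [&out]() {
        std::string s;
        for (const auto &p : out) {
            s += '/';
            s += p;
        }
        return s.empty() ? std::string("/") : s;
    };

    bool verified = true;  // the root is assumed to exist
    for (const auto &p : parts) {
        if (p == "." || p == "..") {
            if (!verified) {
                if (!exists(join()))
                    return false;
                verified = true;  // parents of an existing path exist too
            }
            if (p == ".." && !out.empty())
                out.pop_back();
        } else {
            out.push_back(p);
            verified = false;  // a new, unchecked component was added
        }
    }
    result = join();
    return true;
}
```

With a predicate that only knows "/", "/usr" and "/usr/src", this accepts
"/usr/src/.." (yielding "/usr") and rejects both "/usr/nonexistent/../bin"
and "/foo/./bar".  The final path itself is not checked, only the prefixes
that a dot component depends on.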


