Fw: File name too long problem -- maybe fix coming?
Corinna Vinschen
corinna-cygwin@cygwin.com
Mon Jan 7 15:05:00 GMT 2008
On Dec 29 18:15, Dave Korn wrote:
> On 19 December 2007 10:36, Corinna Vinschen wrote:
> > Maybe that goes without saying, after all this is an Open Source
> > project, but I could really use some help here. The progress is
> > extremely slow. There's just too much code to keep an eye on.
> > When I started I imagined we could release 1.7.0 in 2007 but as long as
> > I have to do this conversion to unicode paths alone, it will take many
> > more months. 2008? Well, maybe...
>
> Is there an overall TODO list? Any notes/designs/specs/back-of-an-envelope
> sketches? Are you following an overall strategy to do the conversion?
It would be too much to say "yes" here. The whole plan is a (partly
diffuse) design idea in my mind. I will try to outline it.
For a start, a few well-known facts, in order of appearance:
I. Long path names and unicode characters only work fine with the
Win32 fooW functions or by using NT native functions.
II. Long paths in Win32 speak start with \\?\ or \\?\UNC
Long paths in NT speak start with \??\ or \??\UNC
III. Long path names using the above syntax are obviously always
absolute paths. Since all other paths are restricted to
MAX_PATH == 260 chars in Win32, any relative path is restricted
to 260 chars as well.
IV. Talking about relative paths, NT native functions have the
additional advantage of allowing directory handle relative paths.
I don't know if the 260 char restriction for relative Win32 paths
also applies to these native NT directory handle relative paths,
but I doubt it. I haven't tested that so far, though.
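To make facts II and III concrete, here is a minimal sketch in portable C of how an absolute Win32 path could be turned into its long-path form and then into the NT form. The helper names to_win32_long_path and win32_long_to_nt_path are hypothetical and are not taken from the Cygwin sources. The sketch assumes the input already uses backslashes and is either a drive-letter path or a UNC path.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not a Cygwin function): build the Win32 long-path
   form of an absolute path.  Assumes backslash separators.
   Returns 0 on success, -1 if the path is relative or the buffer is
   too small. */
int to_win32_long_path(const wchar_t *in, wchar_t *out, size_t outlen)
{
    const wchar_t *prefix;
    const wchar_t *rest;

    if (wcsncmp(in, L"\\\\?\\", 4) == 0) {
        /* Already in long-path form: copy unchanged. */
        prefix = L"";
        rest = in;
    } else if (in[0] == L'\\' && in[1] == L'\\') {
        /* UNC path: \\server\share becomes \\?\UNC\server\share,
           so one of the two leading backslashes is dropped. */
        prefix = L"\\\\?\\UNC";
        rest = in + 1;
    } else if (((in[0] >= L'A' && in[0] <= L'Z') ||
                (in[0] >= L'a' && in[0] <= L'z')) &&
               in[1] == L':' && in[2] == L'\\') {
        /* Drive-letter path: C:\foo becomes \\?\C:\foo. */
        prefix = L"\\\\?\\";
        rest = in;
    } else {
        /* Relative paths cannot use the long-path syntax (fact III). */
        return -1;
    }

    if (wcslen(prefix) + wcslen(rest) + 1 > outlen)
        return -1;
    wcscpy(out, prefix);
    wcscat(out, rest);
    return 0;
}

/* Hypothetical helper: the NT prefix differs from the Win32 long-path
   prefix only in the second character, so the conversion can be done
   in place.  Returns -1 if the path lacks the Win32 long-path prefix. */
int win32_long_to_nt_path(wchar_t *path)
{
    if (wcsncmp(path, L"\\\\?\\", 4) != 0)
        return -1;
    path[1] = L'?';
    return 0;
}
```

Given a sufficiently large buffer, a path like C:\foo\bar would come out with the \\?\ prefix, and a second call to win32_long_to_nt_path would rewrite that prefix to \??\ without changing the rest of the string.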
My idea of what should be done in Cygwin goes roughly like this:
1. POSIX paths should be handled in the current codepage as before.
Potentially this is a multibyte codepage like UTF-8. Make sure
that we handle multibyte paths correctly.
TBD: Always use UTF-8? What about existing installations with
symlinks/mount points using arbitrary codepages?
2. Windows paths should always be handled as wide char paths. The
most natural form is the OBJECT_ATTRIBUTES structure, because
only the OBJECT_ATTRIBUTES structure allows directory relative
paths to implement the openat family of functions correctly.
3. The path_conv class would ideally do the Windows path handling
in OBJECT_ATTRIBUTES structures, using native NT functions as much
as possible. Calling functions should request the path as
OBJECT_ATTRIBUTES using the path_conv::get_object_attr method
if possible. If it's necessary to call Win32 functions, there
is a path_conv::get_wide_win32_path method.
4. Path-related case insensitive comparisons should always be done
in wide char to avoid language problems. The only exception
should be comparisons against fixed strings containing only
ASCII chars.
5. Right now, mount points are stored with the POSIX path as key name,
using the single- or multibyte charset of the current codepage.
This restricts the POSIX path length of mount points to 255 chars.
The native path is stored as value which doesn't have any length
restriction problem.
TBD: Keep as is, thus sticking to mount points <= 255 chars,
or invent a mounts v3? Stop using the registry and use
files like /etc/fstab, ~/.fstab?
6. As for the TODO list: It's basically looking through the code
and converting what can be converted. I have no order for the
tasks. Which leads to the last point.
7. As Chris already mentioned, part of the problem is that path_conv
is not yet converted. It's not an easy job and I believe there's
still a lot to discuss, especially about the external interfaces to
path_conv and the handling of relative paths. One part of the
problem is that path_conv calls methods in various fhandlers, which
in turn have to be converted. It's quite tricky to convert one
without the other, and I admit that I trashed many lines of new code
last year when I found yet another chicken-egg situation. The
interlocking of the path handling functions is not always easy to
unlock, so I'm (and Chris is certainly as well) open to new ideas
and especially patch snippets. There's much code left which is
as yet untouched.
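As an illustration of point 4, a wide char case insensitive comparison could look roughly like the sketch below. The function name wpath_equal_nocase is hypothetical and is not existing Cygwin code. The sketch uses the standard C towlower function, whose result for non-ASCII characters depends on the current locale, so it only shows the shape of the comparison and makes no claim about how the case mapping should finally be done.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper (not a Cygwin function): compare two wide char
   paths ignoring case.  Returns 1 if they are equal, 0 otherwise.
   towlower is locale dependent for non-ASCII characters. */
int wpath_equal_nocase(const wchar_t *a, const wchar_t *b)
{
    while (*a != L'\0' && *b != L'\0') {
        if (towlower((wint_t) *a) != towlower((wint_t) *b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}
```

Comparing one wide character at a time avoids the problem of a multibyte comparison splitting a character in the middle, which is one of the language problems a byte-wise comparison in the current codepage can run into.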
Anything I left out? Probably. Just ask.
Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat