Filenames with Win32 special characters (or: Interix filename compatibility)

Corinna Vinschen
Tue Mar 11 10:11:00 GMT 2008


As you all know, Windows disallows more characters in filenames than
POSIX does.  While POSIX allows any char except the slash, Windows
does not allow '\\', ':', '*', '?', '|', '<' and '>' as part of a
filename.

So far the only way to use these characters as part of a filename
in Cygwin is a managed mount.

Now that we always access files using UNICODE anyway, we could use the
Interix approach, which works very neatly:

All disallowed characters are in the ASCII area, 0 < c < 0x7f.  This is
still the case for the UNICODE representation.  Interix simply adds
0xf000 to the character value of the disallowed chars.  This
transforms disallowed chars to valid chars within the UNICODE block
95, "Private Use Area".  When reading filenames, all characters with a
value of 0xf0xx are simply transformed back with "wc &= 0xff".
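
To make the mapping concrete, here is a minimal sketch in C.  It is not
the actual patch and not Interix code.  The function names
(is_win32_special, to_win32_name, from_win32_name) are made up for
illustration, and the character set is just the list from above minus
the backslash.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: the characters to remap.  Assumption: the list
   given in the text, without the backslash. */
int is_win32_special(wchar_t wc)
{
    return wc == L':' || wc == L'*' || wc == L'?' ||
           wc == L'|' || wc == L'<' || wc == L'>';
}

/* Writing a name: shift each disallowed char into the Private Use
   Area by adding 0xf000, in place. */
void to_win32_name(wchar_t *name)
{
    assert(name != NULL);
    for (; *name; ++name)
        if (is_win32_special(*name))
            *name += 0xf000;
}

/* Reading a name: every char with a value of 0xf0xx is mapped back
   with "wc &= 0xff", in place. */
void from_win32_name(wchar_t *name)
{
    assert(name != NULL);
    for (; *name; ++name)
        if (*name >= 0xf000 && *name <= 0xf0ff)
            *name &= 0xff;
}
```

With this sketch, ':' (0x3a) is stored as 0xf03a and comes back as ':'
when the name is read again.  Note that the reverse step maps back every
0xf0xx value, not only the ones the forward step can produce.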

This works on NTFS as well as on FAT, and it creates filenames which
look funny in the GUI, but are allowed and fully usable in the Win32
namespace, including filenames with trailing dots and spaces(*).

In Cygwin you would suddenly have transparent POSIX filenames, plus the
additional advantage of filename compatibility with Interix.  It would
also remove the requirement for managed mounts.

I really like this method.  I have a local patch which would introduce
this behaviour into Cygwin.  The only character which currently can't be
transformed is the backslash, which would require too many changes for
now.  If there's interest, I'd apply the patch for testing.


(*) I tried with the dumbest possible tool: Notepad.

Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
