Filenames with Win32 special characters (or: Interix filename compatibility)

Brian Dessent
Tue Mar 11 16:55:00 GMT 2008

Corinna Vinschen wrote:

> We could enhance the method to handle uppercase ASCII chars as well.
> Managed mounts could use the same method as normal mounts, just with
> upper case ASCII chars transformed, too.
> This would have the additional advantage that filenames on managed
> mounts not only look almost normal, the length of the real path
> also isn't changed due to the char transformation, like it is today.

Interesting.  The unchanged length sounds nice, but I'm not sure I
follow about looking almost normal.  Any filename with uppercase
characters would still look unintelligible in Explorer/any ANSI Win32
app, wouldn't it?

Here's an alternative idea for the encoding.  What if we encode upper
case letters as themselves plus a rare combining entity?  For example,
there's a block U+FE00 - U+FE0F called simply VARIATION SELECTOR-1


Well crap, those don't work very well, they display as boxes rather than
combining.  But going through the entire list of combining characters, I
did find one with an interesting property: U+0331: COMBINING MACRON
BELOW.  When displayed in Explorer, it looks like the normal letter with
a small underline.  But the neat property of this character is that when
converted from Unicode to cp1252 it converts to the underscore, meaning
stupid ANSI programs can still edit/open/save these files.  So we'd
encode uppercase ASCII as simply 'A' -> "A\u0331", 'B' -> "B\u0331" and
so on.  This gives up the same-length property, but the names still
remain intelligible in dumb apps.

(BTW, for a real hoot try creating a filename containing U+034F

