cygwin_conv_path POSIX->WIN_A conversion

Andy Koppe
Fri Nov 18 12:48:00 GMT 2011

On 18 November 2011 11:16, Corinna Vinschen wrote:
> On Nov 18 10:57, Andy Koppe wrote:
>> On 14 November 2011 10:33, Corinna Vinschen wrote:
>> > On Nov 11 09:27, Eric Blake wrote:
>> >> On 11/11/2011 07:46 AM, Corinna Vinschen wrote:
>> >> > So I was wondering if the CCP_POSIX_TO_WIN_A function shouldn't be
>> >> > changed so that it converts the pathname to the current ANSI or OEM
>> >> > charset instead, depending on the value returned by the AreFileApisANSI
>> >> > function.
>> >>
>> >> Yes, that sounds right to me,
>> >>
>> >> >
>> >> > I think this would be more correct than converting to the current Cygwin
>> >> > multibyte charset.  The downside is, that this *might* break backward
>> >> > compatibility.  However, if an application converts a Cygwin POSIX path
>> >> > to a native Windows multibyte path, isn't it always for the sake of
>> >> > calling a Win32 ANSI function or to submit the path to a native Windows
>> >> > application?
>> >>
>> >> Precisely for this reason - the only sane reason to convert to native is
>> >> to use the resulting string in native calls.
>> >
>> > I'm just worried that this would open a can of worms.
>> >
>> > If CCP_POSIX_TO_WIN_A always converts to ANSI/OEM, shouldn't
>> > CCP_WIN_A_TO_POSIX always convert from ANSI/OEM?
>> Yes, I think so.
>> > However, if the DOS
>> > path has been entered on the Cygwin command line, it will very likely
>> > not be given in the current ANSI/OEM CP, but rather in the Cygwin
>> > charset.
>> A program that assumes something other than the Cygwin charset for
>> command line arguments is buggy.
>> Having said that, I assume the concern here is about pre-1.7 programs,
>> where assuming the ANSI/OEM codepage for command line arguments would
>> have been reasonable. However, such programs won't actually be using
>> CCP_POSIX_TO_WIN_A and CCP_WIN_A_TO_POSIX, since those were only
>> introduced with 1.7. Instead, they'll be using the deprecated
>> cygwin_conv_to_posix_path() and its relatives.
>> I understand those currently do the same as their cygwin_conv_path_t
>> equivalents, but that doesn't have to be that way. So how about if
>> those legacy functions keep current behaviour in an attempt to
>> maximise backward compatibility,
> But to maximize backward compatibility, they should use ANSI/OEM,
> too, shouldn't they?

Hmm, indeed. When programs use the legacy conversion functions to
interface with Windows ANSI APIs, they would want the ANSI/OEM codepage
to be used.

But that conflicts with the use case you cited where a program
converts a Windows path it got via the Cygwin command line. That's
part of a wider issue though: on 1.5, such a program could pass the
path straight to a Windows ANSI API, whereas on 1.7 the path needs to
be converted from the Cygwin charset first. There's nothing that can
be done about that one, short of changing the program (at which point
adapting to Unicode APIs would be the sensible thing to do).
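
To illustrate the command-line problem, here is a minimal sketch in
Python rather than against the Cygwin API itself. It assumes the Cygwin
charset is UTF-8 and the ANSI codepage is Windows-1252; both are
assumptions chosen for the example, and the function name is
hypothetical.

```python
# Sketch only: UTF-8 as the Cygwin charset and cp1252 as the ANSI
# codepage are assumptions for this example, not fixed values.
CYGWIN_CHARSET = "utf-8"
ANSI_CODEPAGE = "cp1252"


def cygwin_bytes_to_ansi(path_bytes: bytes) -> bytes:
    """Re-encode a path from the Cygwin charset to the ANSI codepage."""
    return path_bytes.decode(CYGWIN_CHARSET).encode(ANSI_CODEPAGE)


# A Windows path as it would arrive on the command line under 1.7,
# containing u-umlaut (U+00FC), encoded in the Cygwin charset.
arg = "C:\\Users\\J\u00fcrgen".encode(CYGWIN_CHARSET)

# What an ANSI API would see if the bytes were passed through unchanged:
# the two UTF-8 bytes of the umlaut are read as two separate characters.
misread = arg.decode(ANSI_CODEPAGE)

# What it should receive after conversion from the Cygwin charset.
converted = cygwin_bytes_to_ansi(arg)
```

Passing `arg` through unconverted yields a garbled name, whereas
`converted` holds the single byte 0xFC that an ANSI API would expect
for the umlaut under these assumptions.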

So yeah, I'd go with ANSI/OEM for all the conversion functions
actually, and accept that Windows paths on the command line need to be
handled with more care on 1.7.
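
The ANSI/OEM choice itself can be sketched as follows. This is an
illustration in Python, not the Cygwin implementation: the boolean
stands in for the result of AreFileApisANSI, the function name is
hypothetical, and cp1252/cp850 are assumed as the ANSI and OEM
codepages of a Western European system.

```python
def encode_for_file_apis(win_path: str, file_apis_ansi: bool,
                         ansi_cp: str = "cp1252",
                         oem_cp: str = "cp850") -> bytes:
    """Encode a Windows path in the codepage the file APIs expect.

    file_apis_ansi stands in for the value AreFileApisANSI would
    return; the default codepage names are assumptions.
    """
    codepage = ansi_cp if file_apis_ansi else oem_cp
    return win_path.encode(codepage)


path = "C:\\Users\\J\u00fcrgen"  # contains u-umlaut (U+00FC)

as_ansi = encode_for_file_apis(path, file_apis_ansi=True)
as_oem = encode_for_file_apis(path, file_apis_ansi=False)
```

The same path ends up as different byte strings in the two modes (the
umlaut is 0xFC in cp1252 and 0x81 in cp850), which is why the
conversion has to follow the mode the process is actually in.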

(That reminds me again of the mkshortcut overhaul that I never got
round to completing ...)
