Corrupted file name in Cygwin - does Cygwin do a silly rename if a file is open?

Cedric Blancher cedric.blancher@gmail.com
Sun Nov 24 07:32:00 GMT 2024


On Sat, 23 Nov 2024 at 17:47, Jeremy Drake <cygwin@jdrake.com> wrote:
>
> On Sat, 23 Nov 2024, Cedric Blancher via Cygwin wrote:
>
> > Good afternoon!
> >
> > Does Cygwin do a silly rename if a Cygwin file is open but gets
> > /bin/rm at the same time?
>
> Yes!  See function try_to_bin in winsup/cygwin/syscalls.cc:
>       /* Create unique filename.  Start with a dot, followed by "cyg"
>          transposed into the Unicode low surrogate area (U+dc00) on file
>          systems supporting Unicode (except Samba), followed by the inode
>          number in hex, followed by a path hash in hex.  The combination
>          allows to remove multiple hardlinks to the same file. */

That code is wrong.

bash -c 'printf ".\udc63\udc79\udc67#\n"' | iconv -f UTF-8
.iconv: illegal input sequence at position 1
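
For illustration: the bytes printf presumably emits for \udc63 are
ED B1 A3, which is the three-byte UTF-8 pattern applied to a surrogate
code point. Well-formed UTF-8 forbids that, and the "." occupies byte 0,
which would explain why iconv complains at position 1. A minimal sketch
of such a check in C follows. The function name is hypothetical and is
not part of iconv or Cygwin.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if the 3 bytes at p are the
   UTF-8-style encoding of a surrogate code point (U+D800..U+DFFF).
   Such sequences are ill-formed UTF-8; strict decoders reject them. */
static bool
utf8_is_encoded_surrogate (const unsigned char *p, size_t len)
{
  if (len < 3)
    return false;
  return p[0] == 0xED               /* lead byte for U+D000..U+DFFF */
         && (p[1] & 0xE0) == 0xA0   /* A0..BF selects U+D800..U+DFFF */
         && (p[2] & 0xC0) == 0x80;  /* ordinary continuation byte */
}
```

With this sketch, ED B1 A3 (U+DC63) is flagged, while EF BF BC (U+FFFC)
and ED 9F BF (U+D7FF) are not.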

RtlAppendUnicodeToString (&recycler,
                          (pc.fs_flags () & FILE_UNICODE_ON_DISK
                           && !pc.fs_is_samba ())
                          ? L".\xdc63\xdc79\xdc67" : L".cyg");

Samba is right to reject L".\xdc63\xdc79\xdc67", because it is not a
valid UTF-16 sequence: it consists only of unpaired low surrogates.
ReFS with validation, OpenZFS and so on will all REJECT such file
names, and so will NFSv4, because file names must be valid Unicode
(even if nfsd does not validate, the filesystem being shared via nfsd
will reject them).
So this can only work on NTFS, and only if it is not validating the
input UTF-16 sequence.

AFAIK FILE_UNICODE_ON_DISK means that the wchar_t sequences must be
valid UTF-16, and not just a random sequence of 16-bit values.
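
To make "valid UTF-16" concrete, here is a minimal sketch in C of the
kind of check a validating filesystem might apply. It is an assumption
for illustration only, not code from Samba, ReFS, OpenZFS or Cygwin,
and the function name is hypothetical. It uses uint16_t instead of
wchar_t so that it behaves the same where wchar_t is 32 bits wide.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return true if the n code units in s are
   well-formed UTF-16, i.e. every high surrogate (D800..DBFF) is
   immediately followed by a low surrogate (DC00..DFFF) and no low
   surrogate stands alone. */
static bool
utf16_is_well_formed (const uint16_t *s, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      uint16_t u = s[i];
      if (u >= 0xD800 && u <= 0xDBFF)
        {
          /* A high surrogate needs a low surrogate right after it. */
          if (i + 1 >= n || s[i + 1] < 0xDC00 || s[i + 1] > 0xDFFF)
            return false;
          i++;  /* skip the low half of the pair */
        }
      else if (u >= 0xDC00 && u <= 0xDFFF)
        return false;  /* unpaired low surrogate */
    }
  return true;
}
```

Under this check the code units 002E DC63 DC79 DC67 are rejected at the
first DC63, while 002E FFFC FFFC FFFC and the plain ".cyg" both pass.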

@Corinna Vinschen Could this sequence please be changed to a VALID
UTF-16 sequence, such as \u[fffc]\u[fffc]\u[fffc]? That might work with
Samba, ReFS, OpenZFS, NFSv4, ...

Ced
-- 
Cedric Blancher <cedric.blancher@gmail.com>
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur

