[Ms-nfs41-client-devel] Corrupted file name in Cygwin - does Cygwin do a silly rename if a file is open?

Cedric Blancher cedric.blancher@gmail.com
Mon Nov 25 15:21:19 GMT 2024


On Mon, 25 Nov 2024 at 12:41, Roland Mainz <roland.mainz@nrubsig.org> wrote:
>
> On Sun, Nov 24, 2024 at 8:32 AM Cedric Blancher
> <cedric.blancher@gmail.com> wrote:
> >
> > On Sat, 23 Nov 2024 at 17:47, Jeremy Drake <cygwin@jdrake.com> wrote:
> > >
> > > On Sat, 23 Nov 2024, Cedric Blancher via Cygwin wrote:
> > >
> > > > Good afternoon!
> > > >
> > > > Does Cygwin do a silly rename if a Cygwin file is open but gets
> > > > /bin/rm at the same time?
> > >
> > > Yes!  See function try_to_bin in winsup/cygwin/syscalls.cc:
> > >       /* Create unique filename.  Start with a dot, followed by "cyg"
> > >          transposed into the Unicode low surrogate area (U+dc00) on file
> > >          systems supporting Unicode (except Samba), followed by the inode
> > >          number in hex, followed by a path hash in hex.  The combination
> > >          allows to remove multiple hardlinks to the same file. */
> >
> > That code is wrong.
> >
> > bash -c 'printf ".\udc63\udc79\udc67#\n"' | iconv -f UTF-8
> > .iconv: illegal input sequence at position 1
> >
> > 334 RtlAppendUnicodeToString (&recycler,
> > 335 (pc.fs_flags () & FILE_UNICODE_ON_DISK
> > 336 && !pc.fs_is_samba ())
> > 337 ? L".\xdc63\xdc79\xdc67" : L".cyg");
> >
> > SAMBA is right to reject L".\xdc63\xdc79\xdc67", because it is not a
> > valid UTF-16 sequence. ReFS with validation, OpenZFS and so on will
> > all REJECT such file names, and neither can NFSv4 because file names
> > must be valid Unicode (even if nfsd would not validate then filesystem
> > being shared via nfsd will reject that).
> > So this can only work on ntfs, and only if it is not validating the
> > input UTF.16 sequence.
> >
> > AFAIK FILE_UNICODE_ON_DISK means that the wchar_t sequences must be
> > valid UTF-16, and not just be a random sequence of 16bit values.
> >
> > @Corinna Vinschen Could this sequence please be changed to a VALID
> > UTF-8 sequence, such as \u[fffc]\u[fffc]\u[fffc]? That might work with
> > SAMBA, ReFS, OpenZFS NFSv4, ...
>
> That does not help with existing Cygwin installations and Cygwin
> 32bit, which is stuck at Cygwin 3.3.x ... ;-(
>
> I agree that the L".\xdc63\xdc79\xdc67" prefix will backfire on
> something like ReFS, OpenZFS etc (SAMBA uses the prefix for
> filesystems which do NOT have |FILE_UNICODE_ON_DISK| set), but for
> ms-nfs41-client I just stomp over the issues with this patch (wording
> still needs to be improved):
> ---- snip ----
> diff --git a/daemon/setattr.c b/daemon/setattr.c
> index 9eaafb5..6e9729e 100644
> --- a/daemon/setattr.c
> +++ b/daemon/setattr.c
> @@ -284,6 +284,46 @@ static int handle_nfs41_rename(void
> *daemon_context, setattr_upcall_args *args)
>
>     EASSERT((rename->FileNameLength%sizeof(WCHAR)) == 0);
>
> +#define CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE 1
> +
> +#ifdef CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE
> +    /*
> +     * Stomp Cygwin "silly rename" invalid Unicode sequence
> +     *
> +     * Cygwin has it's own variation of "silly rename" (i.e. if
> +     * someone deletes a file while someone else still has
> +     * a valid fd to that file it first renames that file with a
> +     * special prefix, see
> +     * newlib-cygwin/winsup/cygwin/syscalls.cc, function
> +     * |try_to_bin()|).
> +     *
> +     * Unfortunately on filesystems supporting Unicode
> +     * (i.e. |FILE_UNICODE_ON_DISK|) Cygwin adds the prefix
> +     * L".\xdc63\xdc79\xdc67", which is NOT a valid UTF-16 sequence,
> +     * and will be rejected by a filesystem validating the
> +     * UTF-16 sequence (e.g. SAMBA, ReFS, OpenZFS, ...).
> +     * In our case the NFSv4.1 protocol requires valid UTF-8
> +     * sequences, and the NFS server will reject filenames if either
> +     * the server or the exported filesystem will validate the UTF-8
> +     * sequence.
> +     *
> +     * Since Cygwin only does a |rename()| and never a lookup by
> +     * that filename we just stomp the prefix with the prefix used
> +     * for non-|FILE_UNICODE_ON_DISK| filesystems.
> +     * We ignore the side-effects here, e.g. that Win32 will still
> +     * "remember" the original filename in the file name cache.
> +     */
> +    if ((rename->FileNameLength > (4*sizeof(wchar_t))) &&
> +        (!memcmp(rename->FileName,
> +            L".\xdc63\xdc79\xdc67", (4*sizeof(wchar_t))))) {
> +        DPRINTF(1, ("handle_nfs41_rename(args->path='%s'): "
> +            "Cygwin sillyrename prefix \".\\xdc63\\xdc79\\xdc67\" "
> +            "detected, squishing prefix to \".cyg\"\n",
> +            args->path));
> +        (void)memcpy(rename->FileName, L".cyg", 4*sizeof(wchar_t));
> +    }
> +#endif /* CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE */
> +
>     dst_path.len = (unsigned short)WideCharToMultiByte(CP_UTF8,
>         WC_ERR_INVALID_CHARS|WC_NO_BEST_FIT_CHARS,
>         rename->FileName, rename->FileNameLength/sizeof(WCHAR),
> ---- snip ----
>

LGTM, but msys2 has a copy of that code with a different prefix, again
with invalid UTF-16:
https://github.com/msys2/msys2-runtime/blob/msys2-3.5.4/winsup/cygwin/syscalls.cc#L350C8-L350C35

Ced
-- 
Cedric Blancher <cedric.blancher@gmail.com>
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur


More information about the Cygwin mailing list