WSL symbolic links

Thomas Wolff towo@towo.net
Fri Mar 27 14:43:43 GMT 2020


On 27.03.2020 at 14:01, Corinna Vinschen wrote:
> On Mar 27 13:24, Thomas Wolff wrote:
>> On 27.03.2020 at 12:21, Corinna Vinschen wrote:
>>> On Mar 27 00:52, Thomas Wolff wrote:
>>>> [...]
>>>>> rd-reparse '\??\C:\tmp\link' ; echo
>>>> ReparseTag:           0xa000001d
>>>> ReparseDataLength:             8
>>>> Reserved:                      0
>>>> 02 00 00 00 66 69 6c 65
>>>>> rd-reparse '\??\C:\tmp\link-abs' ; echo
>>>> ReparseTag:           0xa000001d
>>>> ReparseDataLength:            19
>>>> Reserved:                      0
>>>> 02 00 00 00 2f 6d 6e 74 2f 63 2f 74 6d 70 2f 66
>>>> 69 6c 65
>>>>> rd-reparse '\??\C:\tmp\link-foo' ; echo
>>>> ReparseTag:           0xa000001d
>>>> ReparseDataLength:             9
>>>> Reserved:                      0
>>>> 02 00 00 00 66 c3 b6 c3 b6
>>>>> rd-reparse '\??\C:\tmp\link-foo-abs' ; echo
>>>> ReparseTag:           0xa000001d
>>>> ReparseDataLength:            20
>>>> Reserved:                      0
>>>> 02 00 00 00 2f 6d 6e 74 2f 63 2f 74 6d 70 2f 66
>>>> c3 b6 c3 b6
>>> [...]
>>> I debugged this now and I found that practically all problems, including
>>> the inability to delete the symlink, are a result of not being able to
>>> open the reparse point correctly as reparse point within Cygwin.  So as
>>> not to destroy something important, Cygwin only opens reparse points as
>>> reparse points if it recognizes the reparse point type.
>>>
>>> Consequently, all immediate problems go away as soon as Cygwin
>>> recognizes and handles the symlink :)
>>>
>>> So I created a patch and pushed it.  The latest developer snapshot from
>>> https://cygwin.com/snapshots/ contains this patch.
>> Works, great, thank you!
> Thanks for testing!
>
>>> Funny sidenote: Assuming you create symlinks pointing to files with
>>> non-UTF-8 chars, e. g., umlauts in ISO-8859-1, then the symlink converts
>>> *all* these chars to the Unicode REPLACEMENT CHARACTER 0xfffd.  I assume
>>> this will also happen if you try to create the file with these chars in
>>> the first place, so it's not much of a problem.
>> As Windows filenames are character strings as opposed to Linux filenames
>> which are byte strings, some strange behaviour is unavoidable. I see:
>> $ wsl ls -l link_LW
>> lrwxrwxrwx    1 towo     towo            19 Mar 27 12:11 link_LW -> file_L_
>> $ ls -l link_LW
>> lrwxrwxrwx 1 towo Kein 11 27. Mrz 13:11 link_LW -> file_L_����
>> which looks OK to me.
> Not sure I expressed myself correctly there.  What I was trying to say
> is, the symlink created by WSL already contains the 0xfffd replacement
> char, in UTF-8 \xef \xbf \xbd.  So the info is already lost inside the
> symlink.
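A minimal sketch of that loss, using Python's UTF-8 decoder as a stand-in
for whatever conversion WSL performs (an assumption; WSL may group invalid
bytes differently). The file name is hypothetical: "f" followed by two
ISO-8859-1 "ö" bytes.

```python
# Hypothetical name: "f" plus two ISO-8859-1 "ö" bytes (0xf6 each).
latin1_name = "f\u00f6\u00f6".encode("iso-8859-1")

# 0xf6 is never valid in UTF-8, so each byte decodes to U+FFFD.
flattened = latin1_name.decode("utf-8", errors="replace")

# Re-encoding stores ef bf bd per replacement character; the original
# bytes can no longer be recovered from this form.
stored = flattened.encode("utf-8")

print(latin1_name, stored)
```

Once the name is stored in this form, nothing downstream can tell which
bytes were originally typed.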
I couldn't create a non-UTF-8 file name in WSL on the command line; even
when running LC_ALL=de_DE mintty and, inside WSL, LC_ALL=de_DE bash, keyboard
input still appeared as UTF-8 when displayed with od, which is weird.
Anyway, this can of course be worked around by running touch from a script
file. In that case, WSL indeed already flattens all invalid characters to �
in the filename.
However, all symbolic link cases work for me. I can point links to
file_L_ and file_LW_���� and access the respective files correctly
via the links from both WSL and Cygwin now.
Thomas
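
For reference, the rd-reparse dumps quoted above can be decoded with a
short sketch like the following. It assumes, based only on those four
dumps, that the reparse data is a 32-bit little-endian field (always 2
here; the thread does not say what it means) followed by the link target
in UTF-8 without a terminating NUL. The function name is hypothetical.

```python
import struct

# Reparse tag as printed by rd-reparse in the dumps above.
LX_SYMLINK_TAG = 0xA000001D

def parse_lx_symlink(data):
    """Split reparse data into the leading 32-bit field and the target.

    Layout is an assumption inferred from the quoted dumps.
    """
    if len(data) < 4:
        raise ValueError("reparse data too short")
    (leading,) = struct.unpack_from("<I", data, 0)
    # Invalid UTF-8 is replaced rather than raising, to keep the sketch robust.
    target = data[4:].decode("utf-8", errors="replace")
    return leading, target

# Data bytes of the link-foo-abs dump (ReparseDataLength: 20).
dump = bytes.fromhex(
    "02 00 00 00 2f 6d 6e 74 2f 63 2f 74 6d 70 2f 66 c3 b6 c3 b6"
)
leading, target = parse_lx_symlink(dump)
print(len(dump), leading, target)
```

The byte count of each dump matches its ReparseDataLength, which is
consistent with the assumed layout of 4 bytes plus the UTF-8 target.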


