/bin/ls -l cannot handle printable Unicode characters outside the BMP ...
Thomas Wolff
towo@towo.net
Sun Nov 24 10:30:18 GMT 2024
On 23.11.2024 at 15:01, Christian Franke via Cygwin wrote:
> Cedric Blancher via Cygwin wrote:
>> On Sat, 23 Nov 2024 at 11:44, Cedric Blancher
>> <cedric.blancher@gmail.com> wrote:
>>> Good morning!
>>>
>>> /bin/ls -l cannot handle printable Unicode characters outside the BMP
>>>
>>> Example using '𝒯'
>>> bash -c 'printf "\U0001D4AF\n"' # MATHEMATICAL SCRIPT CAPITAL T
>>> (yes, our mathematicians want to use THAT as file name)
>>>
>>> On Linux:
>>> LC_ALL=en_US.UTF-8 bash -c 't="$(printf "\U0001D4AF\n")" ; touch "$t" "$t$t"'
>>> ls -la
>>> total 8
>>> -rw-r--r-- 1 ced staden 0 Nov 23 11:29 ööööööö
>>> -rw-r--r-- 2 ced staden 4 Nov 23 11:31 𝒯
>>> -rw-r--r-- 2 ced staden 4 Nov 23 11:31 𝒯𝒯
>>>
>>> On Cygwin:
>>> LC_ALL=en_US.UTF-8 bash -c 't="$(printf "\U0001D4AF\n")" ; touch "$t" "$t$t"'
>>> $ ls -la
>>> -rw-r--r-- 1 ced staden 0 Nov 23 11:29 ööööööö
>>> -rw-r--r-- 2 ced staden 4 Nov 23 11:31 ''$'\360\235\222\257'
>>> -rw-r--r-- 2 ced staden 4 Nov 23 11:31 ''$'\360\235\222\257\360\235\222\257'
>>>
>>> Looks like the Cygwin locale has a problem with non-BMP chars.
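For reference, the octal escapes in the Cygwin listing are the four UTF-8 bytes of U+1D4AF, so the file name itself is stored intact and only the display differs. A character outside the BMP also needs a surrogate pair in UTF-16. The sketch below just checks both encodings; the helper names utf8_hex and surrogates are made up for illustration, and whether the surrogate pair is what trips up ls here is a guess, not something established in this thread.

```shell
# Dump the bytes of the argument as lowercase hex, without separators.
utf8_hex() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# Compute the UTF-16 surrogate pair for a code point above U+FFFF.
surrogates() {
    cp=$(( $1 - 0x10000 ))
    printf '%04X %04X\n' $(( 0xD800 + (cp >> 10) )) $(( 0xDC00 + (cp & 0x3FF) ))
}

t=$(printf '\360\235\222\257')   # same bytes as in the ls output above
utf8_hex "$t"; echo              # prints f09d92af
surrogates 0x1D4AF               # prints D835 DCAF
```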
>> find(1) is even worse:
>> $ find .
>> .
>> ./ööööööö
>> ./????
>> ./x??x
Workaround: ls ... | cat ; find ... | cat
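As far as I know, the pipe helps because GNU ls and GNU find only escape or replace characters they consider unprintable when stdout is a terminal; into a pipe they write the raw bytes, and the terminal then renders them. A minimal sketch, assuming GNU coreutils ls (options -N and --show-control-chars) and a scratch directory from mktemp. That the last command behaves like "| cat" on Cygwin is an assumption, not something tested there.

```shell
# Scratch directory so the demo does not touch real files.
d=$(mktemp -d)
t=$(printf '\360\235\222\257')      # U+1D4AF as UTF-8 bytes
touch "$d/$t" "$d/$t$t"

# Workaround from above: stdout is a pipe, so names pass through unmodified.
ls "$d" | cat
find "$d" -type f | cat

# Assumed equivalent for ls without the pipe (GNU ls options):
# -N selects literal quoting, --show-control-chars stops replacement with '?'.
ls -N --show-control-chars "$d"

# rm -rf "$d"                       # clean up when done
```

find has no comparable option that I am aware of, so for find the pipe remains the simple way out.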
>>
>> The Microsoft Explorer GUI shows the file names correctly, so IMO this
>> is not a Windows or Win32 API problem.
>
> Slightly different filename problem which may be related or not:
> https://sourceware.org/pipermail/cygwin/2024-September/256451.html
>