/bin/ls -l cannot handle printable Unicode characters outside the BMP ...

Cedric Blancher cedric.blancher@gmail.com
Sat Nov 23 10:44:00 GMT 2024


Good morning!

On Cygwin, /bin/ls -l cannot display printable Unicode characters outside the BMP (Basic Multilingual Plane).

Example using '𝒯'
bash -c 'printf "\U0001D4AF\n"' # MATHEMATICAL SCRIPT CAPITAL T
(yes, our mathematicians want to use THAT as a file name)

On Linux:
LC_ALL=en_US.UTF-8 bash -c 't="$(printf "\U0001D4AF\n")" ; touch "$t" "$t$t"'
ls -la
total 8
-rw-r--r--  1 ced staden  0 Nov 23 11:29 ööööööö
-rw-r--r--  2 ced staden  4 Nov 23 11:31 𝒯
-rw-r--r--  2 ced staden  4 Nov 23 11:31 𝒯𝒯

On Cygwin:
LC_ALL=en_US.UTF-8 bash -c 't="$(printf "\U0001D4AF\n")" ; touch "$t" "$t$t"'
$ ls -la
-rw-r--r-- 1 ced staden  0 Nov 23 11:29  ööööööö
-rw-r--r-- 2 ced staden  4 Nov 23 11:31 ''$'\360\235\222\257'
-rw-r--r-- 2 ced staden  4 Nov 23 11:31 ''$'\360\235\222\257\360\235\222\257'

Looks like the Cygwin locale has a problem with non-BMP characters: ls prints the names as shell-quoted octal escapes instead of the glyphs.
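
For what it's worth, the octal escapes in the Cygwin listing appear to be the raw UTF-8 bytes of U+1D4AF, which would suggest the file name itself is created correctly and only the display differs. A minimal sketch to check that by hand, using plain shell arithmetic so it does not depend on the locale. The helper name utf8_octal_4byte is hypothetical (not part of any tool), and it only covers code points from U+10000 to U+10FFFF:

```shell
# Hypothetical helper: encode one code point in the range
# U+10000..U+10FFFF as its four UTF-8 bytes, printed as octal escapes
# in the same style that ls uses in the Cygwin listing.
utf8_octal_4byte() {
    cp=$(( $1 ))
    printf '\\%o\\%o\\%o\\%o\n' \
        $(( 0xF0 | (cp >> 18) )) \
        $(( 0x80 | ((cp >> 12) & 0x3F) )) \
        $(( 0x80 | ((cp >> 6) & 0x3F) )) \
        $(( 0x80 | (cp & 0x3F) ))
}

# MATHEMATICAL SCRIPT CAPITAL T
utf8_octal_4byte 0x1D4AF
```

This prints \360\235\222\257, the same sequence ls shows for the single-character file name.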

Ced
-- 
Cedric Blancher <cedric.blancher@gmail.com>
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur

