bash 5.2.21-1: a bug in [0-9] expansion

Brian Inglis Brian.Inglis@SystematicSW.ab.ca
Mon Sep 1 17:19:18 GMT 2025


On 2025-08-31 13:06, Mariusz Wodzicki via Cygwin wrote:
> Description of the problem.
> [0-9] also matches certain Unicode superscript characters (namely ⁰ ⁴ ⁵ ⁶
> ⁷ ⁸ ⁹), and every Unicode subscript character.
> 
> Example: the directory has the following files:
> $ /bin/ls
> ₀.txt  ₁.txt  ₂.txt  ₃.txt  ₄.txt  ₅.txt  ₆.txt  ₇.txt  ₈.txt  ₉.txt
> ⁰.txt  ¹.txt  ².txt  ³.txt  ⁴.txt  ⁵.txt  ⁶.txt  ⁷.txt  ⁸.txt  ⁹.txt
> 
> $ /bin/ls [0-9].txt
> ₀.txt  ₁.txt  ₃.txt  ⁴.txt  ⁵.txt  ⁶.txt  ⁷.txt  ⁸.txt
> ⁰.txt  ₂.txt  ₄.txt  ₅.txt  ₆.txt  ₇.txt  ₈.txt
> 
> $ locale
> LANG=en_US.UTF-8
> LC_CTYPE="en_US.UTF-8"
> LC_NUMERIC="en_US.UTF-8"
> LC_TIME="en_US.UTF-8"
> LC_COLLATE="en_US.UTF-8"
> LC_MONETARY="en_US.UTF-8"
> LC_MESSAGES="en_US.UTF-8"
> LC_ALL=
> 
> System.
> Fully up to date Windows 11
> cygwin 3.6.4-1
> bash    5.2.21-1

For reproducible results, prefix commands with LC_ALL=C … (or possibly just
LC_COLLATE=C or LC_CTYPE=C, or use POSIX in place of C) to standardize the
locale. Otherwise many commands respect the current locale, and some respect
Unicode regardless of locale, e.g. `info wc`:

"Unless the environment variable ‘POSIXLY_CORRECT’ is set, GNU ‘wc’ treats the 
following Unicode characters as white space even if the current locale does not: 
U+00A0 NO-BREAK SPACE, U+2007 FIGURE SPACE, U+202F NARROW NO-BREAK SPACE, and 
U+2060 WORD JOINER."
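
As a rough sketch of the locale override suggested above, the following
compares what [0-9].txt expands to once the C locale is in effect. The scratch
directory and the three file names are made up for illustration and are not
from the original report. One assumption worth stating: the glob is expanded by
the shell before `ls` or any other command starts, so the sketch sets LC_ALL
inside the (sub)shell that performs the expansion rather than only prefixing
the command.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory and file names, for illustration only.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# One ASCII digit name plus one subscript and one superscript digit name.
touch 1.txt ₁.txt ⁴.txt

# The command substitution runs in a subshell, so the locale change stays
# local to it. LC_ALL is assigned before the glob is expanded, which means
# the bracket expression is evaluated under the C locale.
matches=$(LC_ALL=C; export LC_ALL; printf '%s\n' [0-9].txt)

printf 'Matches under LC_ALL=C: %s\n' "$matches"

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```

Under the C locale the bracket expression would be expected to match only the
ASCII digit name. Running the same printf without the override gives a direct
comparison against the behaviour described in the report.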

For GNU packages where info pages are the preferred documentation, such as
coreutils*, compilers and language processors, and tools packages, many details
do not appear in the man pages, which instead point to the full documentation,
for example:

"Full documentation <https://www.gnu.org/software/coreutils/wc> or available 
locally via: info '(coreutils) wc invocation'"

although `info wc` shows the same page.

—————
* [ arch b2sum base32 base64 basename cat chcon chgrp chmod chown chroot cksum 
comm cp csplit cut date dd df dir dircolors dirname du echo env expand expr 
factor false fmt fold gkill groups head hostid id install join link ln logname 
ls md5sum mkdir mkfifo mknod mktemp mv nice nl nohup nproc numfmt od paste 
pathchk pinky pr printenv printf ptx pwd readlink realpath rm rmdir runcon seq 
sha1sum sha224sum sha256sum sha384sum sha512sum shred shuf sleep sort split stat 
stdbuf stty sum sync tac tail tee test timeout touch tr true truncate tsort tty 
uname unexpand uniq unlink users vdir wc who whoami yes

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retrancher  but when there is no more to cut
                                 -- Antoine de Saint-Exupéry

