Regexps: locale-dependent inclusion of Cyrillic letters

Alexandre Garreau galex-713@galex-713.eu
Tue Jan 22 23:50:00 GMT 2019


I checked various letters found on Wikipedia [0].

My main system locale is fr_FR.UTF-8.

The results range from a standard Russian letter:

(string-match "[а-я]" "ё") => nil

grep '[а-я]' <<<'ё' => ё

LC_ALL=C grep '[а-я]' <<< 'ё' => ё

To a very uncommon letter (only present in Abkhaz, a Caucasian
language, according to Wikipedia):

(string-match "[а-я]" "ҩ") => nil

grep '[а-я]' <<< 'ҩ' => ҩ

LC_ALL=C grep '[а-я]' <<<'ҩ' => nil
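
For comparison, here is a minimal sketch of what the range [а-я] covers
when it is compared purely by Unicode code point. It uses Python's re
module, which is not one of the tools tested above; that Python orders
range endpoints by code point is a known fact, but whether Emacs or grep
do the same in a given locale is only an assumption here.

```python
import re

# The two endpoints of the range [а-я], plus the two letters tested above.
letters = ["а", "я", "ё", "ҩ"]

# Print each letter's Unicode code point.
for ch in letters:
    print(f"{ch} U+{ord(ch):04X}")

# Python's re module compares range endpoints by code point,
# so only letters from U+0430 to U+044F can match.
pattern = re.compile("[а-я]")
matches = {ch: bool(pattern.search(ch)) for ch in letters}
print(matches)
```

Under that assumption neither ё (U+0451) nor ҩ matches, since both sit
past я (U+044F). That agrees with the string-match results, while the
grep results would have to come from some other ordering.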

Why do the results differ like this?

[0] https://en.wikipedia.org/wiki/List_of_Cyrillic_letters


