[BUG REPORT] sed -e 's/[B-D]/_/g' replaces unexpected characters

Lavrentiev, Anton (NIH/NLM/NCBI) [C] lavr@ncbi.nlm.nih.gov
Tue Jun 25 15:46:00 GMT 2013


> Your locale is zh_CN.UTF-8.  What you're expecting is only guaranteed
> in the C locale:

I'm not quite sure that applies here.  I'm using US English Windows 7, with:

LANG = 'en_US.UTF-8'

I get the same result:

$ echo abcdeABCDE | sed -e 's/[B-D]/_/g'
ab__eA___E

BUT:

$ echo abcdeABCDE | LANG=C sed 's/[B-D]/_/g'
abcdeA___E

This is very weird, indeed.

OTOH, on Linux I have the same LANG setting, yet it works as
expected:

$ echo $LANG
en_US.UTF-8
$ echo abcdeABCDE | sed -e 's/[B-D]/_/g'
abcdeA___E

I believe that the en_US.UTF-8 encoding of "abcdeABCDE" is
byte-for-byte identical to ASCII, so the encoding by itself should
not explain the difference.
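
As a sketch of how to sidestep the locale dependence (assuming a POSIX
shell and a sed that honors LC_ALL; the variable names out_c and
out_list are only for illustration), one can either force the C locale
for that single command or spell out the set instead of using a range:

```shell
# Force the C locale for this one sed invocation, so that the range
# [B-D] is evaluated by byte value (B, C, D only).
out_c=$(echo abcdeABCDE | LC_ALL=C sed -e 's/[B-D]/_/g')
echo "$out_c"      # prints abcdeA___E

# Alternative: list the characters explicitly. There is no range here,
# so the locale's collation order does not come into play.
out_list=$(echo abcdeABCDE | sed -e 's/[BCD]/_/g')
echo "$out_list"   # prints abcdeA___E
```

LC_ALL takes precedence over LANG and over the individual LC_*
variables, so the first form should still help if LC_COLLATE happens
to be set separately.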

Anton Lavrentiev
Contractor NIH/NLM/NCBI


