Cygwin fails to utilize Unicode replacement character

Steven Penny svnpenn@gmail.com
Sat Sep 1 18:46:00 GMT 2018


On Sat, 1 Sep 2018 20:11:15, Thomas Wolff wrote:
> Which terminals are used and what's the output of `locale` and `cat 
> --version` in both cases?

Linux:

    $ echo "$TERM"
    xterm-256color

    $ locale
    LANG=en_US.UTF-8
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE=C
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=

    $ cat --version
    cat (GNU coreutils) 8.29

Cygwin:

    $ echo "$TERM"
    cygwin

    $ locale
    LANG=en_US.UTF-8
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="C"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_ALL=

    $ cat --version
    cat (GNU coreutils) 8.26

Note that in addition to Linux, Windows PowerShell also gives correct output:

    $ pwsh -c '[system.text.encoding]::UTF8.getString(0xEB)'
    �

Compare again with Cygwin:

    $ printf '\xEB'
    ▒
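
For reference, here is a minimal sketch of the behaviour I would expect, written in Python only because it makes the decoding step explicit. It is not how Cygwin or the terminal handles this internally. The helper names `decode_lossy` and `is_valid_utf8` are made up for illustration. A lone 0xEB byte is the start of a three-byte UTF-8 sequence with nothing after it, so a decoder in replacement mode should emit U+FFFD:

```python
# Sketch: what a UTF-8 decoder in "replace" mode does with the lone byte 0xEB.
# decode_lossy and is_valid_utf8 are hypothetical helper names.


def decode_lossy(raw: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for undecodable bytes."""
    return raw.decode("utf-8", errors="replace")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is a well-formed UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


data = bytes([0xEB])  # the same single byte that printf '\xEB' writes

print(is_valid_utf8(data))                  # the byte alone is not valid UTF-8
print(f"U+{ord(decode_lossy(data)):04X}")   # code point substituted for it
```

The second line printed is the code point of the replacement character, which is what Linux and PowerShell display above.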

