[Bug localedata/22387] Replace unicode sequences <Uxxxx> for characters inside the ASCII printable range

joseph at codesourcery dot com sourceware-bugzilla@sourceware.org
Thu Nov 9 16:31:00 GMT 2017


https://sourceware.org/bugzilla/show_bug.cgi?id=22387

--- Comment #21 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
On Thu, 9 Nov 2017, keld at keldix dot com wrote:

> Yes all source files should be converted from Ascii to the ebcdic in question.
> This is also the case on UTF-16 systems, the source files should be converted
> from some sort of ascii compatible encoding to UTF-16. Or the other way - if
> you move sources from a non ascii-compatible system to an ascii-compatible
> system.
> 
> This process can be done automatically using eg iconv.

No, it can't be done automatically without having information somewhere
about which character set each source file is in.  It's entirely possible
that some files, e.g. those representing expected output of testcases, are
in mixed character sets.  In any case, such files represent particular
sequences of octets that must be preserved, because they are compared
against test output in particular locales.
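
To make the point about mixed character sets concrete, here is a minimal
sketch in Python.  The data and the convert() helper are hypothetical and
are not taken from the glibc tree; they only imitate a converter that is
given one assumed source charset for a whole file.

```python
# Hypothetical expected-output data: the same word stored once in
# ISO-8859-1 and once in UTF-8, as a test for two locales might do.
latin1_line = "caf\u00e9\n".encode("iso-8859-1")  # b'caf\xe9\n'
utf8_line = "caf\u00e9\n".encode("utf-8")         # b'caf\xc3\xa9\n'
mixed = latin1_line + utf8_line


def convert(data: bytes, source: str, target: str) -> bytes:
    """Decode with one assumed charset and re-encode, like a blind converter."""
    return data.decode(source).encode(target)


# Guess 1: treat the whole file as UTF-8.  The lone 0xE9 byte from the
# ISO-8859-1 line is not valid UTF-8, so the conversion fails outright.
try:
    convert(mixed, "utf-8", "utf-16-le")
    utf8_guess_ok = True
except UnicodeDecodeError:
    utf8_guess_ok = False

# Guess 2: treat the whole file as ISO-8859-1.  This never fails, but the
# UTF-8 line is silently misread as two separate characters.
misread = mixed.decode("iso-8859-1") != "caf\u00e9\ncaf\u00e9\n"

# Either way, the converted octets no longer equal what the test compares.
octets_preserved = convert(mixed, "iso-8859-1", "utf-8") == mixed

print(utf8_guess_ok, misread, octets_preserved)
```

Neither guess yields the octets the test expects, so a converter would
need per-file (or per-line) charset information that is not recorded
anywhere.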

glibc does not make any attempt to support locales that are not more or 
less ASCII compatible, and does not make any attempt to support 16-bit 
bytes (which are not supported by POSIX either) which would be needed for 
UTF-16 to be a valid locale encoding.  We should not pretend that it does, 
any more than we should pretend it supports non-ELF object formats.
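
As an illustration only (the comment above does not spell this out), one
commonly cited reason UTF-16 does not fit a model with 8-bit bytes is that
it embeds zero octets in ordinary text.  The c_string_prefix() helper below
is hypothetical and merely imitates what a NUL-terminated string routine
would see.

```python
text = "locale"
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")


def c_string_prefix(data: bytes) -> bytes:
    """Return the octets up to the first zero octet, as strlen() would count."""
    return data.split(b"\x00", 1)[0]


# UTF-8 is ASCII compatible: ASCII text keeps exactly the same octets.
ascii_compatible = utf8 == text.encode("ascii")

# In UTF-16 every ASCII character is followed by a zero octet, so an
# 8-bit NUL-terminated routine stops after the first character.
seen_utf8 = c_string_prefix(utf8)
seen_utf16 = c_string_prefix(utf16)

print(ascii_compatible, seen_utf8, seen_utf16)
```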


