This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



gcc ignores locale (no UTF-8 source code supported)


If I compile a program (source code encoded in ISO 8859-1) that contains
the line

  wprintf(L"Schöne Grüße!\n");

with

  LANG=en_GB.ISO-8859-1 gcc -W -Wall -O widetest.c -o widetest

then the resulting binary correctly contains the UCS-4 encoded wide
string

000005c0  53 00 00 00 63 00 00 00  68 00 00 00 f6 00 00 00  S...c...h...ö...
000005d0  6e 00 00 00 65 00 00 00  20 00 00 00 47 00 00 00  n...e... ...G...
000005e0  72 00 00 00 fc 00 00 00  df 00 00 00 65 00 00 00  r...ü...ß...e...
000005f0  21 00 00 00 0a 00 00 00  00 00 00 00 00 00 00 00  !...............

However, if I accidentally work in a UTF-8 locale and compile the
ISO 8859-1 source code with

  LANG=en_GB.UTF-8 gcc -W -Wall -O widetest.c -o widetest

then no warning message is issued, even though the bytes f6, fc and df
do not form valid UTF-8 sequences here, and the resulting binary still
contains the result of the above ISO 8859-1 -> UCS-4 translation.
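
For illustration, the translation seen in both binaries amounts to
zero-extending each source byte to a 32-bit value. The following is only
a sketch of that behaviour, not gcc's actual code; the function name
latin1_to_ucs4 is made up for this example, and a 32-bit wide character
is assumed, as in the dump above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from gcc: widen each ISO 8859-1 byte
   to the UCS-4 code point with the same numeric value.  Stores at most
   dst_len code points including the terminating 0 and returns the
   number of code points stored, not counting the terminator. */
size_t latin1_to_ucs4(const unsigned char *src, uint32_t *dst, size_t dst_len)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL && dst_len > 0);
    while (src[n] != 0 && n + 1 < dst_len) {
        dst[n] = src[n];    /* e.g. 0xf6 becomes 0x000000f6 */
        n++;
    }
    dst[n] = 0;
    return n;
}
```

Applied to the ISO 8859-1 bytes of the string above, this yields the code
points 53 63 68 f6 6e 65 20 47 72 fc df 65 21 0a, independent of any
locale setting.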

It seems that gcc ignores the locale and does not use glibc's multi-byte
decoding functions to read in wide-string literals. :-(
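
For comparison, here is a rough sketch of what locale-aware decoding
could look like with the standard setlocale() and mbstowcs() functions.
The helper name decode_in_locale and its return codes are my own
invention for this example, and I do not claim that gcc should do it
exactly this way.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode src according to the character encoding
   of the named locale, using mbstowcs().
   Returns the number of wide characters stored, -1 if the locale is
   not available, or -2 if src is not valid in that locale's encoding. */
long decode_in_locale(const char *locale, const char *src,
                      wchar_t *dst, size_t dst_len)
{
    char saved[256];
    const char *cur;
    size_t n;

    assert(locale != NULL && src != NULL && dst != NULL && dst_len > 0);

    /* Remember the current LC_CTYPE setting so it can be restored. */
    cur = setlocale(LC_CTYPE, NULL);
    strncpy(saved, cur ? cur : "C", sizeof saved - 1);
    saved[sizeof saved - 1] = '\0';

    if (setlocale(LC_CTYPE, locale) == NULL)
        return -1;                  /* locale not available */

    n = mbstowcs(dst, src, dst_len - 1);
    setlocale(LC_CTYPE, saved);

    if (n == (size_t)-1)
        return -2;                  /* invalid multi-byte sequence */
    dst[n] = L'\0';
    return (long)n;
}
```

On a system where en_GB.UTF-8 is installed, I would expect a call such as
decode_in_locale("en_GB.UTF-8", "Sch\xf6ne", buf, 32) to return -2,
because the byte f6 followed by 'n' is not a valid UTF-8 sequence. That
is the point at which a compiler could issue a warning.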

$ gcc -v
Reading specs from /usr/lib/gcc-lib/i486-suse-linux/2.95.2/specs
gcc version 2.95.2 19991024 (release)

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

