Bug 2413

Summary: gencat treats second 0x5c byte of ja_JP.sjis sequence as continuation
Product: glibc Reporter: Paul Brett <pmbrett>
Component: localedata Assignee: GNU C Library Locale Maintainers <libc-locales>
Status: RESOLVED INVALID    
Severity: normal CC: glibc-bugs, rsa, sjmunroe
Priority: P2 Flags: fweimer: security-
Version: 2.3.3   
Target Milestone: ---   
Host: x86_64 Target: x86_64
Build: x86_64 Last reconfirmed:

Description Paul Brett 2006-03-02 22:58:50 UTC
In the simple ja_JP.sjis message file

$ /bin/echo -e "1 \224\\" >/tmp/tt

$ hdump /tmp/tt                   
00000000 31 20 94 5C 0A                                  1 .\.

gencat gives an error:

$ LANG=ja_JP.sjis gencat --new /tmp/tt.out /tmp/tt
/tmp/tt:1: invalid character: message ignored

This input file contains a single message, numbered 1, consisting of the SJIS
character represented by the double-byte sequence 0x94 0x5C.

The 0x5C seems to be treated as an ASCII backslash, which is a continuation
character in gencat.
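
The suspected behaviour can be sketched in Python. This is not gencat's code; the function names are hypothetical and the lead-byte ranges are the usual Shift_JIS ones (0x81-0x9F and 0xE0-0xFC). It only illustrates how a byte-wise check for a trailing backslash differs from one that skips double-byte characters.

```python
def is_sjis_lead(b: int) -> bool:
    # Lead-byte ranges of Shift_JIS double-byte characters (assumed here).
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def ends_with_backslash_bytewise(line: bytes) -> bool:
    # Naive check: looks only at the last byte of the line.
    return line.endswith(b"\\")


def ends_with_backslash_sjis(line: bytes) -> bool:
    # Walk the line character by character, consuming two bytes
    # whenever a lead byte is followed by another byte.
    i = 0
    last_is_backslash = False
    while i < len(line):
        if is_sjis_lead(line[i]) and i + 1 < len(line):
            i += 2
            last_is_backslash = False
        else:
            last_is_backslash = line[i] == 0x5C
            i += 1
    return last_is_backslash


line = b"1 \x94\x5c"  # contents of /tmp/tt without the newline
print(ends_with_backslash_bytewise(line))  # True
print(ends_with_backslash_sjis(line))      # False
```

If gencat scans the line along the lines of the first function, the message would look like it ends in a continuation backslash, which would be consistent with the error above.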

Comment 1 Ulrich Drepper 2006-05-02 03:24:01 UTC
You cannot use SJIS as a locale; it is not ASCII-safe.
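
One way to read "not ASCII-safe" (an interpretation, not something the comment spells out) is that a multibyte character may contain bytes below 0x80. The sketch below uses Python's standard `shift_jis` and `euc_jp` codecs to compare the two encodings of the character from the report; the helper `ascii_range_bytes` is made up for illustration.

```python
def ascii_range_bytes(encoded: bytes) -> list[str]:
    # Bytes of the encoded character that fall in the ASCII range.
    return [hex(b) for b in encoded if b < 0x80]


# Decode the two bytes from the report into a single character.
ch = b"\x94\x5c".decode("shift_jis")

sjis = ch.encode("shift_jis")
eucjp = ch.encode("euc_jp")

print(ascii_range_bytes(sjis))   # ['0x5c']
print(ascii_range_bytes(eucjp))  # []
```

In EUC-JP both bytes of this character are 0x80 or above, so a byte-wise search for a backslash cannot match inside it.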