bug in SJIS converter

Bruno Haible haible@ilog.fr
Mon Jul 31 08:26:00 GMT 2000


The iconv converter module for Japanese SJIS has a bug: in the SJIS to Unicode
direction, it accepts nonexistent input characters in the ranges
   0x8208..0x823C
   0x8300..0x8331
   ...
   0xEA00..0xEA3C
and returns nonsense values, as if it had first applied the pre-mapping
(ch1,ch2) -> (ch1-1,ch2+0xC0).
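
One way to see where such a pre-mapping could come from is the sketch below. It
assumes, purely for illustration, that the converter looks up a table laid out
in rows of 192 entries (trail bytes 0x40..0xFF) per lead byte; the indexing
expression and the name table_index are hypothetical and are not quoted from
sjis.c. Under that assumption, a trail byte below 0x40 lands in the previous
lead byte's row.

```c
/* Hypothetical row-major index into a lookup table with 192 entries
   (trail bytes 0x40..0xFF) per lead byte, starting at lead byte 0x81.
   This layout is an assumption made for illustration only.  */
static int
table_index (unsigned int ch, unsigned int ch2)
{
  return ((int) ch - 0x81) * 192 + ((int) ch2 - 0x40);
}

/* Nonzero if (ch,ch2) and (ch-1,ch2+0xC0) select the same table slot,
   i.e. if the two byte pairs alias each other under the layout above.  */
static int
aliases_previous_row (unsigned int ch, unsigned int ch2)
{
  return table_index (ch, ch2) == table_index (ch - 1, ch2 + 0xC0);
}
```

For example, table_index (0x82, 0x08) and table_index (0x81, 0xC8) both
evaluate to 136, so the invalid input 0x8208 would be decoded like 0x81C8.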

Here is a fix.

2000-07-30  Bruno Haible  <haible@clisp.cons.org>

	* iconvdata/sjis.c (BODY for FROM_LOOP): Treat the case
	ch >= 0x81 && ch2 < 0x40 as invalid.

*** glibc-20000729/iconvdata/sjis.c.bak	Wed Jul 12 18:11:43 2000
--- glibc-20000729/iconvdata/sjis.c	Sun Jul 30 23:49:05 2000
***************
*** 4388,4394 ****
  									      \
  	ch2 = inptr[1];							      \
  	idx = ch * 256 + ch2;						      \
! 	if (__builtin_expect (idx, 0x8140) < 0x8140			      \
  	    || (__builtin_expect (idx, 0x8140) > 0x84be && idx < 0x889f)      \
  	    || (__builtin_expect (idx, 0x8140) > 0x88fc && idx < 0x8940)      \
  	    || (__builtin_expect (idx, 0x8140) > 0x9ffc && idx < 0xe040)      \
--- 4388,4395 ----
  									      \
  	ch2 = inptr[1];							      \
  	idx = ch * 256 + ch2;						      \
! 	if (__builtin_expect (ch < 0x81, 0)				      \
! 	    || __builtin_expect (ch2 < 0x40, 0)				      \
  	    || (__builtin_expect (idx, 0x8140) > 0x84be && idx < 0x889f)      \
  	    || (__builtin_expect (idx, 0x8140) > 0x88fc && idx < 0x8940)      \
  	    || (__builtin_expect (idx, 0x8140) > 0x9ffc && idx < 0xe040)      \
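
The effect of the changed line can be checked in isolation with a small sketch.
It reproduces only the first clause of the condition before and after the
patch; the remaining range tests in the hunk are unchanged and deliberately
left out. The function names rejected_before and rejected_after are
hypothetical helpers, not part of sjis.c.

```c
/* First clause before the patch: a two-byte sequence is rejected here
   only when the combined value is below 0x8140.  */
static int
rejected_before (unsigned char ch, unsigned char ch2)
{
  unsigned int idx = ch * 256u + ch2;
  return idx < 0x8140;
}

/* Replacement clauses from the patch: reject a lead byte below 0x81
   and, independently, any trail byte below 0x40.  */
static int
rejected_after (unsigned char ch, unsigned char ch2)
{
  return ch < 0x81 || ch2 < 0x40;
}
```

With these helpers, the input 0x82 0x08 passes the old clause (rejected_before
returns 0) but is caught by the new one (rejected_after returns 1), while a
valid sequence such as 0x81 0x40 is rejected by neither.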

