bug in strxfrm
Bruno Haible
haible@ilog.fr
Mon Jul 31 08:21:00 GMT 2000
The inline function utf8_encode, used by strxfrm, currently returns 0
(instead of the result's length) if the argument is >= 0x80.
Here is a patch which
- fixes the bug,
- further optimizes this function by using simple arithmetic instead of
memory accesses.
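
As a quick sanity check on the second point, the following sketch compares the two arithmetic expressions from the patch against the tables being removed. The helper names (mask_for_step, lead_byte_for_step, check_tables) are made up for this illustration and do not exist in glibc. The lead-byte expression right-shifts a negative int, so it assumes an arithmetic right shift, which is what GCC provides.

```c
#include <assert.h>
#include <stdint.h>

/* The lookup tables removed by the patch, copied from the diff.  */
static const uint32_t encoding_mask[] =
{
  ~0x7ff, ~0xffff, ~0x1fffff, ~0x3ffffff
};

static const unsigned char encoding_byte[] =
{
  0xc0, 0xe0, 0xf0, 0xf8, 0xfc
};

/* Hypothetical helper: bits that must be clear for VAL to fit
   into a STEP-byte sequence (expression taken from the patch).  */
uint32_t
mask_for_step (int step)
{
  return ~(uint32_t) 0 << (5 * step + 1);
}

/* Hypothetical helper: first byte of a STEP-byte sequence before
   the payload bits are ORed in (expression taken from the patch).
   Assumes >> on a negative int is an arithmetic shift.  */
unsigned char
lead_byte_for_step (int step)
{
  return (unsigned char) (~0xff >> step);
}

/* Compare the computed values against every table entry.  */
void
check_tables (void)
{
  int step;

  for (step = 2; step < 6; ++step)
    assert (mask_for_step (step) == encoding_mask[step - 2]);
  for (step = 2; step <= 6; ++step)
    assert (lead_byte_for_step (step) == encoding_byte[step - 2]);
}
```

Calling check_tables () from a small test program should pass silently if the expressions and the tables agree for steps 2 through 6.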
2000-07-30 Bruno Haible <haible@clisp.cons.org>
* string/strxfrm.c (encoding_mask, encoding_byte): Remove.
(utf8_encode): Use simple shifts instead. Fix return value.
*** glibc-20000729/string/strxfrm.c.bak Wed Jan 26 23:17:05 2000
--- glibc-20000729/string/strxfrm.c Sun Jul 30 10:29:50 2000
***************
*** 46,69 ****
#ifndef WIDE_CHAR_VERSION
- /* These are definitions used by some of the functions for handling
- UTF-8 encoding below. */
- static const uint32_t encoding_mask[] =
- {
- ~0x7ff, ~0xffff, ~0x1fffff, ~0x3ffffff
- };
-
- static const unsigned char encoding_byte[] =
- {
- 0xc0, 0xe0, 0xf0, 0xf8, 0xfc
- };
-
/* We need UTF-8 encoding of numbers. */
static inline int
utf8_encode (char *buf, int val)
{
- char *startp = buf;
int retval;
if (val < 0x80)
--- 47,57 ----
***************
*** 76,86 ****
int step;
for (step = 2; step < 6; ++step)
! if ((val & encoding_mask[step - 2]) == 0)
break;
retval = step;
! *buf = encoding_byte[step - 2];
--step;
do
{
--- 64,74 ----
int step;
for (step = 2; step < 6; ++step)
! if ((val & (~(uint32_t)0 << (5 * step + 1))) == 0)
break;
retval = step;
! *buf = (unsigned char) (~0xff >> step);
--step;
do
{
***************
*** 91,97 ****
*buf |= val;
}
! return buf - startp;
}
#endif
--- 79,85 ----
*buf |= val;
}
! return retval;
}
#endif
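
For readers who want to try the patched logic outside of glibc, here is a standalone sketch of the function with the patch applied. The lines the diff does not show (the val < 0x80 branch and the body of the do loop) are my reconstruction and should be treated as assumptions, not as the actual glibc source. The names utf8_encode_sketch and utf8_encode_selfcheck are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of utf8_encode after the patch.  Returns the number of
   bytes written to BUF, which must have room for up to 6 bytes.  */
int
utf8_encode_sketch (char *buf, int val)
{
  int retval;

  if (val < 0x80)
    {
      /* Assumed: this branch is not shown in the diff.  */
      *buf = (char) val;
      retval = 1;
    }
  else
    {
      int step;

      /* Find the shortest sequence length that can hold VAL.  */
      for (step = 2; step < 6; ++step)
        if ((val & (~(uint32_t) 0 << (5 * step + 1))) == 0)
          break;
      retval = step;

      /* Lead byte: STEP high bits set.  Assumes arithmetic >>.  */
      *buf = (unsigned char) (~0xff >> step);
      --step;
      do
        {
          /* Assumed loop body: fill continuation bytes from the end,
             6 payload bits each.  */
          buf[step] = (char) (0x80 | (val & 0x3f));
          val >>= 6;
        }
      while (--step > 0);
      *buf |= val;
    }

  /* BUF is never advanced in the multi-byte branch, which would
     explain why returning buf - startp gave 0 there.  */
  return retval;
}

/* A few spot checks on lengths and bytes.  */
void
utf8_encode_selfcheck (void)
{
  char b[6];

  assert (utf8_encode_sketch (b, 0x41) == 1);
  assert (utf8_encode_sketch (b, 0xe9) == 2);
  assert ((unsigned char) b[0] == 0xc3 && (unsigned char) b[1] == 0xa9);
  assert (utf8_encode_sketch (b, 0x20ac) == 3);
  assert ((unsigned char) b[0] == 0xe2 && (unsigned char) b[1] == 0x82
          && (unsigned char) b[2] == 0xac);
}
```

The spot checks use U+0041, U+00E9 and U+20AC, whose UTF-8 encodings are 41, C3 A9 and E2 82 AC respectively.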