Crash (and crude workaround) for glibc2.x wcsxfrm()
neideck@qkal.sap-ag.de
Thu Apr 22 03:22:00 GMT 1999
> > From: neideck@qkal.sap-ag.de
> > Feeding wchar_t strings with characters outside the range 0-255 into
> > the wcsxfrm() function when under any of the 8-bit locales (such as de_DE)
> > leads to a crash. This works on other operating systems such as HP/UX and
> > Digital Unix (and basically these characters are ignored).
>
> By 'ignored', you mean deleted from the string, or passed through unchanged?
It depends. HP/UX 10.20 simply refuses to map the strings and returns -1
(i.e. a mapping error). Digital Unix V4.0D implements something that
resembles my fix, i.e. it clears the upper bits of the values. In retrospect,
the HP/UX behaviour is more reasonable, and the Digital Unix man page itself
suggests that it shouldn't do what it does either. Quoting the Digital Unix
man page:
"On error, the wcsxfrm() function returns (size_t)-1 and sets errno to
indicate the error.
ERRORS
If any the following conditions occur, the wcsxfrm() function sets errno to
the corresponding value:
[EINVAL] The ws2 parameter contains wide-character codes outside the
domain of the collating sequence defined by the current locale.
"
> > <fix to suppress upper bits deleted>
> Certainly this is wrong. Probably the right thing to do is pass the
> character through unchanged, because wcscoll should treat such
> characters like wcscmp does.
Passing the character through unchanged leads to a crash in get_weight.
Both wcsxfrm() and wcscoll() should check for characters outside the input
range. wcsxfrm() should then do no translation and return -1; wcscoll() can
do whatever it wants and set "errno" to EINVAL (since it cannot return
an error code).
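
A minimal sketch of the check described above, written as wrappers around
the library calls. This is not the glibc code. The names in_assumed_range,
checked_wcsxfrm and checked_wcscoll are hypothetical, the limit of 255 is an
assumption matching the 8-bit locales discussed here, and falling back to
wcscmp() is only one possible reading of "whatever it wants":

```c
#include <errno.h>
#include <stddef.h>
#include <wchar.h>

/* Assumption: an 8-bit locale can only collate codes 0..255. */
#define ASSUMED_MAX_WC 0xFFUL

/* Hypothetical helper: returns 1 if every character of ws is in range. */
static int in_assumed_range(const wchar_t *ws)
{
    for (; *ws != L'\0'; ws++) {
        /* The cast also catches negative values when wchar_t is signed. */
        if ((unsigned long)*ws > ASSUMED_MAX_WC)
            return 0;
    }
    return 1;
}

/* Refuse to transform out-of-range input: no translation, errno = EINVAL. */
size_t checked_wcsxfrm(wchar_t *dest, const wchar_t *src, size_t n)
{
    if (!in_assumed_range(src)) {
        errno = EINVAL;
        return (size_t)-1;
    }
    return wcsxfrm(dest, src, n);
}

/* wcscoll() has no error return, so signal through errno only. */
int checked_wcscoll(const wchar_t *a, const wchar_t *b)
{
    if (!in_assumed_range(a) || !in_assumed_range(b)) {
        errno = EINVAL;
        return wcscmp(a, b); /* assumed fallback: plain code-point order */
    }
    return wcscoll(a, b);
}
```

A caller would set errno to 0 before calling checked_wcscoll() and inspect it
afterwards, since the return value alone cannot signal the error.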
What I haven't been able to figure out so far is where to find the valid
range of characters for the current locale (it seems to be hardcoded at
8 bits).
I've attached a program to demonstrate the problem this time.
Burkhard Neidecker-Lutz
CEC Karlsruhe , SAP AG, neideck@qkal.sap-ag.de