[PATCH] CJK ambiguous width for non-Unicode charsets
Andy Koppe
andy.koppe@gmail.com
Fri Nov 26 04:42:00 GMT 2010
On 18 November 2010 11:03, Corinna Vinschen wrote:
> On Nov 17 21:34, Andy Koppe wrote:
>> On 16 November 2010 17:58, Corinna Vinschen wrote:
>> > On Nov 9 22:06, Andy Koppe wrote:
>> >> The attached small patch affects character widths as reported by
>> >> wcwidth(). It addresses an obscure issue.
>> >> [...]
>> >> * libc/locale/locale.c: Fix ambiguous width to one for singlebyte
>> >> charsets and two for non-Unicode multibyte charsets.
>> >
>> > This appears to make a lot of sense. Would you mind enhancing your
>> > patch slightly to also fix the description in the locale.c
>> > documentation? There's a related paragraph starting with "This
>> > implementation also supports a single modifier, <<"cjknarrow">>..."
>>
>> Sorry, I hadn't seen that. Amended patch attached.
>>
>> * libc/locale/locale.c (loadlocale): Fix width of CJK ambiguous
>> characters to 1 for singlebyte charsets and 2 for non-Unicode
>> multibyte charsets. Change documentation accordingly.
>
> Thank you. Applied with a minor change. @ is a special character
> in the docs and has to be doubled ("@@") to be treated literally.
> I just removed it entirely since the @ is not part of the modifier
> itself.

Thanks.

In further testing I realised that the cjknarrow modifier wasn't
implemented for "C.<charset>" locales (since previously there was no
point in that). Patch attached to make it work.

	* libc/locale/locale.c (loadlocale): Recognise the "cjknarrow"
	modifier on "C.<charset>" locales too.

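For illustration only, here is a minimal sketch of what recognising the modifier amounts to. This is not the code from the patch: has_cjknarrow() is a hypothetical helper, and it assumes the modifier is whatever follows the first '@' in the locale name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of newlib: reports whether a locale
   name such as "C.GBK@cjknarrow" carries the "cjknarrow" modifier. */
static bool
has_cjknarrow(const char *locale)
{
    assert(locale != NULL);

    /* Assumption: the modifier is the text after the first '@'. */
    const char *mod = strchr(locale, '@');
    return mod != NULL && strcmp(mod + 1, "cjknarrow") == 0;
}
```

With this sketch, has_cjknarrow("C.GBK@cjknarrow") is true and has_cjknarrow("C.GBK") is false.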
Here's a small test for this:
$ cat width.c
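#include <wchar.h>
#include <locale.h>
#include <stdio.h>
int main(void) {
    /* Pick up the locale from the environment (LANG etc.). */
    setlocale(LC_CTYPE, "");
    /* Print the name of the locale that is actually in effect. */
    puts(setlocale(LC_CTYPE, 0));
    /* U+00A1 is a CJK ambiguous-width character. */
    puts(wcwidth(0xA1) == 1 ? "narrow" : "wide");
    return 0;
}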
$ cc width.c
$ ./a
C.UTF-8
narrow
$ LANG=C.GBK ./a
C.GBK
wide
$ LANG=C.GBK@cjknarrow ./a
C.GBK@cjknarrow
narrow
$ LANG=ja_JP.UTF-8 ./a
ja_JP.UTF-8
wide
$ LANG=ja_JP.UTF-8@cjknarrow ./a
ja_JP.UTF-8@cjknarrow
narrow
$ LANG=de_DE.UTF-8 ./a
de_DE.UTF-8
narrow
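
The results above are consistent with the following rule, sketched here as a hypothetical helper. ambiguous_width() and its parameters are made up for illustration and are not the loadlocale() code. The singlebyte and non-Unicode multibyte cases follow the ChangeLog entries in this thread. The UTF-8 case is inferred from the test output, and treating "ko" and "zh" like "ja" is an assumption that the test does not exercise.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not the code in loadlocale(): returns the column
   width that wcwidth() would be expected to report for a CJK
   ambiguous-width character under the given locale settings. */
static int
ambiguous_width(const char *lang, const char *charset,
                int mb_cur_max, bool cjknarrow)
{
    assert(lang != NULL && charset != NULL);

    if (cjknarrow)
        return 1;   /* the modifier forces narrow */
    if (mb_cur_max == 1)
        return 1;   /* singlebyte charset */
    if (strcmp(charset, "UTF-8") != 0)
        return 2;   /* non-Unicode multibyte charset, e.g. GBK */

    /* Assumption: with UTF-8, only CJK languages get wide characters. */
    return (strncmp(lang, "ja", 2) == 0
            || strncmp(lang, "ko", 2) == 0
            || strncmp(lang, "zh", 2) == 0) ? 2 : 1;
}
```

For example, ambiguous_width("C", "GBK", 2, false) gives 2 and ambiguous_width("ja", "UTF-8", 6, true) gives 1, matching the "C.GBK" and "ja_JP.UTF-8@cjknarrow" runs above.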
Andy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ambiwidth3.patch
Type: application/octet-stream
Size: 1868 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/newlib/attachments/20101126/462ffe7e/attachment.obj>