This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.
Re: New GB18030 gconv module for glibc (from ThizLinux Laboratory)
> > No, this is wrong. As I have said, in the Unicode standard, U+33FF
> > is "legal" but "unassigned". If gb18030.c says it is "illegal", it is
> > glibc's bug.
>
> No. This is an incorrect input. Period. There is no discussion
> about it. I've already said that no character which is not in the
> current UnicodeData list must be converted.
Can you then please explain why the attached program prints "33ff" in
the current glibc? It should fail, since it converts a character that
is not in the current UnicodeData.
Regards,
Martin
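#include <iconv.h>
#include <stdio.h>
#include <wchar.h>

/* U+33FF encoded as UTF-8 */
char msg[] = "\xe3\x8f\xbf";
/* Output buffer; reading result[0] assumes a 32-bit wchar_t, as on glibc */
wchar_t result[10];

int main(void)
{
    iconv_t conv;
    size_t inbytes = sizeof(msg);       /* includes the trailing NUL */
    size_t outbytes = sizeof(result);   /* sizeof already yields bytes */
    char *in = msg;
    char *out = (char *)result;

    conv = iconv_open("UCS-4LE", "UTF-8");
    if (conv == (iconv_t)-1) {
        perror("iconv_open");
        return 1;
    }
    /* iconv returns size_t, so the error value is (size_t)-1 */
    if (iconv(conv, &in, &inbytes, &out, &outbytes) == (size_t)-1) {
        perror("iconv");
    } else {
        printf("%x\n", (unsigned int)result[0]);
    }
    iconv_close(conv);
    return 0;
}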
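
For readers checking the input by hand: the three bytes in msg have the shape of a three-byte UTF-8 sequence, and the bit arithmetic below shows why they correspond to U+33FF. This is only a minimal sketch. The helper name decode_utf8_3byte is hypothetical, it is not part of glibc or iconv, and it makes no claim about how glibc's converter is written. It checks only the bit pattern and the overlong form; it does not consult UnicodeData and does not reject surrogates.

```c
/* Hypothetical helper for illustration only (not a glibc or iconv API).
   Decodes one three-byte UTF-8 sequence of the form
   1110xxxx 10xxxxxx 10xxxxxx into a code point.
   Returns -1 if the bytes do not have that shape or the form is overlong. */
long decode_utf8_3byte(const unsigned char *s)
{
    long cp;

    /* Lead byte must be 1110xxxx; both trailing bytes must be 10xxxxxx. */
    if ((s[0] & 0xF0) != 0xE0 ||
        (s[1] & 0xC0) != 0x80 ||
        (s[2] & 0xC0) != 0x80)
        return -1;

    /* 4 payload bits from the lead byte, 6 from each trailing byte. */
    cp = ((long)(s[0] & 0x0F) << 12)
       | ((long)(s[1] & 0x3F) << 6)
       |  (long)(s[2] & 0x3F);

    /* Values below U+0800 must use a shorter encoding. */
    if (cp < 0x800)
        return -1;

    return cp;
}
```

Called on the bytes e3 8f bf, the helper combines 0x3 << 12, 0x0f << 6 and 0x3f, giving 0x33ff.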