Fwd: How to printf Japanese Word

Qiang Wang rurality.wq@gmail.com
Thu May 28 10:25:00 GMT 2009


Hi Jeff and IWAMURO,

I recompiled newlib with --enable-newlib-mb and the result is now
correct.
I can print Japanese correctly without setting the locale.

Thank you very much.
I would also be glad to help with work on newlib if that is needed.

best regards
wangqiang

2009/5/28 Jeff Johnston <jjohnstn@redhat.com>:
> Qiang Wang wrote:
>>
>> hi,
>> TO Jeff:
>> you are right.
>> the hex representation is as below.
>> 8A 4A 82 AF 82 E9 20 6F 70 65 6E 20 2F 6A 66 66
>> 73 32 2F 50 65 72 73 6F 6E 5F 58 47 41 2E 72 73
>> 65 6E 63 20 66 61 69 6C 65 64 0A 00
>>
>> It mixes Japanese and English words and ends with "0A".
>> Do you mean that this is a bug or something like it?
>>
>>
>
> No.  I was wondering if a Japanese character contained 0A as one of its
> bytes.  The newline
> character as a single-byte character at the end of your string is fine.  I
> suppose I could have looked it up to see if 0A is a valid MSB or LSB of a
> multibyte character.  In any case, the code would not be prepared for this.
>
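The question of whether 0A can be part of a multibyte character can be checked mechanically. The sketch below assumes the posted bytes are Shift_JIS (CP932), which the thread does not confirm; the helper names and the thread_msg array are made up for illustration. Under that assumption 0x0A is outside both the lead-byte and the trail-byte ranges, so it can only be a single-byte newline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed encoding: Shift_JIS (CP932). Lead bytes are 0x81-0x9F and 0xE0-0xFC. */
int sjis_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Trail bytes are 0x40-0x7E and 0x80-0xFC. */
int sjis_is_trail(unsigned char b)
{
    return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
}

/* Count double-byte characters in buf.
 * Returns -1 if a lead byte is not followed by a valid trail byte. */
int sjis_count_double(const unsigned char *buf, size_t len)
{
    int count = 0;
    size_t i = 0;

    while (i < len) {
        if (sjis_is_lead(buf[i])) {
            if (i + 1 >= len || !sjis_is_trail(buf[i + 1]))
                return -1;
            count++;
            i += 2;
        } else {
            i += 1;             /* ASCII or other single-byte value */
        }
    }
    return count;
}

/* The bytes posted in this thread, without the trailing NUL. */
const unsigned char thread_msg[] = {
    0x8A, 0x4A, 0x82, 0xAF, 0x82, 0xE9, 0x20, 0x6F,
    0x70, 0x65, 0x6E, 0x20, 0x2F, 0x6A, 0x66, 0x66,
    0x73, 0x32, 0x2F, 0x50, 0x65, 0x72, 0x73, 0x6F,
    0x6E, 0x5F, 0x58, 0x47, 0x41, 0x2E, 0x72, 0x73,
    0x65, 0x6E, 0x63, 0x20, 0x66, 0x61, 0x69, 0x6C,
    0x65, 0x64, 0x0A
};

#ifdef STANDALONE   /* build with -DSTANDALONE to run this on its own */
int main(void)
{
    assert(!sjis_is_lead(0x0A) && !sjis_is_trail(0x0A));
    printf("double-byte characters: %d\n",
           sjis_count_double(thread_msg, sizeof thread_msg));
    return 0;
}
#endif
```

If the source were in EUC-JP or UTF-8 instead, the ranges would differ, but in both of those encodings every byte of a multibyte character is 0x80 or above, so 0x0A could not appear inside one either.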
> The printf code is implemented by libc/stdio/vfprintf.c.  There is a code
> sequence there as follows:
>
>       for (;;) {
>               cp = fmt;
> #ifdef _MB_CAPABLE
>               while ((n = _mbtowc_r (data, &wc, fmt, MB_CUR_MAX, &state)) > 0) {
>                   if (wc == '%')
>                       break;
>                   fmt += n;
>               }
> #else
>               while (*fmt != '\0' && *fmt != '%')
>                   fmt += 1;
> #endif
>               if ((m = fmt - cp) != 0) {
>                       PRINT (cp, m);
>                       ret += m;
>               }
>
> Note the _MB_CAPABLE section.  If you have configured newlib with
> --enable-newlib-mb, then that section is used to determine where the
> single-byte format characters are, as opposed to bytes in the middle of a
> multibyte character.  All characters up to the next single-byte format
> specifier ('%') or the nul terminator are output via the PRINT macro.
> This ends up calling __sfvwrite_r().
>
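The scan described above can be tried on a host machine with a small stand-in. This is only a sketch of the idea, not the newlib code: char_len is a hypothetical replacement for _mbtowc_r that assumes Shift_JIS, and literal_run is a made-up name for the part of the loop that finds how many bytes are handed to PRINT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for _mbtowc_r: length in bytes of the character
 * at s, or 0 at the NUL terminator. Assumes Shift_JIS (CP932). */
size_t char_len(const char *s)
{
    unsigned char b = (unsigned char)s[0];

    if (b == '\0')
        return 0;
    if (((b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC)) && s[1] != '\0')
        return 2;               /* lead byte plus trail byte */
    return 1;                   /* single-byte character */
}

/* Number of bytes from fmt up to the next single-byte '%' or the NUL
 * terminator. Whole characters are skipped, so a byte inside a
 * double-byte character is never inspected as a format character. */
size_t literal_run(const char *fmt)
{
    const char *cp = fmt;
    size_t n;

    while ((n = char_len(fmt)) > 0) {
        if (n == 1 && *fmt == '%')
            break;
        fmt += n;
    }
    return (size_t)(fmt - cp);
}

#ifdef STANDALONE   /* build with -DSTANDALONE to run this on its own */
int main(void)
{
    /* Two double-byte characters, a space, then a conversion. */
    const char *fmt = "\x82\xAF\x82\xE9 %d";

    assert(literal_run(fmt) == 5);
    printf("literal run: %lu bytes\n", (unsigned long)literal_run(fmt));
    return 0;
}
#endif
```

The run length returned here corresponds to m in the quoted code, the count passed to PRINT(cp, m).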
> Now, you can do some tests to verify that calling mbtowc (just omit the data
> and state parameters in the above _mbtowc_r call) with your string returns
> the correct output.  You can also debug your application and break at
> __sfvwrite_r() (fvwrite.c), which is ultimately called by the PRINT macro
> above.  Look at the input struct uio to verify that your data is there and
> the length is set correctly.  You can then follow along in the function to
> see if any errors occur before the data gets shipped out to the write
> syscall.  You can also break in the write syscall itself to see what is
> finally being sent.  Finally, you can try an fwrite of the string directly to
> see if that works, which would verify that printf is doing something funny.
>
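A minimal sketch of the fwrite versus printf comparison follows, using the bytes posted earlier in this thread as the format string. The function name format_preserves_bytes is made up, and the check goes through snprintf on the assumption that it shares its format-scanning code with printf in the C library under test, so the result can be compared in memory rather than on the serial line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The string from this thread, written as a literal so it can be used
 * directly as a format. It contains no '%' conversions. */
#define MSG "\x8A\x4A\x82\xAF\x82\xE9" " open /jffs2/Person_XGA.rsenc failed\n"
#define MSG_LEN (sizeof MSG - 1)

/* Returns 1 if formatting MSG reproduces its bytes exactly, 0 otherwise. */
int format_preserves_bytes(void)
{
    char buf[128];
    int n = snprintf(buf, sizeof buf, MSG);

    if (n < 0 || (size_t)n != MSG_LEN)
        return 0;
    return memcmp(buf, MSG, MSG_LEN) == 0;
}

#ifdef STANDALONE   /* build with -DSTANDALONE to run this on its own */
int main(void)
{
    assert(MSG_LEN == 43);

    /* Raw path: bytes go to the stream without any format parsing. */
    fwrite(MSG, 1, MSG_LEN, stdout);

    /* Formatted path: same bytes, but scanned by the printf code first. */
    printf(MSG);

    fprintf(stderr, "formatted output preserves bytes: %d\n",
            format_preserves_bytes());
    return 0;
}
#endif
```

If the fwrite line arrives intact but the printf line does not, the problem is in the format scanning; if both are damaged, the write path or the terminal is the more likely cause.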
> I am assuming you are simply calling printf("your_string_here").
>
> If not, please note here how you are calling printf and verify that you have
> <stdio.h> included.
>
> -- Jeff J.
>>
>> TO IWAMURO Motonori
>> I do not use wprintf, only printf. I also use C++, which prints
>> text with cout.
>>
>> So I hope printf and cout will work well with Japanese.
>>
>> thank you all very much.
>>
>> best regards
>> wangqiang
>>
>> 2009/5/27 IWAMURO Motonori <deenheart@gmail.com>:
>>
>>>
>>> Hmmm, when you use wide character I/O functions (like "wprintf"):
>>> - case1: missing setlocale?
>>> - case2: missing settings of LANG environment variable?
>>> - case3: MB <-> WC converter is broken?
>>> - case4: write the literal L"...Japanese Character..." in the source
>>> directly? (and gcc can't convert MB to WC)
>>>
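Cases 1 to 3 in the list above can be probed with a round trip through the standard converters. This is a sketch only: mb_roundtrip_ok is a made-up helper, and the buffer sizes are arbitrary. It converts a multibyte string to wide characters and back in the current locale and reports whether the bytes survive.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define WIDE_CAP 128
#define BACK_CAP 512

/* Returns 1 if mb survives MB -> WC -> MB conversion unchanged in the
 * current locale, 0 if the bytes changed, -1 if a conversion failed or
 * the string did not fit in the fixed buffers. */
int mb_roundtrip_ok(const char *mb)
{
    wchar_t wide[WIDE_CAP];
    char back[BACK_CAP];
    size_t nw, nb;

    nw = mbstowcs(wide, mb, WIDE_CAP);
    if (nw == (size_t)-1 || nw == WIDE_CAP)
        return -1;              /* invalid sequence, or no room for the terminator */

    nb = wcstombs(back, wide, BACK_CAP);
    if (nb == (size_t)-1 || nb == BACK_CAP)
        return -1;

    return strcmp(back, mb) == 0;
}

#ifdef STANDALONE   /* build with -DSTANDALONE to run this on its own */
int main(void)
{
    /* Case 1 and case 2: pick up the locale from LANG / LC_*.
     * A NULL return means the requested locale is not available. */
    if (setlocale(LC_ALL, "") == NULL)
        fprintf(stderr, "setlocale failed; still in the \"C\" locale\n");

    /* Case 3: check the converters with a plain ASCII string first. */
    assert(mb_roundtrip_ok("open failed") == 1);
    printf("ASCII round trip ok\n");
    return 0;
}
#endif
```

Passing a Japanese string encoded to match the active locale is the more interesting test; a result of -1 there points at the locale setting or the converter rather than at the output functions.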
>>> 2009/5/26 Qiang Wang <rurality.wq@gmail.com>:
>>>
>>>>
>>>> hi, thanks for your answer.
>>>> I use arm-elf-gcc and run newlib on an ARM target board.
>>>> I print the string out over a serial cable.
>>>> If I use my own printf_me, the Japanese text is printed correctly.
>>>> I have found that printf in newlib cuts off the MSB of each byte.
>>>> I even modified some macros and recompiled newlib to support wide
>>>> character strings.
>>>> But I still cannot get the Japanese text to print.
>>>>
>>>> best regards
>>>> wangqiang
>>>>
>>>> 2009/5/25 IWAMURO Motonori <deenheart@gmail.com>:
>>>>
>>>>>
>>>>> Hi.
>>>>>
>>>>> Which version of Cygwin are you using, 1.5 or 1.7?
>>>>> What is the terminal emulator that you are using?
>>>>> What is the encoding of the source? (UTF-8, Shift_JIS (CP932), or
>>>>> EUC-JP?)
>>>>>
>>>>> 2009/5/25 Qiang Wang <rurality.wq@gmail.com>:
>>>>>
>>>>>>
>>>>>> hi, all
>>>>>> I am trying to print Japanese words with printf or cout.
>>>>>> But the Japanese string is converted into other characters.
>>>>>> How can I print the words correctly?
>>>>>>
>>>>>> thank you very much.
>>>>>>
>>>>>> best regards
>>>>>> wangqiang
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> IWAMURO Motnori <http://vmi.jp/>
>>>>>
>>>>>
>>>
>>> --
>>> IWAMURO Motnori <http://vmi.jp/>
>>>
>>>
>
>


