This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: BUG: %lc in printf fails with transliteration
- To: libc-alpha at sources dot redhat dot com
- Subject: Re: BUG: %lc in printf fails with transliteration
- From: Markus Kuhn <Markus dot Kuhn at cl dot cam dot ac dot uk>
- Date: Mon, 25 Sep 2000 09:52:00 +0100
Ulrich Drepper wrote on 2000-09-25 06:46 UTC:
> There is a subtle difference to your other
> test case since here you are using the byte stream function. This
> means that there is no codecvt structure associated with the stream.
> Instead, the wc*tomb* functions are used. But these cannot handle
> transliteration.
It is disappointing to hear that the wc*tomb* functions cannot handle
transliteration. In addition, the C standard very clearly requires that
(apart from the fwide() state issues)
    wint_t c = L'ü';
    printf("'%lc'\n", c);
and
    wint_t c = L'ü';
    wprintf("'%lc'\n", c);
behave exactly identically. Both have to convert wide characters to
output bytes "as if by a call to the wcrtomb function".
For those with doubts, the standard says that
a) %lc in fprintf shall convert like %ls (§7.19.6.1, #8)
b) %ls in fprintf shall convert "as if by a call to
the wcrtomb function" (§7.19.6.1, #8)
c) and most important of all (§7.19.3, #12):
    The wide character output functions convert wide
    characters to multibyte characters and write them to the
    stream as if they were written by successive calls to the
    fputwc function. Each conversion occurs as if by a call to
                                            ^^^^^^^^^^^^^^^^^^
    the wcrtomb function, with the conversion state described by
    ^^^^^^^^^^^^^^^^^^^^
    the stream's own mbstate_t object. The byte output
    functions write characters to the stream as if by successive
    calls to the fputc function.
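To make the "as if by a call to the wcrtomb function" wording concrete, here is a minimal sketch of the conversion step that both %lc paths are required to share. The helper name lc_bytes is hypothetical and not part of glibc, and for simplicity it starts from a fresh mbstate_t instead of the stream's own conversion state.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not a glibc function): produce the bytes that
   either %lc path should emit for wc, "as if by a call to wcrtomb".
   out must have room for at least MB_LEN_MAX bytes.
   Returns the byte count, or 0 if wc cannot be converted in the
   current locale. Assumption: a fresh conversion state is used here,
   whereas a real stream would use its own mbstate_t object. */
size_t lc_bytes(wchar_t wc, char *out)
{
    mbstate_t state;
    size_t n;

    assert(out != NULL);
    memset(&state, 0, sizeof state);   /* initial conversion state */
    n = wcrtomb(out, wc, &state);
    return n == (size_t) -1 ? 0 : n;
}
```

A caller would normally select the locale with setlocale(LC_CTYPE, "") first and then write the returned bytes with fputc. Whatever bytes this yields for a given wide character in a given locale is what both printf and wprintf would have to produce for %lc.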
If instead you apply different conversion principles to printf() and
wprintf(), then how should a function like wcwidth() be able to make a
prediction of the number of terminal emulator cells occupied by the
output without knowing whether the output will go to the byte or wide
stream?
> So, this is a feature, not a bug. One might want to remove it at some
> time but it's not high on the priority list.
I think the above quotations from the standard establish, beyond any
doubt and without room for interpretation, that the "subtle difference"
you talked about is a real violation of the standard (aka "bug") in the
current glibc and not just a "feature". You should not consider the wide
character I/O system ready for release before this is fixed.
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>