This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: BUG: %lc in printf fails with transliteration


Ulrich Drepper wrote on 2000-09-25 06:46 UTC:
> There is a subtle difference to your other
> test case since here you are using the byte stream function.  This
> means that there is no codecvt structure associated with the stream.
> Instead, the wc*tomb* functions are used.  But these cannot handle
> transliteration.

It is disappointing to hear that the wc*tomb* functions cannot handle
transliteration. More importantly, the C standard very clearly requires
that (apart from the fwide() orientation issues)

  wint_t c = L'ü';
  printf("'%lc'\n", c);

and

  wint_t c = L'ü';
  wprintf(L"'%lc'\n", c);

behave exactly identically. Both have to convert wide characters to
output bytes "as if by a call to the wcrtomb function".
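
To make the requirement concrete, here is a minimal sketch of a check that
compares the bytes produced by "%lc" in a byte-oriented formatting function
with the bytes produced by a direct wcrtomb() call. The helper name
lc_matches_wcrtomb is hypothetical and not part of any library. It uses
snprintf() instead of printf() so that the result can be inspected, on the
assumption that both follow the same fprintf conversion rules.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (illustration only).
 * Returns  1 if "%lc" and wcrtomb() yield the same bytes for wc,
 *          0 if they differ or only one of them fails,
 *         -1 if both fail to convert wc in the current locale. */
int lc_matches_wcrtomb(wchar_t wc)
{
    char via_printf[MB_LEN_MAX + 1];
    char via_wcrtomb[MB_LEN_MAX];
    mbstate_t state;

    /* Start from the initial conversion state. */
    memset(&state, 0, sizeof state);

    int n = snprintf(via_printf, sizeof via_printf, "%lc", (wint_t)wc);
    size_t m = wcrtomb(via_wcrtomb, wc, &state);

    if (n < 0 && m == (size_t)-1)
        return -1;
    if (n < 0 || m == (size_t)-1)
        return 0;
    return (size_t)n == m && memcmp(via_printf, via_wcrtomb, m) == 0;
}
```

After a setlocale(LC_ALL, "") call, a result of 0 for some character would
indicate that the two paths disagree in that locale.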

For those with doubts, the standard says that

  a) %lc in fprintf shall convert like %ls (§7.19.6.1, #8)
  b) %ls in fprintf shall convert "as if by a call to
     the wcrtomb function" (§7.19.6.1, #8)
  c) and most important of all (§7.19.3, #12):

       The  wide  character  output  functions  convert wide
       characters to multibyte characters and  write  them  to  the
       stream  as  if  they were written by successive calls to the
       fputwc function.  Each conversion occurs as if by a call  to
                                                ^^^^^^^^^^^^^^^^^^^
       the wcrtomb function, with the conversion state described by
       ^^^^^^^^^^^^^^^^^^^^
       the  stream's  own  mbstate_t  object.   The   byte   output
       functions write characters to the stream as if by successive
       calls to the fputc function.
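
The conversion model in that paragraph can be sketched as follows: one
wcrtomb() call per wide character, all sharing a single mbstate_t. This is
only an illustration under stated assumptions. The function name
convert_like_fputwc is hypothetical, the result goes into a caller-supplied
buffer instead of a stream, and no final shift sequence is emitted for
stateful encodings.

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (illustration only).
 * Converts the wide string ws into out (capacity cap, including the
 * terminating NUL) by successive wcrtomb() calls that share one
 * mbstate_t. Returns the number of bytes written, excluding the NUL,
 * or (size_t)-1 on conversion failure or insufficient space. */
size_t convert_like_fputwc(const wchar_t *ws, char *out, size_t cap)
{
    mbstate_t state;
    char buf[MB_LEN_MAX];
    size_t used = 0;

    if (cap == 0)
        return (size_t)-1;

    /* The shared state starts out as the initial conversion state. */
    memset(&state, 0, sizeof state);

    for (; *ws != L'\0'; ++ws) {
        size_t n = wcrtomb(buf, *ws, &state);
        if (n == (size_t)-1)
            return (size_t)-1;      /* not representable in this locale */
        if (n >= cap - used)
            return (size_t)-1;      /* no room left for bytes plus NUL */
        memcpy(out + used, buf, n);
        used += n;
    }
    out[used] = '\0';
    return used;
}
```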

If instead you apply different conversion principles to printf() and
wprintf(), then how should a function like wcwidth() be able to predict
the number of terminal emulator cells occupied by the output without
knowing whether the output will go to a byte stream or to a wide
stream?

> So, this is a feature, not a bug.  One might want to remove it at some
> time but it's not high on the priority list.

I think the above quotations from the standard establish, beyond any
doubt and without room for interpretation, that the "subtle difference"
you talked about is a real violation of the standard (aka "bug") in the
current glibc and not just a "feature". You should not consider the wide
character I/O system ready for release before this is fixed.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

