Adding mbstate_t, mbsinit(), mbrtowc(), mbrlen() etc.

J. Johnston jjohnstn@redhat.com
Wed Sep 4 17:32:00 GMT 2002


Kazuhiro Fujieda wrote:
> 
> >>> On Thu, 22 Aug 2002 21:50:50 +0400
> >>> egor duda <deo@logos-m.ru> said:
> 
> >   I'm preparing a patch to add restartable versions of multibyte
> > conversion functions to newlib. As long as all state information is
> > already handled by *_r() versions, these functions are just simple
> > wrappers around foo() or foo_r() functions, depending on MB_CAPABLE.
> 
> The approach wrapping mb*_r() in mbr*() can't realize the
> behavior standardized in C99 (or C90 Amendment 1).
> 
> The `mbrtowc()' is required to accept incomplete multibyte
> characters and store its state indicating such incompleteness
> for successive conversions, while mbtowc_r() can't accept
> incomplete multibyte characters.
> 
> We have to rewrite the MB_CAPABLE version of mbtowc_r() to
> realize this behavior.
> 
> > mbstate_t as struct { int; union { wchar_t; char[4] }}, while
> > Microsoft's C runtime defines it as int. Would 'int' be enough for
> > everything?
> 
> No, an int may be enough, but it is inconvenient for representing the
> state of an incomplete multibyte character in the JIS or UTF-8 encodings
> (the current mbtowc_r() supports these encodings).
> The mbstate_t needs to represent the conversion state itself and
> any incomplete sequence in these encodings.
> ____
>   | AIST      Kazuhiro Fujieda <fujieda@jaist.ac.jp>
>   | HOKURIKU  Center for Information Science
> o_/ 1990      Japan Advanced Institute of Science and Technology

Kazuhiro,

  This is just to let you know that I am working on fixing the
restartable functions so that they behave properly.  I already have C-JIS
working for _mbrtowc_r, which should be the most difficult of the bunch.
I wanted you to know so that you do not waste any time making similar
modifications.

Regards,

-- Jeff J.


