This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug libc/20639] Inconsistency in definitions of fputwc(), putwc() and putwchar() in glibc
- From: "igor.liferenko at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Fri, 07 Oct 2016 02:33:40 +0000
- Subject: [Bug libc/20639] Inconsistency in definitions of fputwc(), putwc() and putwchar() in glibc
- Auto-submitted: auto-generated
- References: <bug-20639-131@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=20639
--- Comment #5 from Igor Liferenko <igor.liferenko at gmail dot com> ---
Hi all,
Here is the proof that the current glibc-2.24 sources are incorrect. Quoting
the libc reference, last paragraph of section 5.2:
Some of the memory and string functions take single characters as arguments.
Since a value of type char is automatically promoted into a value of type int
when used as a parameter, the functions are declared with int as the type of
the parameter in question. In case of the wide character functions the
situation is similar: the parameter type for a single wide character is wint_t
and not wchar_t. This would for many implementations not be necessary since
wchar_t is large enough to not be automatically promoted, but since the ISO C
standard does not require such a choice of types the wint_t type is used.
Section 6.1 also says:
wint_t is a data type used for parameters and variables that contain a single
wide character. As the name suggests this type is the equivalent of int when
using the normal char strings. The types wchar_t and wint_t often have the
same representation if their size is 32 bits wide but if wchar_t is defined as
char the type wint_t must be defined as int due to the parameter promotion.
From the quote above it follows that if "wchar_t" is represented as "char" and
"wint_t" is represented as "int", the functions fputwc(), putwc() and
putwchar() must compile the same way as fputc(), putc() and putchar(), that
is, without warnings. But with the current glibc sources there will be
warnings in that case.
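
The kind of call site at issue can be sketched as follows. This is only an
illustration: copy_wide() is a hypothetical helper, not part of glibc, and
whether a compiler actually warns depends on the types the platform chooses
for wchar_t and wint_t and on the warning flags in use. The value read from
the stream has type wint_t (so that WEOF fits), while putwc() is declared with
a wchar_t parameter, so a conversion happens at the call. The explicit cast
only makes that conversion visible.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper (not part of glibc): copy wide characters from `in`
   to `out` until WEOF. Returns the number of characters copied, or -1 if
   a write fails. */
long copy_wide(FILE *in, FILE *out)
{
    long n = 0;
    wint_t c;   /* wint_t, so that WEOF can be told apart from real data */

    while ((c = getwc(in)) != WEOF) {
        /* putwc() takes a wchar_t, so the wint_t value is converted here.
           Without the cast the same conversion would happen implicitly. */
        if (putwc((wchar_t)c, out) == WEOF)
            return -1;
        n++;
    }
    return n;
}
```

With the proposed wint_t parameter, the cast (and the implicit conversion it
stands for) would not be needed at such a call site.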
If we look at the definitions of the analogous non-wide character functions
fputc(), putc() and putchar(), we see that they all take an argument of type
"int", although, according to comment #1, they could reasonably be defined
with an argument of type "char", because we need to check for EOF first
anyway.
The reason the type is "int" in the non-wide character functions is _exactly_
to get rid of type conversion warnings.
Therefore, I suggest fixing the function interfaces this way:
- wint_t fputwc (wchar_t wc, _IO_FILE *fp) {
+ wint_t fputwc (wint_t wc, _IO_FILE *fp) {
- wint_t putwc (wchar_t wc, _IO_FILE *fp) {
+ wint_t putwc (wint_t wc, _IO_FILE *fp) {
- wint_t putwchar (wchar_t wc) {
+ wint_t putwchar (wint_t wc) {
These fixes will not break any existing code. Explanation follows:
+++++++++++
The libc reference says that GNU libc uses UCS-4:
(section 6.1)
in the GNU C Library wchar_t is always 32 bits
(section 6.3.1)
wide character set is always UCS-4 in the GNU C Library
(section 6.5)
in the case of the GNU C Library it is always UCS-4 encoded ISO 10646
The reference also explains what UCS-4 is:
(section 6.1)
UCS-4 is a 32-bit word
(section 6.5.4.2)
UCS-4 value consists of four bytes
Also, in section 6.1 it is written:
ISO 10646 was designed to be a 31-bit large code space
Finally, in section 6.1 it is written:
The macro WEOF evaluates to a constant expression of type wint_t whose value
is different from any member of the extended character set.
From the above it follows that wchar_t can hold 2^31 extra values that differ
from any member of the extended character set, namely those with the first bit
set to "1":
1xxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
Any of these extra values can be used to represent WEOF.
There is no problem if wint_t is signed (and WEOF is -1), since both -1 and
0xffffffff have the first bit set to "1", in any representation.
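
The bit argument above can be checked with a short sketch. The helper names
are hypothetical and exist only for this illustration. The code assumes a
32-bit wide character value, as stated for the GNU C Library, and uses the
fixed-width uint32_t type to model it.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero if `v` lies outside the 31-bit ISO 10646
   code space, i.e. if its most significant bit is set. */
int outside_31bit_space(uint32_t v)
{
    return (v & UINT32_C(0x80000000)) != 0;
}

/* Converting -1 to a 32-bit unsigned type is defined by value in C and
   yields 0xffffffff, whatever the signed representation is. */
uint32_t minus_one_as_u32(void)
{
    return (uint32_t)-1;
}
```

Both 0xffffffff and -1 converted to 32 bits fall in the upper half of the
value range, so neither collides with a 31-bit character code.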
+++++++++++
In fact, the right type for wint_t is already stated in the libc reference
(section 12.15):
With the GNU C Library, WEOF is -1.
(because for comparison with -1, wint_t must be signed)
But in the current glibc-2.24 sources (wctype.h and wchar.h), WEOF is unsigned:
# define WEOF (0xffffffffu)
This is an inconsistency.
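
What a given platform actually does can be probed with a small sketch like the
one below. The two helper names are hypothetical. No result is claimed here,
since the outcome depends on how the platform defines wint_t and WEOF.

```c
#include <wchar.h>

/* Hypothetical probe: 1 if wint_t is an unsigned type on this platform,
   0 if it is signed. */
int wint_is_unsigned(void)
{
    return (wint_t)-1 > (wint_t)0;
}

/* Hypothetical probe: 1 if WEOF has the same value as -1 converted to
   wint_t, 0 otherwise. */
int weof_equals_minus_one(void)
{
    return WEOF == (wint_t)-1;
}
```

Printing both results next to the WEOF definition in wchar.h shows how the
documented value and the header definition relate on that system.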
Regards,
Igor