Handling C2x binary integer I/O
Joseph Myers
joseph@codesourcery.com
Fri Dec 4 17:42:25 GMT 2020

C2x has support for binary integer constants starting 0b (accepted at the
October WG14 meeting, not yet in the main branch in the C standard git
repository). By itself that's a language feature, not a library one,
except that strtol with base 0 accepts all unsuffixed integer constants,
so binary constants imply that strtol needs to handle 0b, and so does scanf
%i. At today's WG14 meeting there was support for further related
features (strtol accepting optional 0b prefix in base 2, printf/scanf %b
for binary), though a further paper on that will be needed at the March
meeting to decide on those features.
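
To make the strtol point concrete, here is a minimal sketch of how the difference shows up to a caller. The helper name parse_base0 is hypothetical (it is not a glibc interface); it only wraps strtol with base 0 and reports how many characters were consumed. Which result you get for "0b101" depends on whether the C library and the selected standard mode implement the C2x rule.

```c
#include <stdlib.h>

/* Hypothetical helper for illustration only: parse S with strtol in
   base 0 and store the number of characters consumed in *CONSUMED.  */
static long
parse_base0 (const char *s, size_t *consumed)
{
  char *end;
  long value = strtol (s, &end, 0);
  *consumed = (size_t) (end - s);
  return value;
}
```

For the input "0b101", a pre-C2x strtol parses only the leading "0" (value 0, one character consumed, with "b101" left over), whereas a C2x strtol parses the whole string as the binary constant 5. Inputs such as "0x1f" behave the same either way.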

How do we wish to handle these features in glibc? New printf/scanf
formats pose no problems, but changes to strings accepted by strtol are in
principle an incompatible change: strings starting 0b are required to be
handled differently in standards before C2x than in C2x. We don't have
different symbols in glibc to support pre-C99 and C99 strtod (C99
introduced support for hex input to strtod), but do have different symbols
for scanf %a (C99 feature, previously used as a GNU extension for memory
allocation for a string).

Keeping full compatibility with pre-C2x code would indicate having
separate versions of all affected symbols - presumably including those
that are extensions, not just those that are actually in the C2x standard,
as it would seem very confusing for e.g. strtol and strtol_l to behave
differently in this regard. That is, there would be __isoc23_* versions
(C2x is expected to be published as C23) of the following 32 functions:

strtol strtoll strtoul strtoull strtol_l strtoll_l strtoul_l strtoull_l
strtoimax strtoumax fscanf scanf sscanf vscanf vsscanf vfscanf
wcstol wcstoll wcstoul wcstoull wcstol_l wcstoll_l wcstoul_l wcstoull_l
wcstoimax wcstoumax fwscanf wscanf swscanf vfwscanf vwscanf vswscanf

(Platforms with two long double variants would have 44 new functions, and
powerpc64le would have 56 new functions, because the scanf functions also
need replicating for each long double variant. The number of function
names could be reduced by 4, at the cost of more header complexity, if e.g.
strtoimax gets mapped to __isoc23_strtoll rather than needing
__isoc23_strtoimax; likewise, 8 more variants could be avoided on systems
where long and long long are both 64-bit, by using the same __isoc23 names
there. But the long / long long case would only work with the correct
types given a real __REDIRECT implementation; the fallback #define in the
absence of __REDIRECT would give one function the wrong type. Given that
the header fallback for compilers lacking __REDIRECT is probably broken
anyway, that may not matter.)
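
As a rough illustration of why a real redirect keeps the types right, the sketch below uses a GCC/Clang asm label, which is the kind of mechanism __REDIRECT relies on. The name demo_strtoimax is hypothetical, and the sketch assumes an ELF target where C symbol names carry no prefix. It binds to the existing strtoll symbol rather than to any __isoc23_* symbol, since those are only proposed here.

```c
#include <inttypes.h>

/* Hypothetical declaration for illustration: callers see the intmax_t
   prototype, but calls are emitted against the symbol "strtoll".  This
   is only safe where intmax_t and long long have the same
   representation.  */
extern intmax_t demo_strtoimax (const char *nptr, char **endptr, int base)
  __asm__ ("strtoll");
```

Callers of demo_strtoimax still get an intmax_t result, so nothing changes at the source level. A plain #define of one function name to another has no such property: the macro'd name simply takes on the prototype of its target, which is how one of the long / long long functions would end up with the wrong type.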

There are also the following functions:

__strtol_internal __strtoul_internal __strtoll_internal __strtoull_internal
__wcstol_internal __wcstoul_internal __wcstoll_internal __wcstoull_internal

The only public use of these (i.e. in installed headers) is for inline
versions of functions such as strtoimax in inttypes.h. Those inlines were
left behind when such inlines for other strto* etc. functions were removed
in glibc 2.7. Although they were apparently left behind deliberately, I
don't think it really makes sense to have inline versions of those few
inttypes.h functions (much more rarely used than the functions that are
not inlined); I think that rather than adding __isoc23_* versions of these
*_internal functions, the inlines should be removed. (And we could
consider independently (a) whether those *_internal functions should
become compat symbols and (b) whether to make the *max functions into
proper aliases of the strtol/strtoll etc. functions rather than thin
wrappers round *_internal functions.)
--
Joseph S. Myers
joseph@codesourcery.com