This is the mail archive of the
mailing list for the glibc project.
Re: What to do about gnulib libio dependencies?
- From: Bruno Haible <bruno at clisp dot org>
- To: Paul Eggert <eggert at cs dot ucla dot edu>
- Cc: Zack Weinberg <zackw at panix dot com>, Carlos O'Donell <carlos at redhat dot com>, Szabolcs Nagy <szabolcs dot nagy at arm dot com>, Florian Weimer <fweimer at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>, nd <nd at arm dot com>, Gnulib bugs <bug-gnulib at gnu dot org>
- Date: Wed, 22 Aug 2018 04:59:27 +0200
- Subject: Re: What to do about gnulib libio dependencies?
- References: <email@example.com> <CAKCAbMhO19-oOUmwFvm5CJDa3wgV==Ax56o+ybTG9Bkpm8Q4_w@mail.gmail.com> <firstname.lastname@example.org>
Zack Weinberg wrote:
> > I think it would clarify this discussion if you gave concrete examples
> > of existing programs that use these functions, and described what they
> > are doing with them that can't be accomplished using the standard
> > interfaces.
Of course, it makes sense to review
- whether the functions are useful,
- whether the API is adequate.
* freadptr and freadseek are performance boosters for applications that
can benefit from dealing with an entire buffer at once, rather than
reading and handling byte after byte. GNU m4 uses them and got
a 17% speedup as a result. Other programs (from 'iconv' to JSON parsers)
surely could make use of them too. So far, programs which want to handle
entire buffers of input at once bypass stdio and operate on file
descriptors. I feel this is unfortunate: you should be able to use
stdio AND get decent performance when possible.
Also, gnulib uses this facility in its 'getndelim2' function, which is
a generalization of 'getdelim' from glibc. Glibc's 'getdelim' implementation
uses this trick already. It's a pity if functions like 'getdelim', when
defined in applications, cannot have the same speed as the same functions
in glibc.
* freadahead currently has two uses:
- It's used by the implementation of freadseek.
- As an optimization that removes one system call just before the
termination of most coreutils programs. (See  line 83.)
* fseterr is needed when an application wants to implement functions that,
like fprintf, set the error indicator on a FILE stream in certain
conditions (such as an invalid argument or out-of-memory).
POSIX provides the 'ferror' and 'clearerr' functions; fseterr is in the
There are ways to implement this function in a portable but very expensive
way, see  lines 57..80. But no one wants such an expensive implementation.
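
To make the cost concrete, here is a minimal sketch of one portable way to
set the error indicator using only POSIX calls. The name sketch_fseterr is
made up for illustration, and this is not claimed to match the referenced
lines exactly. It assumes a POSIX system and a stream that nothing else is
using concurrently.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper, not gnulib's fseterr itself.
   Sets the error indicator of FP by provoking a failing write.  */
void sketch_fseterr(FILE *fp)
{
  int saved_errno = errno;
  int fd, fd2;

  fflush(fp);                /* push out pending output first */
  fd = fileno(fp);
  fd2 = dup(fd);             /* keep the open file alive under another number */
  if (fd2 >= 0) {
    close(fd);               /* FP's descriptor is now invalid ... */
    fputc('\0', fp);         /* ... so this byte cannot be written ... */
    fflush(fp);              /* ... and stdio records the failure on FP */
    if (dup2(fd2, fd) < 0)
      abort();               /* the stream can no longer be repaired */
    close(fd2);
  }
  errno = saved_errno;       /* callers should not see our errno changes */
}
```

That is several system calls for what is, inside the library, a single bit
operation. The sketch is also not thread-safe, and the dummy byte may stay
in the stream's buffer after the failed flush, depending on the stdio
implementation.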
A similar situation occurs with setvbuf: POSIX standardized the API to
set the buffering mode and size. Glibc has a function __fbufsize to retrieve
the buffer size, but no function to retrieve the buffering mode.
Note: This function is not fully portable: on native Windows, it is
impossible to distinguish a stream in _IOFBF mode from a stream in _IOLBF
mode.
This function is not currently used by any application I know of. But it
complements __fbufsize which is already in glibc.
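
For illustration, a small sketch of the half of the picture that exists
today: setting the buffer through setvbuf and reading its size back through
__fbufsize. The helper name install_and_query_buffer is hypothetical, and
the sketch assumes a libc that ships <stdio_ext.h> (glibc does).

```c
#include <stdio.h>
#include <stdio_ext.h>   /* __fbufsize; assumed available, as on glibc */

/* Hypothetical helper: install BUF as a fully buffered stdio buffer for FP
   and report the buffer size that stdio then claims for the stream.
   Returns 0 if setvbuf fails.  BUF must outlive the stream.  */
size_t install_and_query_buffer(FILE *fp, char *buf, size_t size)
{
  if (setvbuf(fp, buf, _IOFBF, size) != 0)
    return 0;
  return __fbufsize(fp);
}
```

The size can be read back this way, but the _IOFBF argument passed to
setvbuf cannot be recovered through a matching call.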
Florian Weimer writes:
> freadahead and freadptr are problematic for wide-oriented streams,
> but we have ABI exposure for the read pointers already for the inline
> copy of fputc_unlocked. The only caveat is that for fputc_unlocked,
> we can provide compatibility by always having an empty read buffer
> (at a cost to performance). With the other interfaces, this might
> not be a possibility.
In situations where you can't support freadptr and freadseek (e.g.
if the stream is unbuffered, or if you have chosen to store the bytes
in reverse order in memory, or XORed with some value, or whatever),
you can make freadptr return NULL. That's just an indicator to the
application that tells it "no optimization is possible - use a
classical fgetc loop".
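
A sketch of how an application would honor that contract. The functions
sketch_freadptr and sketch_freadseek are hypothetical stand-ins shaped like
the gnulib functions (their exact signatures here are an assumption); the
stand-in for freadptr always returns NULL, which is the "no optimization
possible" case, so the code runs on any libc. With a real implementation,
the fast path would handle each buffer in one go.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in: always reports that no buffer is exposed.  */
const char *sketch_freadptr(FILE *stream, size_t *sizep)
{
  (void) stream;
  (void) sizep;
  return NULL;
}

/* Hypothetical stand-in: discard OFFSET bytes, one fgetc at a time.
   Returns 0 on success, EOF if the input ends or fails first.  */
int sketch_freadseek(FILE *stream, size_t offset)
{
  while (offset-- > 0)
    if (fgetc(stream) == EOF)
      return EOF;
  return 0;
}

/* Count the newlines in the rest of FP, a buffer at a time when possible. */
size_t count_newlines(FILE *fp)
{
  size_t count = 0;

  assert(fp != NULL);
  for (;;) {
    size_t avail;
    const char *buf = sketch_freadptr(fp, &avail);

    if (buf != NULL) {
      /* Fast path: scan everything that is already buffered.  */
      const char *p = buf;
      const char *end = buf + avail;
      while ((p = memchr(p, '\n', (size_t) (end - p))) != NULL) {
        count++;
        p++;
      }
      if (sketch_freadseek(fp, avail) != 0)
        break;
    } else {
      /* Fallback: classical fgetc step; with a real freadptr this also
         refills the buffer for the next iteration.  */
      int c = fgetc(fp);
      if (c == EOF)
        break;
      if (c == '\n')
        count++;
    }
  }
  return count;
}
```

The application code is the same whether or not the library can expose its
buffer; only the speed differs.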
Regarding freadahead, it does not constrain the implementation:
"Returns the number of bytes waiting in the input buffer of STREAM.
This includes both the bytes that have been read from the underlying
input source and the bytes that have been pushed back through 'ungetc'."
The function just returns a number; there is no guarantee that the
bytes "waiting in the input buffer" are stored in a certain way or in
a certain place.
Carlos O'Donell writes:
> If we implement these interfaces in glibc can we avoid this situation
> happening again in the future?
You can at least try to avoid such situations by providing an API that
is well thought-out. Each time some code in glibc makes use of glibc
internals, ask yourself whether application programs can use the same
facilities and, if not, what they lose without these facilities.