This is the mail archive of the
newlib@sourceware.org
mailing list for the newlib project.
Re: stdio bug
Jeff Johnston <jjohnstn <at> redhat.com> writes:
>
> Eric Blake wrote:
> >
> > Originally reported to cygwin, but I think this is a newlib bug. I
> > noticed that fflush sets __SNPT when the file needs an lseek before the
> > next read or write; I think that making fclose perform the lseek after
> > fflush if __SNPT is set will resolve the situation. But I haven't had
> > time to test a patch along these lines yet.
> >
>
> A bit of a sore point, but I would like to reiterate: "newlib is not
> POSIX-compliant". If you go through the POSIX standard, you will find
> quite a few things that newlib doesn't comply with. That said, newlib
> does not have a bug, but there is a POSIX feature missing that Cygwin
> needs to have implemented.
>
> The POSIX read and write functionality you think is there actually
> isn't. Currently, only fseek looks at the __SNPT flag to determine
> if it has to do an lseek instead of using the buffer. This was added as
> part of a fix for a Cygwin SUSv3-compliance issue:
> http://sourceware.org/ml/newlib/2006/msg00100.html
>
> I think a reasonable alternative would be to first have fflush empty the
> read buffer and to perform the lseek then and there if the current
> buffer is not empty or exhausted. Any subsequent read would comply with
> the POSIX behavior and close/read logic wouldn't need any changing.
> Calling fflush repeatedly on a read file wouldn't result in multiple lseeks,
> since the buffer would be cleared after the first call. What do you think?
>
After spending a couple hours stepping through a debug image today, I think I
can agree that making fflush always perform the lseek when discarding buffered
read data would work and still comply with POSIX, even though it would be
giving up the allowed optimization that the underlying lseek is only guaranteed
to occur when fflush is followed by fseek (and possibly ftell) with no other
intervening stream functions. But you would still need to patch fclose to
unconditionally call fflush, since the bug is on read-streams, and fclose
currently doesn't flush them.
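
To make the proposed behaviour concrete, here is a minimal sketch using a toy read stream. Everything named toy_* is hypothetical and is not newlib's FILE layout or code; it only illustrates "discard the unread buffer and lseek back" and shows that a second flush finds an empty buffer and does not seek again.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical toy read stream; NOT newlib's FILE structure. */
struct toy_stream {
    int    fd;        /* underlying file descriptor */
    char   buf[64];   /* read buffer */
    size_t filled;    /* bytes currently held in buf */
    size_t consumed;  /* bytes of buf already handed to the caller */
    int    seeks;     /* number of lseek calls made by toy_flush */
};

/* Return the next byte, refilling the buffer with read() when it is empty. */
static int toy_getc(struct toy_stream *s)
{
    if (s->consumed == s->filled) {
        ssize_t n = read(s->fd, s->buf, sizeof s->buf);
        if (n <= 0)
            return -1;
        s->filled = (size_t)n;
        s->consumed = 0;
    }
    return (unsigned char)s->buf[s->consumed++];
}

/* Discard unread buffered data and move the fd back to the logical position. */
static int toy_flush(struct toy_stream *s)
{
    size_t unread = s->filled - s->consumed;
    if (unread > 0) {
        if (lseek(s->fd, -(off_t)unread, SEEK_CUR) == (off_t)-1)
            return -1;
        s->seeks++;
    }
    s->filled = s->consumed = 0;   /* buffer is now empty */
    return 0;
}

/* Read n bytes from a 26-byte temp file, flush twice, report the fd offset. */
static long toy_offset_after_flush(int n, int *seeks_out)
{
    char path[] = "/tmp/toy_flush_XXXXXX";
    struct toy_stream s = {0};
    long off;

    assert(n >= 0 && n <= 26);
    s.fd = mkstemp(path);
    if (s.fd < 0)
        return -1;
    unlink(path);                  /* file disappears once the fd is closed */
    if (write(s.fd, "abcdefghijklmnopqrstuvwxyz", 26) != 26 ||
        lseek(s.fd, 0, SEEK_SET) != 0) {
        close(s.fd);
        return -1;
    }
    for (int i = 0; i < n; i++)
        toy_getc(&s);
    toy_flush(&s);
    toy_flush(&s);                 /* second call finds an empty buffer */
    off = (long)lseek(s.fd, 0, SEEK_CUR);
    if (seeks_out)
        *seeks_out = s.seeks;
    close(s.fd);
    return off;
}
```

Calling toy_offset_after_flush(5, &seeks) reads five bytes through the buffer (which pulls all 26 bytes from the fd), then flushes; the fd ends up at the logical position rather than at end of file.
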

I found another newlib issue in my attempts to play with this. The __SOFF
optimization (aka 'known seek offset' optimization in stdio.c) violates POSIX
semantics any time a file descriptor is duplicated. POSIX requires
(http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_05.html)
that when two handles exist to the same file description (including by fork),
and the prescribed sequences are followed, then the two handles can share a consistent
state of the current lseek offset of the fd. Consider this scenario (the one I
was actually testing while experimenting with m4): the parent app is given
stdin on a seekable file, calls fflush(NULL), then calls system(), giving the
child app the right to use stdin to its heart's content. Then the child exits
normally, such that exit() calls fflush(). Then the parent app does another
fread(stdin). This transfer between the two handles obeys all the POSIX rules,
so the parent app should be reading from the offset that the child left stdin
at. But since newlib recognized that stdin was seekable in the parent, it
turned on the __SOFF flag during the first fflush(), so on return from system(),
an ftell(stdin) in the parent returns the cached offset, rather than the
correct offset. The end result was that the parent app re-read the data that
the child already consumed, contrary to POSIX.
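
A small sketch of the underlying effect, using plain file descriptors rather than stdio: a child consumes data through an inherited fd, and the parent then compares a remembered offset with the real one. The struct cached_handle and shared_offset_demo names are hypothetical stand-ins for a stream with a 'known seek offset'; this is not newlib code, and it uses fork() directly instead of system().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a stream that remembers its offset. */
struct cached_handle {
    int   fd;
    off_t cached_off;   /* offset the parent believes the fd is at */
};

/*
 * The child reads n bytes through the inherited fd and exits.
 * On success returns 0 and reports both the parent's remembered offset
 * and the actual offset of the shared file description.
 */
static int shared_offset_demo(size_t n, long *cached, long *actual)
{
    char path[] = "/tmp/toy_soff_XXXXXX";
    char scratch[26];
    struct cached_handle h;
    pid_t pid;
    int status;

    assert(cached != NULL && actual != NULL);
    if (n > sizeof scratch)
        return -1;
    h.fd = mkstemp(path);
    if (h.fd < 0)
        return -1;
    unlink(path);
    if (write(h.fd, "abcdefghijklmnopqrstuvwxyz", 26) != 26 ||
        lseek(h.fd, 0, SEEK_SET) != 0) {
        close(h.fd);
        return -1;
    }
    h.cached_off = 0;              /* parent records the "known" offset */

    pid = fork();
    if (pid < 0) {
        close(h.fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: same open file description, so this moves the offset. */
        ssize_t got = read(h.fd, scratch, n);
        _exit(got == (ssize_t)n ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(h.fd);
        return -1;
    }

    *cached = (long)h.cached_off;                 /* stale value */
    *actual = (long)lseek(h.fd, 0, SEEK_CUR);     /* what the kernel says */
    close(h.fd);
    return 0;
}
```

With n = 10, the remembered offset is still 0 while the real offset is 10, which is the mismatch that makes the parent re-read data the child already consumed.
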

The only way I can see to fix this is to only use the __SOFF flag when we know
for sure that nothing else shares a handle to the same file description. That
is, the __SOFF optimization must be disabled for the following FILE*s: at
process startup for stdin, stdout, and stderr (because the fd is inherited from
the parent); and any time a process is about to [v]fork for any FILE*s whose
underlying fd is not marked FD_CLOEXEC (because we don't know what the child
will do with the fd). POSIX also states that using fileno() creates a
duplicate handle, but since both handles then reside within the same process, I
think disabling __SOFF would be overly pessimistic in that case. An application
writer who wants defined behavior should already be following the POSIX rules:
do not change the offset of the fd, or else restore its location before using
the FILE* again.
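
As a rough illustration of that rule, the check could look something like the sketch below. The names offset_cache_allowed and offset_cache_demo are hypothetical and are not newlib functions. Treating fds 0 to 2 as always inherited is a simplifying assumption, and the sketch only inspects the FD_CLOEXEC flag; it does not hook fork or vfork.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical predicate: returns 1 if remembering a known seek offset for
 * this fd would be acceptable under the rule described above, else 0.
 */
static int offset_cache_allowed(int fd)
{
    int flags;

    assert(fd >= 0);
    /* Assumption: the standard descriptors were inherited from a parent. */
    if (fd == STDIN_FILENO || fd == STDOUT_FILENO || fd == STDERR_FILENO)
        return 0;
    flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return 0;                  /* be pessimistic on error */
    /* Without FD_CLOEXEC, an exec'd child may go on to use the fd. */
    return (flags & FD_CLOEXEC) != 0;
}

/* Open a scratch file, optionally mark it close-on-exec, run the check. */
static int offset_cache_demo(int set_cloexec)
{
    char path[] = "/tmp/toy_cloexec_XXXXXX";
    int fd = mkstemp(path);
    int result;

    if (fd < 0)
        return -1;
    unlink(path);
    if (set_cloexec && fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) {
        close(fd);
        return -1;
    }
    result = offset_cache_allowed(fd);
    close(fd);
    return result;
}
```

A descriptor fresh from mkstemp() has no FD_CLOEXEC flag, so the check refuses the optimization until the flag is set explicitly.
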
--
Eric Blake