unbuffered fread() deadlock

Tue Jan 13 19:11:00 GMT 2009

Craig,

  I am pretty sure that's where this came from.

  Now, that said.  Why is it there?  From my old IBM C/370 library days
I seem to remember this was meant for terminal I/O.  Consider the
following test case:

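#include <stdio.h>

int main(void) {
    int a;
    printf("a is ");             /* prompt without a newline */
    scanf("%d", &a);             /* input request on stdin */
    printf("b is ");             /* no input request follows this one */
    fprintf(stderr, "c is 5\n"); /* stderr is not fully buffered */
    return 0;
}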

When run under newlib and glibc, this test behaves as follows (I enter
the number 3):

a is 3
c is 5
b is <prompt>

Note how the stdout buffer gets flushed before the input for a is
given.  If those lines (the flush in refill.c) are removed, you won't
see the "a is" being output before the input is requested.  Again, note
that this behavior is the same under glibc.
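
Code that must not depend on this implicit flush can force the prompt
out itself before reading.  A minimal sketch; prompt_int is a
hypothetical helper for illustration, not something in newlib or glibc:

```c
#include <stdio.h>

/* Hypothetical helper: write a prompt, force it out, then read an int.
   Returns 1 if a value was read, 0 otherwise. */
int prompt_int(FILE *in, FILE *out, const char *prompt, int *value)
{
    if (fputs(prompt, out) == EOF)
        return 0;
    /* Explicit flush: the prompt is written out whether or not the
       library flushes output streams when input is requested. */
    if (fflush(out) == EOF)
        return 0;
    return fscanf(in, "%d", value) == 1;
}
```

Calling prompt_int(stdin, stdout, "a is ", &a) would stand in for the
first printf/scanf pair in the test case.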

Now, should all streams be flushed?  The answer is no.  Only the
terminal I/O.

Modifying the test case to write to a line-buffered file gives a
different result on glibc than on newlib.  Newlib flushes the file when
the terminal I/O is requested, but glibc does not.
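
A rough sketch of that kind of check, for anyone who wants to repeat it.
The function name and the file path are made up for illustration, and
stat() is POSIX rather than ISO C, so this assumes a hosted target that
provides it.  To reproduce the behavior described above, `in` should be
a stream attached to a terminal (normally stdin):

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Writes a partial line to a line-buffered file, requests input on `in`,
   then reports how many bytes have reached the file.
   Returns 0 if the data is still held in the stdio buffer, 12 if the
   library flushed it, or -1 on error. */
long bytes_on_disk_after_read(FILE *in, const char *path)
{
    FILE *out = fopen(path, "w");
    if (out == NULL)
        return -1;
    if (setvbuf(out, NULL, _IOLBF, BUFSIZ) != 0) {
        fclose(out);
        return -1;
    }
    fputs("partial line", out);   /* 12 bytes, no newline yet */

    (void)fgetc(in);              /* the input request under test */

    struct stat st;
    long size = (stat(path, &st) == 0) ? (long)st.st_size : -1;
    fclose(out);
    return size;
}
```

The result depends on the C library and on how `in` is buffered, so the
value is something to observe, not something the sketch guarantees.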

So, what I believe should be happening is that if terminal I/O is
requested from the host, then all line-buffered terminal I/O should be
flushed.  I believe the furthermore clause relates to line-buffered I/O,
since the previous sentence is talking about line-buffered I/O, but I
can always test this against glibc.
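
The rule proposed here can be sketched at user level as follows.  This
is only an illustration of the idea, not newlib's fwalk/lflush code.
The tracked_stream struct is hypothetical bookkeeping, since ISO C has
no portable way to ask a stream for its buffering mode, and isatty() and
fileno() are POSIX assumptions:

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>   /* isatty (POSIX) */

/* Hypothetical record: the mode is stored next to the stream because
   it cannot be queried portably after setvbuf. */
struct tracked_stream {
    FILE *fp;
    int   mode;   /* _IOFBF, _IOLBF or _IONBF, as passed to setvbuf */
};

/* Flush only the streams that are line buffered AND attached to a
   terminal.  Returns the number of streams flushed. */
size_t flush_line_buffered_ttys(const struct tracked_stream *streams,
                                size_t n)
{
    size_t flushed = 0;
    for (size_t i = 0; i < n; i++) {
        FILE *fp = streams[i].fp;
        if (streams[i].mode == _IOLBF && isatty(fileno(fp))) {
            if (fflush(fp) == 0)
                flushed++;
        }
    }
    return flushed;
}
```

Under this rule a line-buffered regular file is left alone, which would
match the glibc result described above.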

-- Jeff J.

Howland Craig D (Craig) wrote:
> Jeff:
>      I don't read that the furthermore says anything like what the
> comment does.  (I had read it while composing my response--more than
> once--and it never occurred to me that it might be the source of the
> refill.c comment.  Perhaps you are right, and the 'furthermore' was the
> source of the refill.c comment, but if it is, I don't think that it is
> a correct understanding.)  I'll annotate it a little as I understand it.
> Original:
> "Furthermore, characters are intended to be transmitted as a block to
> the host environment when a buffer is filled, when input is requested
> on an unbuffered stream, or when input is requested on a line buffered
> stream that requires the transmission of characters from the host
> environment."
> Re-format their long sentence, and add 5 annotations:
> "Furthermore, characters are intended to be transmitted as a block to
> the host environment when[:]
> [1] a buffer is filled, [or]
> [2] when input is requested on an unbuffered stream, or 
> [3] when input is requested on a line buffered stream that requires
> the transmission of characters from the host environment."
>      I think that what they're trying to say (not very clearly) could
> perhaps be paraphrased as 'it is intended that whenever characters
> do need to be moved in a stream--regardless of buffering mode--
> these moves should use blocks of characters where possible for the
> sake of efficiency.'  (The second point might not appear to make any
> sense--how can you move a block on an un-buffered stream?--but it can
> if the device has a buffer.  A device having its own buffer is a good
> reason to not have the file stream add one.  So a read on a
> non-stream-buffered Ethernet MAC could get a block of bytes from the
> MAC's FIFO, for example.)
>      It appears that you're saying that the statement in point 3, "...
> that requires the transmission of characters from the host environment"
> means that point 3 as a whole should be understood as (in words more
> like the disputed refill.c comment) something like:  'when input is
> requested on any line-buffered fd, the output buffers of all fds must
> be flushed.'  (That is, 1) you're proposing that point 3 is the source
> of the comment, and, 2) you think that it can be understood that way.)
>      I don't think that this follows.  In addition to the arguments
> already given, the last clause of the last sentence in your quote points
> out that setvbuf can affect this behavior.  Nothing in the description
> for setvbuf says anything akin to the comment's wording, that is, that
> would link one stream to another stream, or even input to output.
>      I actually stumbled across a document this morning that gives
> rationales behind stuff in the C99 standard.  (See
> http://www.open-std.org/JTC1/SC22/WG14/www/docs/C99RationaleV5.10.pdf.)
> In its section 7.19.3 Files, it says:  "The distinction between buffered
> and unbuffered streams suggests the desired interactive behavior; but an
> implementation may still be conforming even if delays in a network or
> terminal controller prevent output from appearing in time. It is the
> intent that matters here."
>      The clearly-stated intents in 7.19.3 (which are all in the quote
> that you supplied)--as opposed to the unclear intent in the furthermore
> sentence--are all aimed at getting characters back and forth in an
> expeditious manner.  (When unbuffered, as soon as possible; when
> fully-buffered, as soon as the buffer is full; when line-buffered, as
> soon as the line is complete.  And, by implication, a partial buffer as
> soon as a flush of it is requested (whether directly via a user fflush
> or indirectly via close or exit or abort).)
>      Perhaps the strongest statement in favor of my interpretation comes
> from the Rationale document section 7.19.5.2 regarding fflush:
> "The fflush function ensures that output has been forced out of internal
> I/O buffers for a specified stream. Occasionally, however, it is
> necessary to ensure that all output is forced out, and the programmer
> may not conveniently be able to specify all the currently open streams,
> perhaps because some streams are manipulated within library packages.[9]
> To provide an implementation-independent method of flushing all output
> buffers, the Standard specifies that this is the result of calling
> fflush with a NULL argument.  [footnote 9:] For instance, on a system
> (such as UNIX) which supports process forks, it is usually necessary to
> flush all output buffers just prior to the fork."
> (Since there is an explicitly-provided mechanism for the "occasional"
> need to flush all output streams, why would there be a bizarre implied
> back door method to do so?)
>      Furthermore, from the rationale for fopen:
> "A change of input/output direction on an update file is only allowed
> following a successful fsetpos, fseek, rewind, or fflush operation,
> since these are precisely the functions which assure that the I/O buffer
> has been flushed."
> (If they intended a read to be able to do so, they failed to mention it.)
>      And a final argument from the Rationale document.  In their setvbuf
> discussion, they say nothing of linking streams nor directions.  (They
> actually allow FBF to be implemented as LBF (always), or LBF as NBF for
> a binary file.  That is, even though 3 buffering methods are defined,
> an implementation can provide fewer methods, "The general principle is
> to provide portable code with a means of requesting the most appropriate
> popular buffering style, but not to require an implementation to support
> these styles.")  Again, nothing at all that even hints at the intent
> from the refill.c comment.
>      And if I didn't convince you yet, in the first half of the last
> quoted sentence (i.e. that you quoted from C99 7.19.3), it points out
> that the behavior of the different buffering modes is
> implementation-defined.  So even if the "furthermore" sentence were
> intended to be understood as the refill.c comment says (which I
> dispute), the implementation can choose to do as it sees fit.  I submit
> that the flushing behavior as done does not make any sense, and is
> inefficient.  It therefore should be excised as being not consistent
> with the goals of the implementation (namely small and efficient).
>      I apologize for this being so lengthy.
> 				Craig
>
> P.S.   If I did not convince you, I submit in advance to your judgement
> as the owner and will not take any more of your time with further
> argument (unless you choose to extend the discussion, of course).
>
> -----Original Message-----
> From: Jeff Johnston [mailto:jjohnstn@redhat.com] 
> Sent: Monday, January 12, 2009 5:07 PM
> To: Howland Craig D (Craig)
> Cc: newlib@sourceware.org; Andre Heider
> Subject: Re: unbuffered fread() deadlock
>
> I believe the source of it is from C99 7.19.3.  Note the furthermore
> clause:
>
> "When a stream is unbuffered, characters are intended to appear from the
> source or at the destination as soon as possible.  Otherwise characters
> may be accumulated and transmitted to or from the host environment as a
> block.  When a stream is fully buffered, characters are intended to be
> transmitted to or from the host environment as a block when a buffer is
> filled.  When a stream is line buffered, characters are intended to be
> transmitted to or from the host environment as a block when a new-line
> character is encountered.  Furthermore, characters are intended to be
> transmitted as a block to the host environment when a buffer is filled,
> when input is requested on an unbuffered stream, or when input is
> requested on a line buffered stream that requires the transmission of
> characters from the host environment.  Support for these characteristics
> is implementation-defined, and may be affected via the setbuf and
> setvbuf functions."
>
> The removal of the current stream from the list of fps to walk will
> solve Andre's problem.  This will occur if we just remove the
> fp-lock/fp-unlock from fwalk since the lflush function being called for
> each fp won't call fflush for the current stream that is reading
> anyway.  However, we still have the sfp-lock vs fp-lock problem looming.
>
> A read will lock the fp and then possibly need to acquire the sfp lock.
> If something else is doing an fwalk as well (e.g. fflush(NULL) at the
> same time), the 2nd fwalk may wait for the read fp to be unlocked and
> never give up the sfp lock that the read fp sits waiting for.
>
> -- Jeff J.
>
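
The sfp-lock vs fp-lock problem described in the quoted message is a
lock-order inversion.  The sketch below models it with two plain POSIX
mutexes; fp_lock and sfp_lock are stand-ins and not newlib's actual lock
objects.  The real case involves two threads, but here a single thread
plays both parties and uses trylock so that it reports the stuck state
instead of hanging:

```c
#include <errno.h>
#include <pthread.h>

/* Stand-ins for the per-stream lock and the stream-list lock. */
static pthread_mutex_t fp_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t sfp_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models a reader that holds fp_lock and then wants sfp_lock, while a
   walker holds sfp_lock and then wants fp_lock.  Returns 1 if neither
   could proceed, i.e. blocking acquires would wait on each other. */
int lock_order_would_deadlock(void)
{
    pthread_mutex_lock(&fp_lock);    /* reader: locks its own stream   */
    pthread_mutex_lock(&sfp_lock);   /* walker: locks the stream list  */

    /* trylock returns EBUSY where a blocking lock would wait forever. */
    int walker_stuck = pthread_mutex_trylock(&fp_lock) == EBUSY;
    int reader_stuck = pthread_mutex_trylock(&sfp_lock) == EBUSY;

    pthread_mutex_unlock(&sfp_lock);
    pthread_mutex_unlock(&fp_lock);
    return walker_stuck && reader_stuck;
}
```

Each side holds the lock the other needs, which is the situation that
remains even after the current stream is dropped from the walk.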