This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][BZ #5298] Don't flush write buffer for ftell


On Mon, 24 Sep 2012 10:49:58 -0600, Jeff wrote:
> That's how I typically see that formatted for GCC.  That'll also
> cause a minor formatting change to the comment and return EOF line.
> Similarly for the wfileops.c implementation.  If glibc's style
> guidelines are different, then please ignore my comments.

My mistake. I don't remember what I was thinking when I formatted it
that way, but it is obviously wrong.

> The normal version is quite simple; the wide version is considerably 
> more complex.  Did you do any additional testing on the wide version 
> beyond what's already in the glibc testsuite?

The wide version has the additional overhead of conversion to find the
actual offset, but other than that the logic should be the same.  I
tested this with the test cases in the wide fseek patch, which tests
cases of ftell after seeking to various places as well as writing to
the stream. The tst-rewind* test cases in the testsuite should test
ftell after rewind.

> NEWS file will need 5298 added to the list of bugs fixed.

Will do on commit.

> In fileops.c:
> 
> +	  /* We're doing ftell and we're in write mode.  Add the
> unflushed
> +	     buffer contents to the file position and the amount by
> which the
> +	     file offset would have moved if the write had flushed.
> The
> +	     latter is usually non-positive.  */
> +	  offset += ((fp->_IO_write_ptr - fp->_IO_write_base)
> +		     + fp->_IO_write_base - fp->_IO_read_end);
> 
> I had to look at that several times.  The subtraction of _read_end
> from _write_base certainly looks odd.  You do something similar in 
> wfileops.c.  I wasn't able to convince myself it was correct, as long
> as you're confident it's correct, I won't object.

While writing an explanation for this, I realized that this could be
made simpler.

When fp._offset is set (which occurs when there has been a prior read),
_IO_read_end is always set to reflect its position in the buffer and
the buffer may look like this in write mode:

_IO_read_end,fp._offset,	_IO_write_ptr  
_IO_write_base			   |
	|			   |
	V			   V
    ------------------------------------
    |   |                          |   |
    ------------------------------------

We need the position at _IO_write_ptr.  Given fp._offset and the fact
that _IO_read_end coincides with it, this should be:

fp._offset - (_IO_read_end - _IO_write_ptr)

which is a simplified version of what I have coded in.  Another
interesting state for the buffer is:

    _IO_write_base  _IO_write_ptr  _IO_read_end, fp._offset
	|		|	       |
	V		V	       V
    ----------------------------------------
    |   |               |              |   |
    ----------------------------------------

This is when you have a read followed by a seek to somewhere in the
middle of the buffer, followed by a write.  Even in this case, the
above holds true.  The other case of only write is trivial since
_IO_write_base coincides with _IO_read_end and we end up with
unbuffered content, which can be added to the real offset of the file
that we get using lseek.

I have modified this section accordingly and also added a comment to
explain why _IO_read_end is there.  Updated patch attached.

Regards,
Siddhesh

ChangeLog:

	[BZ #5298]
	* libio/fileops.c (_IO_new_file_seekoff): Don't flush buffer
	for ftell.  Compute offsets from write pointers instead.
	* libio/wfileops.c (_IO_wfile_seekoff): Likewise.
diff --git a/libio/fileops.c b/libio/fileops.c
index e22efde..173091c 100644
--- a/libio/fileops.c
+++ b/libio/fileops.c
@@ -978,6 +978,9 @@ _IO_new_file_seekoff (fp, offset, dir, mode)
   int must_be_exact = (fp->_IO_read_base == fp->_IO_read_end
 		       && fp->_IO_write_base == fp->_IO_write_ptr);
 
+  bool was_writing = (fp->_IO_write_ptr > fp->_IO_write_base
+		      || _IO_in_put_mode (fp));
+
   if (mode == 0)
     dir = _IO_seek_cur, offset = 0; /* Don't move any pointers. */
 
@@ -988,10 +991,8 @@ _IO_new_file_seekoff (fp, offset, dir, mode)
      which assumes file_ptr() is eGptr.  Anyway, since we probably
      end up flushing when we close(), it doesn't make much difference.)
      FIXME: simulate mem-mapped files. */
-
-  if (fp->_IO_write_ptr > fp->_IO_write_base || _IO_in_put_mode (fp))
-    if (_IO_switch_to_get_mode (fp))
-      return EOF;
+  else if (was_writing && _IO_switch_to_get_mode (fp))
+    return EOF;
 
   if (fp->_IO_buf_base == NULL)
     {
@@ -1010,7 +1011,17 @@ _IO_new_file_seekoff (fp, offset, dir, mode)
     {
     case _IO_seek_cur:
       /* Adjust for read-ahead (bytes is buffer). */
-      offset -= fp->_IO_read_end - fp->_IO_read_ptr;
+      if (mode != 0 || !was_writing)
+	offset -= fp->_IO_read_end - fp->_IO_read_ptr;
+      else
+	{
+	  /* _IO_read_end coincides with fp._offset, so the actual file position
+	     is fp._offset - (_IO_read_end - new_write_ptr).  This is fine
+	     even if fp._offset is not set, since fp->_IO_read_end is then at
+	     _IO_buf_base and this adjustment is for unbuffered output.  */
+	  offset -= fp->_IO_read_end - fp->_IO_write_ptr;
+	}
+
       if (fp->_offset == _IO_pos_BAD)
 	{
 	  if (mode != 0)
diff --git a/libio/wfileops.c b/libio/wfileops.c
index 1a7b731..ae758d9 100644
--- a/libio/wfileops.c
+++ b/libio/wfileops.c
@@ -611,6 +611,10 @@ _IO_wfile_seekoff (fp, offset, dir, mode)
 		       && (fp->_wide_data->_IO_write_base
 			   == fp->_wide_data->_IO_write_ptr));
 
+  bool was_writing = ((fp->_wide_data->_IO_write_ptr
+		       > fp->_wide_data->_IO_write_base)
+		      || _IO_in_put_mode (fp));
+
   if (mode == 0)
     {
       /* XXX For wide stream with backup store it is not very
@@ -642,11 +646,8 @@ _IO_wfile_seekoff (fp, offset, dir, mode)
      which assumes file_ptr() is eGptr.  Anyway, since we probably
      end up flushing when we close(), it doesn't make much difference.)
      FIXME: simulate mem-mapped files. */
-
-  if (fp->_wide_data->_IO_write_ptr > fp->_wide_data->_IO_write_base
-      || _IO_in_put_mode (fp))
-    if (_IO_switch_to_wget_mode (fp))
-      return WEOF;
+  else if (was_writing && _IO_switch_to_wget_mode (fp))
+    return WEOF;
 
   if (fp->_wide_data->_IO_buf_base == NULL)
     {
@@ -677,29 +678,104 @@ _IO_wfile_seekoff (fp, offset, dir, mode)
       cv = fp->_codecvt;
       clen = (*cv->__codecvt_do_encoding) (cv);
 
-      if (clen > 0)
+      if (mode != 0 || !was_writing)
 	{
-	  offset -= (fp->_wide_data->_IO_read_end
-		     - fp->_wide_data->_IO_read_ptr) * clen;
-	  /* Adjust by readahead in external buffer.  */
-	  offset -= fp->_IO_read_end - fp->_IO_read_ptr;
+	  if (clen > 0)
+	    {
+	      offset -= (fp->_wide_data->_IO_read_end
+			 - fp->_wide_data->_IO_read_ptr) * clen;
+	      /* Adjust by readahead in external buffer.  */
+	      offset -= fp->_IO_read_end - fp->_IO_read_ptr;
+	    }
+	  else
+	    {
+	      int nread;
+
+	    flushed:
+	      delta = (fp->_wide_data->_IO_read_ptr
+		       - fp->_wide_data->_IO_read_base);
+	      fp->_wide_data->_IO_state = fp->_wide_data->_IO_last_state;
+	      nread = (*cv->__codecvt_do_length) (cv,
+						  &fp->_wide_data->_IO_state,
+						  fp->_IO_read_base,
+						  fp->_IO_read_end, delta);
+	      fp->_IO_read_ptr = fp->_IO_read_base + nread;
+	      fp->_wide_data->_IO_read_end = fp->_wide_data->_IO_read_ptr;
+	      offset -= fp->_IO_read_end - fp->_IO_read_base - nread;
+	    }
 	}
       else
 	{
-	  int nread;
+	  char *new_write_ptr = fp->_IO_write_ptr;
 
-	  delta = fp->_wide_data->_IO_read_ptr - fp->_wide_data->_IO_read_base;
-	  fp->_wide_data->_IO_state = fp->_wide_data->_IO_last_state;
-	  nread = (*cv->__codecvt_do_length) (cv, &fp->_wide_data->_IO_state,
-					      fp->_IO_read_base,
-					      fp->_IO_read_end, delta);
-	  fp->_IO_read_ptr = fp->_IO_read_base + nread;
-	  fp->_wide_data->_IO_read_end = fp->_wide_data->_IO_read_ptr;
-	  offset -= fp->_IO_read_end - fp->_IO_read_base - nread;
+	  if (clen > 0)
+	    offset += (fp->_wide_data->_IO_write_ptr
+		       - fp->_wide_data->_IO_write_base) / clen;
+	  else
+	    {
+	      enum __codecvt_result status;
+	      delta = (fp->_wide_data->_IO_write_ptr
+		       - fp->_wide_data->_IO_write_base);
+	      const wchar_t *write_base = fp->_wide_data->_IO_write_base;
+
+	      /* FIXME: This actually ends up in two iterations of conversion,
+		 one here and the next when the buffer actually gets flushed.
+		 It may be possible to optimize this in future so that
+		 wdo_write identifies already converted content and does not
+		 redo it.  In any case, this is much better than having to
+		 flush buffers for every ftell.  */
+	      do
+		{
+		  /* Ugh, no point trying to avoid the flush.  Just do it
+		     and go back to how it was with the read mode.  */
+		  if (delta > 0 && new_write_ptr == fp->_IO_buf_end)
+		    {
+		      if (_IO_switch_to_wget_mode (fp))
+			return WEOF;
+		      goto flushed;
+		    }
+
+		  const wchar_t *new_wbase = fp->_wide_data->_IO_write_base;
+		  fp->_wide_data->_IO_state = fp->_wide_data->_IO_last_state;
+		  status = (*cv->__codecvt_do_out) (cv,
+						    &fp->_wide_data->_IO_state,
+						    write_base,
+						    write_base + delta,
+						    &new_wbase,
+						    new_write_ptr,
+						    fp->_IO_buf_end,
+						    &new_write_ptr);
+
+		  delta -= new_wbase - write_base;
+
+		  /* If there was an error, then return WEOF.
+		     TODO: set buffer state.  */
+		  if (__builtin_expect (status == __codecvt_error, 0))
+		      return WEOF;
+		}
+	      while (delta > 0);
+	    }
+
+	  /* _IO_read_end coincides with fp._offset, so the actual file position
+	     is fp._offset - (_IO_read_end - new_write_ptr).  This is fine
+	     even if fp._offset is not set, since fp->_IO_read_end is then at
+	     _IO_buf_base and this adjustment is for unbuffered output.  */
+	  offset -= fp->_IO_read_end - new_write_ptr;
 	}
 
       if (fp->_offset == _IO_pos_BAD)
-	goto dumb;
+	{
+	  if (mode != 0)
+	    goto dumb;
+	  else
+	    {
+	      result = _IO_SYSSEEK (fp, 0, dir);
+	      if (result == EOF)
+		return result;
+	      fp->_offset = result;
+	    }
+	}
+
       /* Make offset absolute, assuming current pointer is file_ptr(). */
       offset += fp->_offset;
 

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]