This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



Re: Implement fmemopen


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to duane ellis on 7/20/2007 5:32 AM:
> Eric Blake wrote:
>> So my implementation needs to be made a bit smarter - when in
>> write-only mode, remember the character being overwritten by the
>> trailing NUL, then when seeking to a different location, restore that
>> byte (ie. the trailing NUL should always correspond to the current
>> position, rather than being a permanent artifact of writing).
> 
> Huh? That seems odd - what if I am writing bin bytes to the memory
> stream that purposely contain a NULL.
> 
> I do not have the POSIX docs - nor have I read them - but my gut tells
> me - I should be able to write bin data to this stream.

The draft POSIX wording requires that whenever you fflush() or fclose() a
write-only memstream (whether opened by fmemopen with mode "w" or "a", or
by open_memstream), the byte at the current position be set to NUL, so
that you are guaranteed to have a NUL-terminated string.  For fmemopen, if
the current position is already at the end of the array, the last byte of
the array is set to NUL instead; open_memstream has no effective eof.
POSIX does not prohibit binary data with embedded NULs, nor does my
implementation.  For read-write streams (such as fmemopen with mode "w+"),
the requirement is slightly different: a NUL is written at the current
position (necessarily eof) only when the write moves eof.
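
As a rough illustration of the caller's view, here is a minimal sketch
using only the standard fmemopen/fwrite/fflush calls.  The helper name
write_binary_then_flush is made up for this example, and the buffer is
assumed to have room for at least four bytes.  It writes binary data
containing a NUL, and the flush then adds the terminating NUL after it.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write three bytes, one of them NUL, through a
   write-only memory stream, then flush.  Returns 0 on success, -1 on
   any stdio failure.  BUF must have room for at least 4 bytes.  */
int write_binary_then_flush(char *buf, size_t size)
{
  static const char data[3] = { 'a', '\0', 'b' };
  FILE *f = fmemopen(buf, size, "w");

  if (f == NULL)
    return -1;
  if (fwrite(data, 1, sizeof data, f) != sizeof data || fflush(f) != 0)
    {
      fclose(f);
      return -1;
    }
  /* After the flush, buf[0..2] hold the data, embedded NUL included,
     and buf[3] is the NUL added at the current position.  */
  return fclose(f) == 0 ? 0 : -1;
}
```

The embedded NUL at offset 1 is ordinary data; only the NUL at offset 3
comes from the flush.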

One problem, then, comes in with enforcing this requirement when no writes
took place.  If you do fflush(fmemopen(array, 1, "w")), then according to
the POSIX rules, array[0] must be '\0' regardless of what it was before
the fmemopen, since the "w" stream was flushed while the file position was
at 0.  But the way the newlib stdio routines are set up, calling fflush()
when there is no buffered write waiting to flush will not trigger any
write callbacks to the cookie owner.  So to meet the POSIX requirement, I
have to set offset 0 to '\0' during fmemopen if the stream mode starts
with 'w'.

Another problem is with seeks, since all memstreams are seekable.  If you do:
 char array[] = "foo";
 FILE *f=fmemopen(array, 4, "a"); // include space for trailing NUL
 fseek(f, 0, SEEK_SET);
 fflush(f);
then array[0] must be set to '\0' according to POSIX rules, since the
current position is 0.  Again, there was no buffered write before the
flush, so the write callback is never invoked.  The seek callback itself
must therefore set the byte at the new position to NUL whenever the seek
destination lies before the end of the stream, on the assumption that
fflush() or fclose() will be called next.

On the other hand, if you do:
 char array[] = "12345";
 FILE *f=fmemopen(array, 5, "w");
 fwrite("boo", 3, 1, f);
 fflush(f);
 fseek(f, 0, SEEK_SET);
 fputc('f', f);
 fseek(f, 2, SEEK_CUR);
at this point, the newlib routines have called the write callback twice:
once for the fflush, and once for the second fseek.  The second fseek
calls both the write callback (to flush the 'f') and the seek callback (to
change the position to 3).  The write callback doesn't know which routine
called it, so it must behave the same either way and assume that it was
called by fflush; in my implementation, the write callback therefore
leaves the array contents as 'f', '\0', 'o', '\0', '5'.  But continuing
the example,
 fflush(f);
since there was no fflush between the fputc and the second fseek, POSIX
does not permit the NUL in offset 1; rather, the only NUL must be at
offset 3, corresponding to the current position at the time of the fflush.
Since there was no explicit write action, this fflush does not call the
write callback.  So the only solution is to make the write and seek
callbacks remember what they are overwriting when inserting a NUL prior to
eof, then make the seek callback check whether the current position before
the seek is prior to eof, in which case it needs to restore whatever byte
had previously been temporarily set to NUL.
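
The save-and-restore idea can be sketched outside of stdio.  The model
below is a simplified, hypothetical stand-in for the write and seek
callbacks (the nul_model names are invented here; it handles a write-only
"w" stream with absolute seeks only, and ignores append and read-write
modes).  It replays the "12345" sequence above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of a write-only memory stream.  */
struct nul_model {
  char *buf;     /* caller-supplied array */
  size_t pos;    /* current position */
  size_t eof;    /* current file size */
  size_t max;    /* size of the array */
  char saved;    /* byte displaced by a temporary NUL at pos */
};

/* Mode "w": start empty, with a NUL at offset 0.  MAX must be nonzero.  */
void nul_model_open(struct nul_model *m, char *buf, size_t max)
{
  m->buf = buf;
  m->pos = m->eof = 0;
  m->max = max;
  m->saved = '\0';
  buf[0] = '\0';
}

/* Copy up to N bytes to the current position, always keeping room
   for a trailing NUL.  Returns the number of bytes stored.  */
size_t nul_model_write(struct nul_model *m, const char *src, size_t n)
{
  if (n == 0 || m->pos + 1 >= m->max)
    return 0;
  if (n > m->max - 1 - m->pos)
    n = m->max - 1 - m->pos;
  if (m->pos > m->eof)          /* an earlier seek went past eof */
    memset(m->buf + m->eof, '\0', m->pos - m->eof);
  memcpy(m->buf + m->pos, src, n);
  m->pos += n;
  if (m->pos >= m->eof)
    {
      m->eof = m->pos;
      m->saved = '\0';          /* nothing to restore at eof */
    }
  else
    m->saved = m->buf[m->pos];  /* remember the byte about to be hidden */
  m->buf[m->pos] = '\0';
  return n;
}

/* Move to absolute OFFSET; returns 0, or -1 if out of bounds.  */
int nul_model_seek(struct nul_model *m, size_t offset)
{
  if (offset > m->max)
    return -1;
  if (m->pos < m->eof)
    m->buf[m->pos] = m->saved;  /* undo the temporary NUL */
  m->pos = offset;
  if (m->pos < m->eof)
    {
      m->saved = m->buf[m->pos];
      m->buf[m->pos] = '\0';    /* in case a flush or close comes next */
    }
  return 0;
}

/* Replay the sequence from the message on "12345".  */
void nul_model_demo(char mid[5], char end[5])
{
  char array[5] = { '1', '2', '3', '4', '5' };
  struct nul_model m;

  nul_model_open(&m, array, sizeof array);
  nul_model_write(&m, "boo", 3);
  nul_model_seek(&m, 0);
  nul_model_write(&m, "f", 1);
  memcpy(mid, array, 5);        /* state after the write callback */
  nul_model_seek(&m, 3);
  memcpy(end, array, 5);        /* state once the seek has restored 'o' */
}
```

In this model, mid holds 'f', '\0', 'o', '\0', '5' and end holds
'f', 'o', 'o', '\0', '5', so the only NUL left is the one at offset 3.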

My current implementation (although I'm still testing it), with the
semantics I described above, is attached.  Changes from my original
submission: a saved character was added (manipulated in the write and
seek callbacks), the reader now reports end-of-file (a return of 0) when
reading beyond eof, and the writer ensures that any bytes skipped between
eof and the current position are set to NUL.
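
From the caller's side, the reader change should look roughly like the
sketch below, which uses only standard stdio calls.  The helper name
read_past_end and the 64-byte scratch buffer are assumptions made for
illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper: try to read more bytes than the stream holds.
   Returns the number of bytes actually read, or -1 if the stream could
   not be opened.  *AT_EOF is set to nonzero if the stream reported
   end-of-file.  Only meaningful for SIZE below 64.  */
long read_past_end(char *buf, size_t size, int *at_eof)
{
  char sink[64];
  size_t got;
  FILE *f = fmemopen(buf, size, "r");

  if (f == NULL)
    return -1;
  got = fread(sink, 1, sizeof sink, f);  /* ask for more than SIZE */
  *at_eof = feof(f) != 0;
  fclose(f);
  return (long) got;
}
```

A short read combined with feof() being set is how the caller sees the
reader callback's return of 0; no error is reported.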

- --
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9@byu.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGoK3c84KuGfSFAYARApYmAKC1IDqm0TdqHFQXmZV5iPwLtRCbpgCg1/Vm
4eOhegiXo1Cq35y4FtTJnS4=
=lMHe
-----END PGP SIGNATURE-----
Index: libc/stdio/fmemopen.c
===================================================================
RCS file: libc/stdio/fmemopen.c
diff -N libc/stdio/fmemopen.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ libc/stdio/fmemopen.c	19 Jul 2007 22:55:26 -0000
@@ -0,0 +1,370 @@
+/* Copyright (C) 2007 Eric Blake
+ * Permission to use, copy, modify, and distribute this software
+ * is freely granted, provided that this notice is preserved.
+ */
+
+/*
+FUNCTION
+<<fmemopen>>---open a stream around a fixed-length string
+
+INDEX
+	fmemopen
+
+ANSI_SYNOPSIS
+	#include <stdio.h>
+	FILE *fmemopen(void *restrict <[buf]>, size_t <[size]>,
+		       const char *restrict <[mode]>);
+
+DESCRIPTION
+<<fmemopen>> creates a seekable <<FILE>> stream that wraps a
+fixed-length buffer of <[size]> bytes starting at <[buf]>.  The stream
+is opened with <[mode]> treated as in <<fopen>>, where append mode
+starts writing at the first NUL byte.  If <[buf]> is NULL, then
+<[size]> bytes are automatically provided as if by <<malloc>>, with
+the initial size of 0, and <[mode]> must contain <<+>> so that data
+can be read after it is written.
+
+The stream maintains a current position, which moves according to
+bytes read or written, and which can be one past the end of the array.
+The stream also maintains a current file size, which is never greater
+than <[size]>.  If <[mode]> starts with <<r>>, the position starts at
+<<0>>, and file size starts at <[size]> if <[buf]> was provided.  If
+<[mode]> starts with <<w>>, the position and file size start at <<0>>,
+and if <[buf]> was provided, the first byte is set to NUL.  If
+<[mode]> starts with <<a>>, the position and file size start at the
+location of the first NUL byte, or else <[size]> if <[buf]> was
+provided.
+
+When reading, NUL bytes have no significance, and reads cannot exceed
+the current file size.  When writing, the file size can increase up to
+<[size]> as needed, and NUL bytes may be embedded in the stream.  When
+the stream is flushed or closed after a write that changed the file
+size, a NUL byte is written at the current position if there is still
+room; if the stream is not also open for reading, a NUL byte is
+additionally written at the last byte of <[buf]> when the stream has
+exceeded <[size]>, so that a write-only <[buf]> is always
+NUL-terminated when the stream is flushed or closed (and the initial
+<[size]> should take this into account).  It is not possible to seek
+outside the bounds of <[size]>.  A NUL byte written during a flush is
+restored to its previous value when seeking elsewhere in the string.
+
+RETURNS
+The return value is an open FILE pointer on success.  On error,
+<<NULL>> is returned, and <<errno>> will be set to EINVAL if <[size]>
+is zero or <[mode]> is invalid, ENOMEM if <[buf]> was NULL and memory
+could not be allocated, or EMFILE if too many streams are already
+open.
+
+PORTABILITY
+This function is being added to POSIX 200x, but is not in POSIX 2001.
+
+Supporting OS subroutines required: <<sbrk>>.
+*/
+
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/lock.h>
+#include "local.h"
+
+/* Describe details of an open memstream.  */
+typedef struct fmemcookie {
+  void *storage; /* storage to free on close */
+  char *buf; /* buffer start */
+  size_t pos; /* current position */
+  size_t eof; /* current file size */
+  size_t max; /* maximum file size */
+  char append; /* nonzero if appending */
+  char writeonly; /* 1 if write-only */
+  char saved; /* saved character that lived at pos before write-only NUL */
+} fmemcookie;
+
+/* Read up to non-zero N bytes into BUF from stream described by
+   COOKIE; return number of bytes read (0 on EOF).  */
+static _READ_WRITE_RETURN_TYPE
+_DEFUN(fmemreader, (ptr, cookie, buf, n),
+       struct _reent *ptr _AND
+       void *cookie _AND
+       char *buf _AND
+       int n)
+{
+  fmemcookie *c = (fmemcookie *) cookie;
+  /* Can't read beyond current size, but EOF condition is not an error.  */
+  if (c->pos > c->eof)
+    return 0;
+  if (n >= c->eof - c->pos)
+    n = c->eof - c->pos;
+  memcpy (buf, c->buf + c->pos, n);
+  c->pos += n;
+  return n;
+}
+
+/* Write up to non-zero N bytes of BUF into the stream described by COOKIE,
+   returning the number of bytes written or EOF on failure.  */
+static _READ_WRITE_RETURN_TYPE
+_DEFUN(fmemwriter, (ptr, cookie, buf, n),
+       struct _reent *ptr _AND
+       void *cookie _AND
+       const char *buf _AND
+       int n)
+{
+  fmemcookie *c = (fmemcookie *) cookie;
+  int adjust = 0; /* true if at EOF, but still need to write NUL.  */
+
+  /* Append always seeks to eof; otherwise, if we have previously done
+     a seek beyond eof, ensure all intermediate bytes are NUL.  */
+  if (c->append)
+    c->pos = c->eof;
+  else if (c->pos > c->eof)
+    memset (c->buf + c->eof, '\0', c->pos - c->eof);
+  /* Do not write beyond EOF; saving room for NUL on write-only stream.  */
+  if (c->pos + n > c->max - c->writeonly)
+    {
+      adjust = c->writeonly;
+      n = c->max - c->pos;
+    }
+  /* Now n is the number of bytes being modified, and adjust is 1 if
+     the last byte is NUL instead of from buf.  Write a NUL if
+     write-only; or if read-write, eof changed, and there is still
+     room.  When we are within the file contents, remember what we
+     overwrite so we can restore it if we seek elsewhere later.  */
+  if (c->pos + n > c->eof)
+    {
+      c->eof = c->pos + n;
+      if (c->eof - adjust < c->max)
+	c->saved = c->buf[c->eof - adjust] = '\0';
+    }
+  else if (c->writeonly)
+    {
+      if (n)
+	{
+	  c->saved = c->buf[c->pos + n - adjust];
+	  c->buf[c->pos + n - adjust] = '\0';
+	}
+      else
+	adjust = 0;
+    }
+  c->pos += n;
+  if (n - adjust)
+    memcpy (c->buf + c->pos - n, buf, n - adjust);
+  else
+    {
+      ptr->_errno = ENOSPC;
+      return EOF;
+    }
+  return n;
+}
+
+/* Seek to position POS relative to WHENCE within stream described by
+   COOKIE; return resulting position or fail with EOF.  */
+static _fpos_t
+_DEFUN(fmemseeker, (ptr, cookie, pos, whence),
+       struct _reent *ptr _AND
+       void *cookie _AND
+       _fpos_t pos _AND
+       int whence)
+{
+  fmemcookie *c = (fmemcookie *) cookie;
+#ifndef __LARGE64_FILES
+  off_t offset = (off_t) pos;
+#else /* __LARGE64_FILES */
+  _off64_t offset = (_off64_t) pos;
+#endif /* __LARGE64_FILES */
+
+  if (whence == SEEK_CUR)
+    offset += c->pos;
+  else if (whence == SEEK_END)
+    offset += c->eof;
+  if (offset < 0)
+    {
+      ptr->_errno = EINVAL;
+      offset = -1;
+    }
+  else if (offset > c->max)
+    {
+      ptr->_errno = ENOSPC;
+      offset = -1;
+    }
+#ifdef __LARGE64_FILES
+  else if ((_fpos_t)offset != offset)
+    {
+      ptr->_errno = EOVERFLOW;
+      offset = -1;
+    }
+#endif /* __LARGE64_FILES */
+  else
+    {
+      if (c->writeonly && c->pos < c->eof)
+	{
+	  c->buf[c->pos] = c->saved;
+	  c->saved = '\0';
+	}
+      c->pos = offset;
+      if (c->writeonly && c->pos < c->eof)
+	{
+	  c->saved = c->buf[c->pos];
+	  c->buf[c->pos] = '\0';
+	}
+    }
+  return (_fpos_t) offset;
+}
+
+/* Seek to position POS relative to WHENCE within stream described by
+   COOKIE; return resulting position or fail with EOF.  */
+#ifdef __LARGE64_FILES
+static _fpos64_t
+_DEFUN(fmemseeker64, (ptr, cookie, pos, whence),
+       struct _reent *ptr _AND
+       void *cookie _AND
+       _fpos64_t pos _AND
+       int whence)
+{
+  _off64_t offset = (_off64_t) pos;
+  fmemcookie *c = (fmemcookie *) cookie;
+  if (whence == SEEK_CUR)
+    offset += c->pos;
+  else if (whence == SEEK_END)
+    offset += c->eof;
+  if (offset < 0)
+    {
+      ptr->_errno = EINVAL;
+      offset = -1;
+    }
+  else if (offset > c->max)
+    {
+      ptr->_errno = ENOSPC;
+      offset = -1;
+    }
+  else
+    {
+      if (c->writeonly && c->pos < c->eof)
+	{
+	  c->buf[c->pos] = c->saved;
+	  c->saved = '\0';
+	}
+      c->pos = offset;
+      if (c->writeonly && c->pos < c->eof)
+	{
+	  c->saved = c->buf[c->pos];
+	  c->buf[c->pos] = '\0';
+	}
+    }
+  return (_fpos64_t) offset;
+}
+#endif /* __LARGE64_FILES */
+
+/* Reclaim resources used by stream described by COOKIE.  */
+static int
+_DEFUN(fmemcloser, (ptr, cookie),
+       struct _reent *ptr _AND
+       void *cookie)
+{
+  fmemcookie *c = (fmemcookie *) cookie;
+  _free_r (ptr, c->storage);
+  return 0;
+}
+
+/* Open a memstream around buffer BUF of SIZE bytes, using MODE.
+   Return the new stream, or fail with NULL.  */
+FILE *
+_DEFUN(_fmemopen_r, (ptr, buf, size, mode),
+       struct _reent *ptr _AND
+       void *buf _AND
+       size_t size _AND
+       const char *mode)
+{
+  FILE *fp;
+  fmemcookie *c;
+  int flags;
+  int dummy;
+
+  if ((flags = __sflags (ptr, mode, &dummy)) == 0)
+    return NULL;
+  if (!size || !(buf || flags & __SRW))
+    {
+      ptr->_errno = EINVAL;
+      return NULL;
+    }
+  if ((fp = __sfp (ptr)) == NULL)
+    return NULL;
+  if ((c = (fmemcookie *) _malloc_r (ptr, sizeof *c + (buf ? 0 : size)))
+      == NULL)
+    {
+      __sfp_lock_acquire ();
+      fp->_flags = 0;		/* release */
+#ifndef __SINGLE_THREAD__
+      __lock_close_recursive (fp->_lock);
+#endif
+      __sfp_lock_release ();
+      return NULL;
+    }
+
+  c->storage = c;
+  c->max = size;
+  /* 9 modes to worry about.  */
+  /* w/a, buf or no buf: Guarantee a NUL after any file writes.  */
+  c->writeonly = (flags & __SWR) != 0;
+  c->saved = '\0';
+  if (!buf)
+    {
+      /* r+/w+/a+, and no buf: file starts empty.  */
+      c->buf = (char *) (c + 1);
+      *c->buf = '\0';
+      c->pos = c->eof = 0;
+      c->append = (flags & __SAPP) != 0;
+    }
+  else
+    {
+      c->buf = (char *) buf;
+      switch (*mode)
+	{
+	case 'a':
+	  /* a/a+ and buf: position and size at first NUL.  */
+	  buf = memchr (c->buf, '\0', size);
+	  c->eof = c->pos = buf ? (char *) buf - c->buf : size;
+	  if (!buf && c->writeonly)
+	    /* a: guarantee a NUL within size even if no writes.  */
+	    c->buf[size - 1] = '\0';
+	  c->append = 1;
+	  break;
+	case 'r':
+	  /* r/r+ and buf: read at beginning, full size available.  */
+	  c->pos = c->append = 0;
+	  c->eof = size;
+	  break;
+	case 'w':
+	  /* w/w+ and buf: write at beginning, truncate to empty.  */
+	  c->pos = c->append = c->eof = 0;
+	  *c->buf = '\0';
+	  break;
+	default:
+	  abort ();
+	}
+    }
+
+  _flockfile (fp);
+  fp->_file = -1;
+  fp->_flags = flags;
+  fp->_cookie = c;
+  fp->_read = fmemreader;
+  fp->_write = fmemwriter;
+  fp->_seek = fmemseeker;
+#ifdef __LARGE64_FILES
+  fp->_seek64 = fmemseeker64;
+  fp->_flags |= __SL64;
+#endif
+  fp->_close = fmemcloser;
+  _funlockfile (fp);
+  return fp;
+}
+
+#ifndef _REENT_ONLY
+FILE *
+_DEFUN(fmemopen, (buf, size, mode),
+       void *buf _AND
+       size_t size _AND
+       const char *mode)
+{
+  return _fmemopen_r (_REENT, buf, size, mode);
+}
+#endif /* !_REENT_ONLY */
Index: libc/stdio/stdio.tex
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdio/stdio.tex,v
retrieving revision 1.9
diff -u -p -r1.9 stdio.tex
--- libc/stdio/stdio.tex	19 Jul 2007 03:42:21 -0000	1.9
+++ libc/stdio/stdio.tex	19 Jul 2007 22:55:26 -0000
@@ -37,6 +37,7 @@ structure.
 * fgetpos::     Record position in a stream or file
 * fgets::       Get character string from a file or stream
 * fileno::      Get file descriptor associated with stream
+* fmemopen::    Open a stream around a fixed-length buffer
 * fopen::       Open a file
 * fopencookie:: Open a stream with custom callbacks
 * fputc::       Write a character on a stream or file
@@ -124,6 +125,9 @@ structure.
 @include stdio/fileno.def
 
 @page
+@include stdio/fmemopen.def
+
+@page
 @include stdio/fopen.def
 
 @page
Index: libc/stdio/Makefile.am
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdio/Makefile.am,v
retrieving revision 1.26
diff -u -p -r1.26 Makefile.am
--- libc/stdio/Makefile.am	13 Jul 2007 17:07:28 -0000	1.26
+++ libc/stdio/Makefile.am	19 Jul 2007 22:55:26 -0000
@@ -117,6 +117,7 @@ ELIX_4_SOURCES = \
 	asnprintf.c		\
 	diprintf.c		\
 	dprintf.c		\
+	fmemopen.c		\
 	fopencookie.c		\
 	funopen.c		\
 	vasniprintf.c		\
@@ -179,6 +180,7 @@ CHEWOUT_FILES = \
 	fgetpos.def		\
 	fgets.def		\
 	fileno.def		\
+	fmemopen.def		\
 	fopen.def		\
 	fopencookie.def		\
 	fputc.def		\
@@ -244,6 +246,7 @@ $(lpfx)fclose.$(oext): local.h
 $(lpfx)fdopen.$(oext): local.h
 $(lpfx)fflush.$(oext): local.h
 $(lpfx)findfp.$(oext): local.h
+$(lpfx)fmemopen.$(oext): local.h
 $(lpfx)fopen.$(oext): local.h
 $(lpfx)fopencookie.$(oext): local.h
 $(lpfx)fputs.$(oext): fvwrite.h
Index: libc/include/stdio.h
===================================================================
RCS file: /cvs/src/src/newlib/libc/include/stdio.h,v
retrieving revision 1.47
diff -u -p -r1.47 stdio.h
--- libc/include/stdio.h	13 Jul 2007 20:37:53 -0000	1.47
+++ libc/include/stdio.h	19 Jul 2007 22:55:26 -0000
@@ -244,11 +244,9 @@ char *	_EXFUN(asnprintf, (char *, size_t
                _ATTRIBUTE ((__format__ (__printf__, 3, 4))));
 int	_EXFUN(asprintf, (char **, const char *, ...)
                _ATTRIBUTE ((__format__ (__printf__, 2, 3))));
-#ifndef dprintf
+#ifndef diprintf
 int	_EXFUN(diprintf, (int, const char *, ...)
                _ATTRIBUTE ((__format__ (__printf__, 2, 3))));
-int	_EXFUN(dprintf, (int, const char *, ...)
-               _ATTRIBUTE ((__format__ (__printf__, 2, 3))));
 #endif
 int	_EXFUN(fcloseall, (_VOID));
 int	_EXFUN(fiprintf, (FILE *, const char *, ...)
@@ -278,8 +276,6 @@ int	_EXFUN(vasprintf, (char **, const ch
                _ATTRIBUTE ((__format__ (__printf__, 2, 0))));
 int	_EXFUN(vdiprintf, (int, const char *, __VALIST)
                _ATTRIBUTE ((__format__ (__printf__, 2, 0))));
-int	_EXFUN(vdprintf, (int, const char *, __VALIST)
-               _ATTRIBUTE ((__format__ (__printf__, 2, 0))));
 int	_EXFUN(vfiprintf, (FILE *, const char *, __VALIST)
                _ATTRIBUTE ((__format__ (__printf__, 2, 0))));
 int	_EXFUN(vfiscanf, (FILE *, const char *, __VALIST)
@@ -306,7 +302,7 @@ int	_EXFUN(vsscanf, (const char *, const
 #endif /* !__STRICT_ANSI__ */
 
 /*
- * Routines in POSIX 1003.1.
+ * Routines in POSIX 1003.1:2001.
  */
 
 #ifndef __STRICT_ANSI__
@@ -330,6 +326,24 @@ int	_EXFUN(putchar_unlocked, (int));
 #endif /* ! __STRICT_ANSI__ */
 
 /*
+ * Routines in POSIX 1003.1:200x.
+ */
+
+#ifndef __STRICT_ANSI__
+# ifndef _REENT_ONLY
+int	_EXFUN(dprintf, (int, const char *, ...)
+               _ATTRIBUTE ((__format__ (__printf__, 2, 3))));
+FILE *	_EXFUN(fmemopen, (void *, size_t, const char *));
+/* getdelim - see __getdelim for now */
+/* getline - see __getline for now */
+/* open_memstream - unimplemented for now, but see funopen */
+/* renameat - unimplemented for now */
+int	_EXFUN(vdprintf, (int, const char *, __VALIST)
+               _ATTRIBUTE ((__format__ (__printf__, 2, 0))));
+# endif
+#endif
+
+/*
  * Recursive versions of the above.
  */
 
@@ -354,6 +368,7 @@ int	_EXFUN(_fiprintf_r, (struct _reent *
                _ATTRIBUTE ((__format__ (__printf__, 3, 4))));
 int	_EXFUN(_fiscanf_r, (struct _reent *, FILE *, const char *, ...)
                _ATTRIBUTE ((__format__ (__scanf__, 3, 4))));
+FILE *	_EXFUN(_fmemopen_r, (struct _reent *, void *, size_t, const char *));
 FILE *	_EXFUN(_fopen_r, (struct _reent *, const char *, const char *));
 int	_EXFUN(_fprintf_r, (struct _reent *, FILE *, const char *, ...)
                _ATTRIBUTE ((__format__ (__printf__, 3, 4))));
