open_memstream() ‒ NUL injected on seek/write that doesn't bump the size, I think it can't be per POSIX (and other implementations don't)

DJ Delorie dj@redhat.com
Thu Sep 19 02:20:22 GMT 2024


наб <nabijaczleweli@nabijaczleweli.xyz> writes:
> On Wed, Sep 18, 2024 at 08:34:05PM -0400, DJ Delorie wrote:
>> fseek() sets the file position.
>> 
>> After fclose(), the size of the data is the smaller of ... the number of
>> bytes between the beginning of the buffer and the current file position
>> indicator.
> (unclear to me what you're quoting here, but POSIX doesn't say this;
>  it only notes the value of *sizep)

https://pubs.opengroup.org/onlinepubs/9799919799/functions/open_memstream.html

  "After a successful fflush() or fclose(), the pointer referenced by
   bufp shall contain the address of the buffer, and the variable
   pointed to by sizep shall contain the smaller of the current buffer
   length and the number of bytes for open_memstream(), or the number of
   wide characters for open_wmemstream(), between the beginning of the
   buffer and the current file position indicator."
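
A minimal sketch of what that paragraph implies for *sizep, assuming a
POSIX.1-2008 libc.  The helper name size_after_seek_and_close() is made
up for illustration; it only reports the size and deliberately says
nothing about the bytes past it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write `text`, seek back to `pos`, close the
   stream, and return the size reported through sizep. */
static size_t size_after_seek_and_close(const char *text, long pos)
{
    char *buf = NULL;
    size_t size = 0;
    FILE *fp = open_memstream(&buf, &size);
    assert(fp != NULL);

    fputs(text, fp);
    fflush(fp);                 /* size now covers all of text */
    fseek(fp, pos, SEEK_SET);   /* move the position indicator back */
    fclose(fp);                 /* size = min(buffer length, position) */

    free(buf);
    return size;
}
```

Calling it with "hello world" and a position of 5 should report 5, not
11, which is the truncation discussed here.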

>> So if you fseek() to the middle of the written data, and fclose(),
>> you've truncated the "file".
> Yes, but I have /not/ updated the now-one-past-ftell byte to be NUL.
> NULs are only supposed to be inserted when a write extends the buffer (ll. 51168-51171).
>
> If we interpret POSIX to mean that after fflush the valid range is
> [*bufp, *bufp + *sizep], then with *sizep = 7, the 8-byte result is

  [*bufp, *bufp + *sizep - 1]
or
  [*bufp, *bufp + *sizep)

If *sizep == 7, the result is 7 bytes.
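
As a rough illustration of the half-open range, assuming the usual
behaviour that the terminating NUL sits just past the reported size
and is not counted in it.  size_excludes_trailing_nul() is a
hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if, after fflush(), the stream reports
   exactly strlen(text) bytes and the byte one past them is NUL. */
static int size_excludes_trailing_nul(const char *text)
{
    char *buf = NULL;
    size_t size = 0;
    FILE *fp = open_memstream(&buf, &size);
    assert(fp != NULL);

    fputs(text, fp);
    fflush(fp);                     /* updates buf and size */

    int ok = size == strlen(text)   /* data is [buf, buf + size) */
          && buf[size] == '\0';     /* terminator lies outside that range */

    fclose(fp);
    free(buf);
    return ok;
}
```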

> Now, this is already a very generous-to-glibc interpretation,
> because POSIX doesn't actually say where the result... actually is.
>
> To me, as I read this, it /actually/ resides in
>   [*bufp, *bufp + max-ever-ftell) + a one-past-the-end NUL
> since the buffer is never said to be resized downwards:
> whatever *sizep is set to is irrelevant

Please open a bug with the Austin group asking them to clarify this point.

> (it's convenient if you want to terminate the buffer
>  at where you left the cursor I suppose),
> but has no bearing on the buffer contents.

The API has no way of telling you what the max-ever-ftell was, and you
can't use the trailing NUL to find it as fseek and fwrite can add NULs
themselves.  Which means the only way to know what max-ever-ftell is
after you fclose() it is to keep track of it yourself, and if that's the
case, just fseek() to it before the fclose(), like the POSIX example
does.
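
Something along these lines, modelled on the example in the POSIX page
but not copied from it.  overwrite_start_keep_end() is a hypothetical
helper, and it assumes the replacement text is no longer than the
original.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write `text`, overwrite its start with `prefix`,
   then restore the position to the remembered end before closing so the
   reported size still covers everything.  Caller frees the result. */
static char *overwrite_start_keep_end(const char *text, const char *prefix,
                                      size_t *size)
{
    char *buf = NULL;
    FILE *fp = open_memstream(&buf, size);
    assert(fp != NULL);
    assert(strlen(prefix) <= strlen(text));

    fputs(text, fp);
    off_t end = ftello(fp);      /* remember the end ourselves */
    fseeko(fp, 0, SEEK_SET);
    fputs(prefix, fp);           /* overwrite in place, no growth */
    fseeko(fp, end, SEEK_SET);   /* go back to the end before closing */
    fclose(fp);

    return buf;
}
```

With "hello my world" and "good-bye" this should leave a 14-byte
buffer whose start has been overwritten and whose tail is intact.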

> This makes natural sense if you consider this is a file API,

It's not.  It's a stream API.

> and files don't magically get truncated (or mangled) because you
> sought in them.

That is not guaranteed to be true across all operating systems.


