RFE: enable buffering on null-terminated data

Zachary Santer zsanter@gmail.com
Tue Mar 12 03:34:39 GMT 2024


On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist <edquist@cs.wisc.edu> wrote:
>
> (In my coprocess management library, I effectively run every coproc with
> --output=L by default, by eval'ing the output of 'env -i stdbuf -oL env',
> because most of the time for a coprocess, that's what's wanted/necessary.)

Surrounded by 'set -a' and 'set +a', I guess? Now that's interesting.
I just added that to a script of mine that runs another command,
generally a build script, and prints each line of its output to the
terminal, overwriting the same line over and over. I want to see
whether the line updates more continuously that way.
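
For anyone following along, here is a minimal sketch of what I take that
trick to be. It assumes GNU coreutils stdbuf is on the PATH; the function
name is made up for illustration. With GNU stdbuf, the variables that end
up exported are typically LD_PRELOAD and _STDBUF_O, but that is an
implementation detail rather than a documented interface.

```shell
# Sketch only: assumes GNU coreutils stdbuf is installed.
# enable_line_buffered_children is a hypothetical helper name.
enable_line_buffered_children() {
    set -a                            # auto-export every variable assigned from here on
    # 'env -i' starts from an empty environment, so the inner 'env' prints
    # only the variables that stdbuf itself added for '-oL'.
    eval "$(env -i stdbuf -oL env)"
    set +a                            # stop auto-exporting
}

enable_line_buffered_children
# Dynamically linked stdio programs started from this shell afterwards
# should now line-buffer stdout, as if each were run under 'stdbuf -oL'.
```

Programs that set their own buffering, or that don't use stdio for
output, are unaffected, same as with stdbuf itself.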

> ... Although, for your example coprocess use, where the shell both
> produces the input for the coproc and consumes its output, you might be
> able to simplify things by making the producer and consumer separate
> processes.  Then you could do a simpler 'producer | filter | consumer'
> without having to worry about buffering at all.  But if the producer and
> consumer need to be in the same process (eg they share state and are
> logically interdependent), then yeah that's where you need a coprocess for
> the filter.

Yeah, there's really no way to break what I'm doing into a standard pipeline.

> (Although given your time output, you might say the performance hit for
> unbuffered is not that huge.)

We see a somewhat bigger difference, at least proportionally, if we
get bash more or less out of the way. See command-buffering, attached.

Standard:
real    0m0.202s
user    0m0.280s
sys     0m0.076s
Line-buffered:
real    0m0.497s
user    0m0.374s
sys     0m0.545s
Unbuffered:
real    0m0.648s
user    0m0.544s
sys     0m0.702s

In coproc-buffering, unbuffered output was 21.7% slower than
line-buffered output, whereas here it's 30.4% slower.
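
For reference, the percentage above comes from the 'real' times. A small
sketch of the arithmetic, with a made-up helper name:

```shell
# pct_slower SLOW FAST: how much slower SLOW is than FAST, in percent.
# Hypothetical helper; uses awk for the floating-point math.
pct_slower() {
    awk -v slow="$1" -v fast="$2" \
        'BEGIN { printf "%.1f\n", (slow / fast - 1) * 100 }'
}

pct_slower 0.648 0.497   # unbuffered vs. line-buffered, prints 30.4
```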

Of course, using line-buffered or unbuffered output in this situation
makes no sense. Where it might be useful in a pipeline is when an
earlier command in a pipeline might only print things occasionally,
and you want those things transformed and printed to the command line
immediately.
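
A rough sketch of that situation, assuming GNU coreutils stdbuf and tr.
Both function names are invented for the example. The trailing 'cat'
is only there so that the filter's stdout is a pipe rather than the
terminal, since stdio already line-buffers when writing to a terminal.

```shell
# Hypothetical producer that only prints something once in a while.
slow_producer() {
    echo "first"
    sleep 1
    echo "second"
}

# Filter whose stdout is line-buffered even though it is a pipe.
live_upcase() {
    stdbuf -oL tr a-z A-Z
}

# Each line shows up as soon as the producer emits it, instead of
# everything arriving when tr's buffer fills or tr exits.
slow_producer | live_upcase | cat
```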

> So ... again in theory I also feel like a null-terminated buffering mode
> for stdbuf(1) (and setbuf(3)) is kind of a missing feature.

My assumption is that line-buffering through setbuf(3) was implemented
for printing to the command line, so its availability to stdbuf(1) is
just a useful side effect.

In the BUGS section in the man page for stdbuf(1), we see:

    On GLIBC platforms, specifying a buffer size, i.e., using fully
    buffered mode will result in undefined operation.

If I'm not mistaken, then buffer modes other than 0 and L don't
actually work. Maybe I should count my blessings here. I don't know
what's going on in the background that would explain glibc not
supporting any of that, or stdbuf(1) implementing features that aren't
supported on the vast majority of systems where it will be installed.

> It may just
> be that nobody has actually had a real need for it.  (Yet?)

I imagine if anybody has, they just set --output=0 and moved on. Bash
scripts aren't the fastest thing in the world, anyway.
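
A sketch of that workaround with null-terminated records, again assuming
GNU coreutils stdbuf and tr; the function name is hypothetical. With
--output=0 the filter's stdio output is passed along unbuffered, so it
doesn't matter that the record delimiter isn't a newline.

```shell
# Hypothetical filter: upper-cases null-terminated records, unbuffered.
upcase_records() {
    stdbuf -o0 tr a-z A-Z
}

# Feed two null-terminated records through the filter, then turn the
# nulls into newlines just so the result is readable on a terminal.
printf 'alpha\0beta\0' | upcase_records | tr '\0' '\n'
```

The cost is the extra write calls, which is where the higher sys time
for the unbuffered case would come from.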
-------------- next part --------------
A non-text attachment was scrubbed...
Name: command-buffering
Type: application/octet-stream
Size: 887 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/libc-alpha/attachments/20240311/f2e9faab/attachment.obj>

