Today I was trying to filter my Apache access.log with some coreutils
and was annoyed by the default buffering applied by glibc.
I was running `tail -f ~/access.log | cut ... | uniq`, but I only
got output once cut had written more than 4K to stdout.
So how to control this? Well, each app could add an extra config
parameter (see grep --line-buffered for example), but this doesn't
seem general, and it requires duplicating both logic and documentation
in each application. What would be ideal IMHO would be to
add the config logic to glibc (where it would have to be controlled
with environment variables). There seems to be resistance to that though.
Anyway, whether it's implemented in libc or in the application (coreutils lib),
I think they should have the same config interface, which would
be environment variables with something like the following format:
BUF_X_=Y
Where X = the fd number
and Y = 0 for unbuffered, 1 for line buffered and >1 for a specific
buffer size.
So for my particular problem I could do:
tail -f ~/access.log | BUF_1_=1 cut ... | uniq
Hell, no. Programs expect a certain buffer mode and might behave
unexpectedly if it changes. By setting a mode to unbuffered, for instance,
you can easily DoS a system. I can think of enough other reasons why this is
a terrible idea. Programs must explicitly request a buffering scheme so that it
matches the way the program uses the stream.
According to https://blog.plover.com/Unix/stdio-buffering-2.html , this feature was in fact added to NetBSD (see also https://mail-index.netbsd.org/tech-userlevel/2015/07/14/msg009247.html et seq) and has not caused problems there. I think we should reconsider. Yes, a program (or an entire process tree) could be forced to suffer much greater I/O overhead using this feature, but I don't think it rises to the level of "denial of service": there are other ways for a determined Mallory to get the same effect. There are also obvious positive use cases, e.g. manually overriding the output of a long-running program to be line-buffered even if it's going to a pipe.
I agree that this is a debugging facility (not unlike MALLOC_CHECK_) that can be very handy at times.