arc4random - are you sure we want these?
Jason A. Donenfeld
Jason@zx2c4.com
Sat Jul 23 16:25:21 GMT 2022
[Resending to right address.]
Hi glibc developers,
I learned about the addition of the arc4random functions in glibc this
morning, thanks to Phoronix. I wish somebody had CC'd me on those
discussions before it got committed, but here we are.
I really wonder whether this is a good idea, whether this is something
that glibc wants, and whether it's a design worth committing to in the
long term.
Firstly, for what use cases does this actually help? As of recent
changes to the Linux kernel -- now backported all the way to 4.9! --
getrandom() and /dev/urandom are extremely fast and operate over per-cpu
states locklessly. Sure, you avoid a syscall by doing that in userspace,
but does it really matter? Who exactly benefits from this?
Seen that way, it seems like a lot of complexity for nothing, and
complexity that will lead to bugs and various oversights eventually.
For example, the kernel reseeds itself when virtual machines fork using
an identifier passed to the kernel via ACPI. It also reseeds itself on
system resume, not only from ordinary S3 sleep but also, more importantly,
from hibernation. And in general, being the arbiter of entropy, the
kernel is much better poised to determine when it makes sense to reseed.
Glibc, on the other hand, can employ some heuristics and make some
decisions -- on fork, after 16 MiB, and the like -- but in general these
are lacking, compared to the much wider array of information the kernel
has.
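To make the limitation concrete, here is a minimal sketch of the kind of
userspace reseed check described above. It is purely illustrative and is
not glibc's implementation: struct rng_state, rng_mark_reseeded(),
rng_needs_reseed() and RESEED_BYTES are hypothetical names, and the
getpid() comparison is an assumed stand-in for whatever fork detection a
libc really uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical threshold: reseed after 16 MiB of output. */
#define RESEED_BYTES ((size_t)16 * 1024 * 1024)

/* Hypothetical bookkeeping a userspace generator might keep. */
struct rng_state {
    pid_t  seeded_pid;          /* process that last seeded the state */
    size_t bytes_since_reseed;  /* output produced since that reseed */
};

/* Record that the state was just reseeded in the current process. */
static void rng_mark_reseeded(struct rng_state *st)
{
    st->seeded_pid = getpid();
    st->bytes_since_reseed = 0;
}

/* Return true if producing `want` more bytes should trigger a reseed. */
static bool rng_needs_reseed(const struct rng_state *st, size_t want)
{
    /* Heuristic 1: the state was seeded by a different process (fork). */
    if (st->seeded_pid != getpid())
        return true;

    /* Heuristic 2: the output budget is, or would be, exhausted. */
    if (st->bytes_since_reseed >= RESEED_BYTES ||
        want > RESEED_BYTES - st->bytes_since_reseed)
        return true;

    /*
     * Nothing here can notice a virtual machine fork or a resume from
     * hibernation; only the kernel sees those events.
     */
    return false;
}
```

Both checks rely only on what the process can observe about itself, which
is exactly the narrower view being contrasted with the kernel's.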
You miss out on this with arc4random, and if that information _is_ to be
exported to userspace somehow in the future, it would be awfully nice to
design the userspace interface alongside the kernel one.
For that reason, past discussion of having some random number generation
in userspace libcs has geared toward doing this in the vDSO, somehow,
where the kernel can be part and parcel of that effort.
Seen from this perspective, going with OpenBSD's older paradigm might be
rather limiting. Why not work together, between the kernel and libc, to
see if we can come up with something better, before settling on an
interface with semantics that are hard to walk back later?
As-is, it's hard to recommend that anybody really use these functions.
Just keep using getrandom(2), which has mostly favorable semantics.
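For reference, a minimal sketch of calling getrandom(2) directly,
assuming a glibc new enough to ship <sys/random.h> (2.25 or later).
fill_random() is a hypothetical helper name, not an existing API; with
flags set to 0 the call blocks only until the kernel's generator has been
initialized once.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Hypothetical helper: fill buf with len random bytes from the kernel.
 * Returns 0 on success, -1 on failure with errno set by getrandom().
 */
static int fill_random(void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = getrandom(p, len, 0);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;
        }
        /* Large requests may be satisfied partially; keep going. */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would use it as fill_random(key, sizeof key) and check for a zero
return before using the buffer.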
Yes, I get it: it's fun to make a random number generator, and so lots
of projects figure out some way to make yet another one somewhere
somehow. But the tendency to do so feels like a weird computer tinkerer
disease rather than something that has ever helped the overall ecosystem.
So I'm wondering: who actually needs this, and why? What's the
performance requirement like, and why is getrandom(2) insufficient? And
is this really the best approach to take? If this is something needed,
how would you feel about working together on a vDSO approach instead? Or
maybe nobody actually needs this in the first place?
And secondly, is there any way that glibc can *not* do this, or has that
ship fully sailed, and I really missed out by not being part of that
discussion whenever it was happening?
Thanks,
Jason