There are at least two implementation bugs in the random_r class of functions.
First, the random_data structure, being opaque, is typically just allocated on
the stack or malloc()ed. There is no way to create a valid "this is not an old
state vector" structure without knowing the implementation details of the
structure. If the compiler does not initialize the "state" member of the
structure to a value equal to the NULL pointer, the initstate_r() function will
crash as it dereferences whatever non-NULL but invalid value is held there. We
need a method to create a blank random_data object, or the documentation has to
point out that the user is responsible for initializing the "state" member to NULL.
Second, initstate_r() and setstate_r() are documented in their comment blocks as
returning a pointer to the old state, but they do not. There is no documented
way to retrieve the old state in a fashion that allows it to be re-introduced
into the system with setstate_r() at a later time. As such, the setstate_r()
function is essentially unusable.
If you want to see documentation changes you do the work.
Created attachment 1448 [details]
Add documentation necessary to use initstate_r() and setstate_r()
This fixes the incorrect comments in the .c file and adds necessary details to
the .texi file allowing the user to make use of initstate_r() and setstate_r(),
assuming that the implementation of struct random_data remains stable.
I hit this bug today. Can the patch be deployed?
(What does the SUSPENDED but status mean?)
The user should always initialize the entire structure with NUL bytes. No need
to know the members of the structure, just the size is needed. It shouldn't be
documented that looking at the members is needed.
If you update the patch change the state of the bug. I don't see suspended bugs
in my queries.
I have applied the patch to random_r.c.
Changing the comments isn't really going to fix the issue. The problem is that
the random_r class of functions are entirely broken to people who do not know
the implementation details of the random_data structure.
For instance, without knowing the innards of random_data, it's impossible to
call srandom_r() without getting a segmentation violation (try it!). The
random_data structure cannot be just set to all bytes zero, because the "state"
member of it has to point to a valid external buffer. srandom_r() can't malloc
this itself because there's no destructor-equivalent to release the data when
we're finished (performing the role that regfree() does in the regex code).
So, without srandom_r(), all you're left with is initstate_r(). This function
will work if you hand it a random_data structure set to all zeroes, but what the
documentation doesn't tell you is that it then modifies something inside
random_data to point to the buffer you passed as state. This is deadly, it
means if the buffer you created to hold the incoming state was an auto variable,
the random_r functions will be using released stack space as their internal
scratch space, merrily stomping on stack memory every time the user asks for
another random number.
So, without knowing the internal details of random_data, the only way to start
up the random number generator is with initstate_r(), in which case it is
critically important (but undocumented) that the second argument be of static
duration and valid throughout the life of the program. Even then, there is no
documented way to call setstate_r() correctly.
I would suggest that the random_data structure should be redesigned. Rather
than "state" being a pointer to an external entity, the entire state vector
should be stored within random_data. In this way, you can safely manipulate it
without wondering whether the memory it points to is valid.
I can supply a patch to do this, if you agree that this is the way to proceed.
(In reply to comment #5)
> I would suggest that the random_data structure should be redesigned.
I already said no. The interfaces are used internally and that's really their
only purpose. Design and implement your own library with randomization
functions if you must. If you don't want to provide any more patches to the
documentation the bug might as well be closed.
So, if I understand your objection, you feel that the random_r class of
functions is used for the internal implementation details of the non-reentrant
functions. You feel that the internal details of random_data should not be
exposed to the users, and that only changes to documentation be used to address
the issues brought up in this bug report.
I am not trying to be difficult here, I'm trying to help. I will write and
attach a patch that does the following:
1) Remove the documentation for srandom_r() because without an understanding of
the internal implementation of random_data there is literally no way to call
this function without segfaulting. Instead I will note that this function is
for internal use only and must not be called by the user.
2) Document that initstate_r() must be given a static-duration buffer for its
second argument. Also document that the previous value of the state information
is not returned by this function, as is currently claimed.
3) Document the way setstate_r() might be usable in the context of
initstate_r(), based on my reading of the source code.
Note that this patch will not fix the random_r.3 manpage, which will continue to
claim that srandom_r() is a usable function.
Here is what the patch will NOT do, but I'd like to record these thoughts
alongside this bug for future reference. I suggest that if the random_data
structure were augmented with an array of length 256 appended to the end,
correctly aligned for integer access, that the srandom_r() function could be
rescued in a backward-compatible way by making 'state' point to this new
internal buffer. The initstate_r() function could also be fixed up, still
backwardly compatible with older binaries, with the old state being returned in
the passed buffer. Further, a random_data structure would now be a
self-contained object that could be written to disk and loaded at a later time,
unlike the current condition where the structure contains a pointer that points
outside itself, to a block of unknown length.
Created attachment 4415 [details]
Change documentation to reflect realities of the random_r() group of functions
I've made the changes suggested in my previous comment. I also came to the
conclusion that setstate_r() isn't usable without a way to get a value back
from initstate_r(), so I've documented that function as also being for
glibc is misspelled "glic" in the patch.
Through a long and painful process of debugging and hair-pulling I also arrived at this issue.
I am now in a state of weary disbelief.
> I already said no. The interfaces are used internally and that's really
> their only purpose. Design and implement your own library with randomization
> functions if you must. If you don't want to provide any more patches to
> the documentation the bug might as well be closed.
WTF are you talking about? These functions are exposed and documented as normal library functions just like all the others:
random_r, srandom_r, initstate_r, setstate_r - reentrant
random number generator
These functions are the reentrant equivalents of the functions
described in random(3). They are suitable for use in multithreaded
programs where each thread needs to obtain an independent,
reproducible sequence of random numbers.
This is a malevolent segfault booby-trap!
I can't believe the maintainers of glibc don't care.