This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
Possibly a bug in glibc around the getrandom(2) implementation.
- From: Marcin Mielniczuk <marmistrz dot dev at zoho dot eu>
- To: libc-help at sourceware dot org
- Date: Fri, 14 Jul 2017 15:11:18 +0200
- Subject: Possibly a bug in glibc around the getrandom(2) implementation.
Hi!
While developing a ptrace-based utility, I came across some really weird
behaviour.
Much research on this topic led me to believe that what I was seeing is
a bug in either glibc or the compiler.
I'm starting here, since glibc is a little higher-level.
First of all, I'm eager to trace this one deeper if you guide me. This
is basically part of the project I'm working on, and I'm certainly
willing to keep hunting the bug.
Short statement of the problem: a Python script run by the CPython
interpreter smashes the stack if it is traced using ptrace. More details
below.
I developed a small utility to trace and intercept the getrandom(2)
syscall. Its original implementation is in Rust, but for the sake of
debugging I rewrote the code in C. Either way, it uses the same official
ptrace API.
The C utility [1] invokes a Python script [2]. If the Python script is
invoked standalone (without the tracing program), it runs correctly to
the end. The same happens if the shebang is changed to
#!/usr/bin/python
i.e. the interpreter is specified directly. If the shebang goes through
`env`, on the other hand, the script exits with a spectacular error
message about the stack being smashed. [3]
What's even better - it happens only if env is the direct child of the
tracer, e.g. if the tracer execs `valgrind ./pi.py`, everything works
perfectly (remember that ./pi.py will invoke env!),
if the tracer execs `env valgrind ./pi.py` - everything explodes.
The stack is indeed being smashed! If I patch CPython so that it prints
the address of the stack variable it passes to getrandom(2), that
address is exactly the same as the one my utility prints, and analyzing
the memory map shows that the buffer is only partially contained in the
stack. So the stack protector is doing its job.
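The memory-map check described above can be sketched roughly as follows.
This is only an illustration, assuming a Linux /proc filesystem and that
the main thread's stack is the line labelled "[stack]" in
/proc/<pid>/maps. The names classify_range and find_stack_mapping are
hypothetical and are not taken from the utility in [1].

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* How a byte range relates to a mapping [lo, hi). */
enum overlap { OUTSIDE, PARTIAL, INSIDE };

/* Classify the byte range [addr, addr + len) against the mapping [lo, hi).
 * Hypothetical helper; address wrap-around is not handled. */
enum overlap classify_range(unsigned long lo, unsigned long hi,
                            unsigned long addr, unsigned long len)
{
    unsigned long end = addr + len;

    if (end <= lo || addr >= hi)
        return OUTSIDE;
    if (addr >= lo && end <= hi)
        return INSIDE;
    return PARTIAL;
}

/* Look for the "[stack]" line in /proc/<pid>/maps and store its bounds.
 * Returns 0 on success, -1 if the file or the line cannot be found. */
int find_stack_mapping(pid_t pid, unsigned long *lo, unsigned long *hi)
{
    char path[64], line[512];
    FILE *f;
    int found = -1;

    assert(lo != NULL && hi != NULL);
    snprintf(path, sizeof path, "/proc/%ld/maps", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Each line starts with "start-end" in hexadecimal. */
        if (strstr(line, "[stack]") != NULL &&
            sscanf(line, "%lx-%lx", lo, hi) == 2) {
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

A tracer could call find_stack_mapping() with the tracee's pid and then
pass the buffer address and length taken from the syscall arguments to
classify_range(); a PARTIAL result would correspond to the situation
described above.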
Moreover, if I override getrandom(2) using the LD_PRELOAD trick (create
a non-static function with the same signature and LD_PRELOAD it),
everything works correctly.
Even if I call the getrandom syscall in my function using syscall(2),
everything works perfectly. In other words, bypassing the glibc
getrandom(2) wrapper is a workaround.
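For reference, an override of this kind might look roughly like the
sketch below. It assumes Linux with kernel headers that define
SYS_getrandom, and it is an illustration rather than the exact code used
in the experiment.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getrandom */
#include <sys/types.h>     /* ssize_t */
#include <unistd.h>        /* syscall() */

/*
 * Same name and signature as the glibc wrapper, so that the dynamic
 * linker resolves calls to this definition when the shared object is
 * listed in LD_PRELOAD.
 */
ssize_t getrandom(void *buf, size_t buflen, unsigned int flags)
{
    /* Enter the kernel through the generic syscall(2) wrapper instead
     * of the dedicated glibc getrandom wrapper. */
    return syscall(SYS_getrandom, buf, buflen, flags);
}
```

With hypothetical file names, this could be built with
`gcc -shared -fPIC -o libshim.so shim.c` and used as
`LD_PRELOAD=./libshim.so ./pi.py`.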
I was suspecting a CPython bug and my research makes me believe it's
not. The getrandom(2) syscall is invoked from the random_seed_urandom
[4] function.
It declares a stack variable which is passed through
_PyOS_URandomNonblock [5], pyurandom [6] and py_getrandom [7] to the
getrandom(2) glibc wrapper [8].
There's basically nothing suspicious in this code. But the real magic
happened when I tried to add some logging on the CPython side.
Adding any printf statement anywhere after the getrandom call resulted
in... an immediate fix! Suddenly, everything started to work as expected.
I have a hypothesis with no evidence that for some reason the stack
space is freed too early but a read succeeds, since it doesn't modify
the values left by the stack protector.
If I rebuild CPython with clang instead of gcc, getrandom(2) returns -1,
indicating an error. Unfortunately, I didn't manage to find a way to
peek errno from the traced process.
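One possible way to get at the error code without reading errno itself,
assuming x86-64 Linux: on failure the kernel places the negated errno
value in the return register (RAX), which a tracer can read at the
syscall-exit stop, e.g. with PTRACE_GETREGS. The helper name below is
hypothetical, and the sketch only covers decoding the raw value.

```c
/* Largest errno value the Linux kernel encodes in a raw syscall return
 * (MAX_ERRNO in the kernel sources). */
#define MAX_SYSCALL_ERRNO 4095

/*
 * raw is the value of the return register at the syscall-exit stop
 * (RAX on x86-64). Hypothetical helper: returns the errno value the
 * C library would store, or 0 if the syscall succeeded.
 */
int errno_from_raw_return(long long raw)
{
    if (raw < 0 && raw >= -MAX_SYSCALL_ERRNO)
        return (int)-raw;
    return 0;
}
```

glibc only turns this raw value into a -1 return plus errno after the
syscall has returned to user space, so the tracer sees the error code
before the traced process does.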
Everything I described happened even on an unoptimized build of CPython
(-O0).
Another thing I noticed is that if the execution chain is tracer ->
python, the RDI and RSI registers, which contain the syscall arguments,
remain valid until the syscall exit.
If tracer -> env -> python is used, on the other hand, the arguments are
invalidated in the chain of execution and the registers contain
something else.
My environment: Arch Linux, kernel 4.9.36-1-lts, glibc 2.25, Python
3.6.1 (and 3.7.0 from git).
The same error was reproduced on a similar setup but with kernel
4.11.5-1-ARCH.
Do you have any ideas about what to do next?
Regards,
Marcin Mielniczuk
[1] https://gist.github.com/marmistrz/56eac71d3cb65fb22caa5de1c95300e3
[2] https://gist.github.com/marmistrz/787858bcc72884aff1cf881f45b8e962
[3] https://gist.github.com/marmistrz/5a26cecf438b592afcd2ce950609cba0
[4] https://github.com/python/cpython/blob/master/Modules/_randommodule.c#L204
[5] https://github.com/python/cpython/blob/master/Python/bootstrap_hash.c#L531
[6] https://github.com/python/cpython/blob/master/Python/bootstrap_hash.c#L465
[7] https://github.com/python/cpython/blob/master/Python/bootstrap_hash.c#L98
[8] https://github.com/python/cpython/blob/master/Python/bootstrap_hash.c#L125