This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
Re: Systemtap kernel backtraces not working on 4.14.14
- From: Florian Weimer <fweimer at redhat dot com>
- To: Avi Kivity <avi at scylladb dot com>, systemtap at sourceware dot org
- Date: Wed, 24 Jan 2018 20:04:39 +0100
- References: <4e07a614-b734-8d91-b18d-d6313981c191@scylladb.com>
On 01/24/2018 09:48 AM, Avi Kivity wrote:
> Maybe systemtap can't cope with retpolines?
We know that GCC generates incorrect unwind data for retpolines:
https://gcc.gnu.org/ml/gcc/2018-01/msg00160.html
But that report was derived from first principles, not from an observed
bug.
Now the odd thing here is that retpolines should not leave a trace on
the call stack once they have transferred control to the actual target
function. The incorrect unwind information only shows up temporarily,
prior to the jump. So it doesn't explain why the backtrace ends at
__schedule in your case.
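
As a rough illustration of why the thunk should leave nothing behind, here is a toy Python model of the stack during an indirect call through a retpoline. It assumes the commonly published thunk sequence (an inner call, a store of the target over the pushed return address, then ret). The function name and the string labels standing in for addresses are made up for this sketch, and the thunk the kernel actually uses after run-time patching may differ.

```python
# Toy model of the call stack around a retpoline thunk.
# All "addresses" are made-up string labels, not real kernel addresses.

def simulate_retpoline_call(return_addr, target):
    """Model an indirect call routed through a retpoline thunk.

    Returns (jumped_to, snapshots), where snapshots is a list of
    (description, stack_copy) pairs taken after each step.
    """
    stack = []
    snapshots = []

    # Caller: call <thunk> -- pushes the caller's return address.
    stack.append(return_addr)
    snapshots.append(("entered thunk", list(stack)))

    # Thunk: call <inner label> -- pushes the address of the speculation trap.
    stack.append("spec_trap")
    snapshots.append(("after inner call", list(stack)))

    # Thunk: store the real target over the slot just pushed.
    stack[-1] = target
    snapshots.append(("after overwriting top slot", list(stack)))

    # Thunk: ret -- pops the top slot and jumps to it.
    jumped_to = stack.pop()
    snapshots.append(("in target function", list(stack)))

    return jumped_to, snapshots
```

In this model the extra stack slot exists only in the two middle snapshots. Once control reaches the target, the stack holds just the caller's return address, the same as after a plain indirect call.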
However, the kernel might use a different retpoline thunk. Can you
capture a vmcore or something like that, to obtain the actual machine
code after run-time patching? And perhaps figure out the caller of
__schedule and disassemble that as well?
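
For mapping a return address found on the stack back to the calling function, a minimal sketch of symbol resolution against /proc/kallsyms-style text might look like the following. The helper names and every address in the sample are hypothetical. On a real system the addresses would come from the vmcore or from /proc/kallsyms, which may show zeroed addresses to unprivileged users.

```python
import bisect

# Hypothetical sample in /proc/kallsyms format: "address type name".
# These addresses are invented for illustration only.
SAMPLE = """\
ffffffff81000000 T _text
ffffffff818a1230 t __schedule
ffffffff818a1900 T schedule
ffffffff818a19a0 T schedule_idle
"""

def parse_kallsyms(text):
    """Parse kallsyms-style lines into a list of (address, name) sorted by address."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(fields[0], 16), fields[2]))
    symbols.sort()
    return symbols

def resolve(symbols, addr):
    """Return (name, offset) for the closest symbol at or below addr, or None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base
```

With the sample above, `resolve(parse_kallsyms(SAMPLE), 0xffffffff818a1935)` yields the symbol `schedule` at offset 0x35, which would be the function to disassemble next. Note that this picks the nearest preceding symbol and does not check symbol sizes, so an address past the end of the last function still resolves to it.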
There were some kernel fixes on master related to retpolines and
instrumentation:

commit c86a32c09f8ced67971a2310e3b0dda4d1749007
Author: Masami Hiramatsu <mhiramat@kernel.org>
Date:   Fri Jan 19 01:15:20 2018 +0900

    kprobes/x86: Disable optimizing on the function jumps to indirect thunk

commit c1804a236894ecc942da7dc6c5abe209e56cba93
Author: Masami Hiramatsu <mhiramat@kernel.org>
Date:   Fri Jan 19 01:14:51 2018 +0900

    kprobes/x86: Blacklist indirect thunk functions for kprobes

commit 736e80a4213e9bbce40a7c050337047128b472ac
Author: Masami Hiramatsu <mhiramat@kernel.org>
Date:   Fri Jan 19 01:14:21 2018 +0900

    retpoline: Introduce start/end markers of indirect thunk

Thanks,
Florian