This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] x86_64: Expand CFI to cover clone after the syscall
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: libc-alpha at sourceware dot org
- Date: Wed, 14 Dec 2016 17:56:00 -0200
- Subject: Re: [PATCH] x86_64: Expand CFI to cover clone after the syscall
- Authentication-results: sourceware.org; auth=none
- References: <20161212052909.GA11750@juliacomputing.com>
On 12/12/2016 03:29, Keno Fischer wrote:
> The CFI used to terminate after the syscall, leaving unwinding for the
> rest of the function up to the debugger's imagination. The comment states
> that the reason for this is that the unwind info would be incorrect in the
> child. However, without unwind info, both the child and the parent process
> may fail to unwind properly after the syscall (though some debuggers still
> have heuristics that work for the parent case). To improve this situation,
> use a DWARF impression that checks %rax, and if non-zero (i.e. we're in
> the parent, behaves like the CFI for the rest of the function). If %rax==0,
> the CFI indicates to unwind to thread_start. Ideally, I would have liked
> to have it undefine %rip, but it does not look like that is possible.
> However, even if not entirely correct, I think this version is a strict
> improvement over what was there before.
>
> E.g. in GDB:
> Backtrace before:
> #0 0x00007f14a1119081 in clone () from /lib/x86_64-linux-gnu/libc.so.6
> #1 0x00007f14a13df640 in ?? () from /lib/x86_64-linux-gnu/libpthread.so.0
> #2 0x00007f14a0e0c700 in ?? ()
> #3 0x0000000000000000 in ?? ()
>
> Backtrace after:
> #0 clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
> #1 0x00007f377330485b in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:120
>
> Another option would be to create another global symbol e.g.
> `new_thread_from_clone` and have the unwind pretend to unwind there.
> ---
> I'm on a somewhat long-term quest here to improve the accuracy of unwind info
> in the basic libraries and the toolchain. Right now we're in a situation where
> e.g. invalid memory accesses by unwinders are often ignored, because there are
> sufficiently many places without unwind info, or with incorrect unwind info that
> it's more likely that than an unwinder bug. I'd like to get into a position I can
> consider an invalid memory access during unwinding a bug, but first, I need to
> cleanup some of the more common cases where one runs into invalid unwind info.
I tried to create a testcase [1] where unwind using backtrace() would trigger
this issue you pointed out, but even by calling clone direct libgcc unwind
(which for x86_64 backtrace is based) at least seems to get this right. It
could be case where this issue appear only with a more complex stackframe
mixing symbols from different modules and DSO (I am far from a CFI expert).
In any case I would like to have at least have a workable testcase to actually
check this fix and have a way to check if this faulty behaviour also triggers
in other architectures than x86_64. I am not really sure how would be a correct
way to check, maybe implementing a very simple backtrack clone using libgcc
as most of the architecture already does.
[1] https://paste.ubuntu.com/23630310/
>
> sysdeps/unix/sysv/linux/x86_64/clone.S | 30 ++++++++++++++++++++++++++----
> 1 file changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/sysdeps/unix/sysv/linux/x86_64/clone.S b/sysdeps/unix/sysv/linux/x86_64/clone.S
> index 66f4b11..1bd2eed 100644
> --- a/sysdeps/unix/sysv/linux/x86_64/clone.S
> +++ b/sysdeps/unix/sysv/linux/x86_64/clone.S
> @@ -72,16 +72,38 @@ ENTRY (__clone)
> mov 8(%rsp), %R10_LP
> movl $SYS_ify(clone),%eax
>
> - /* End FDE now, because in the child the unwind info will be
> - wrong. */
> - cfi_endproc;
> syscall
>
> + /* Best effort unwind info that works for both the parent and the child.
> + Ideally, we'd have cfi_undefiend(%rip) in the child and keep everything
> + the same in the parent, but we can't do that (since there's no way for
> + an expression to return undefined). Instead, we pretend that the child
> + came from thread_start. This isn't quite correct, but at least better than
> + whatever the debugger heuristics would have come up with in the absence
> + of unwind info */
> +
> + /* Encodes: parent %rip = %rax == 0 ? %rip + (2f - .) : *%rsp
> + DW_CFA_val_expression %rip DW_OP_breg16(2f-1b) DW_OP_lit0 DW_OP_breg0(0)
> + DW_OP_eq DW_OP_bra(0x0003) DW_OP_breg7(0) DW_OP_deref */
> +#define CFI_CHILD_RIP_IS_THREAD_START \
> + .cfi_escape 0x16, 0x10, 0xc, 0x80, 2f-., 0x30, 0x70, 0x0, 0x29,\
> + 0x28, 0x03, 0x0, 0x77, 0x0, 0x6;
> +
> + /* encode parent %rsp = %rsp + (%rax != 0 ? 8 : 0)
> + DW_CFA_val_expression %rsp DW_OP_breg7(0) DW_OP_lit0 DW_OP_breg0(0)
> + DW_OP_ne DW_OP_lit3 DW_OP_shl DW_OP_plus*/
> + .cfi_escape 0x16, 0x7, 0x9, 0x77, 0x0, 0x30, 0x70, 0x00, 0x2e, 0x33,
> + 0x24, 0x22;
> +
> + CFI_CHILD_RIP_IS_THREAD_START
> testq %rax,%rax
> + CFI_CHILD_RIP_IS_THREAD_START
> jl SYSCALL_ERROR_LABEL
> + CFI_CHILD_RIP_IS_THREAD_START
> jz L(thread_start)
>
> ret
> + cfi_endproc;
>
> L(thread_start):
> cfi_startproc;
> @@ -90,7 +112,7 @@ L(thread_start):
> /* Clear the frame pointer. The ABI suggests this be done, to mark
> the outermost frame obviously. */
> xorl %ebp, %ebp
> -
> +2:
> andq $CLONE_VM, %rdi
> jne 1f
> movl $SYS_ify(getpid), %eax
>