This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


RFC: Unwind info for PLT


Hi!

Sorry for the wide posting, but this is something that isn't limited
to one particular component.
Frank yesterday raised on IRC the problem that the PLT doesn't have unwind info.
So, when a debugger or systemtap stops in the PLT, or an async signal
arrives while the process is inside the PLT and the handler e.g. calls
backtrace or attempts to unwind through it, the unwind will stop at the PLT slot.
Example testcase:
cat > lib1.c <<\EOF
#define A(n) void n (void) {}
#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
#define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
E(f)
EOF
cat > prg1.c <<\EOF
#include <stddef.h>
#include <execinfo.h>
#include <signal.h>
#define A(n) extern void n (void);
#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
#define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
E(f)
static void
handler (int sig, siginfo_t *info, void *ctx)
{
  void *buf[256];
  int n = backtrace (buf, 256);
  backtrace_symbols_fd (buf, n, 2);
}
int
main ()
{
  struct sigaction s;
  sigemptyset (&s.sa_mask);
  s.sa_sigaction = handler;
  s.sa_flags = SA_SIGINFO | SA_RESETHAND | SA_ONSTACK;
  sigaction (SIGUSR1, &s, NULL);
#undef A
#define A(n) n ();
E(f)
  return 0;
}
EOF
gcc -shared -fpic -O2 -o lib1.so lib1.c -g
gcc -O2 -g -o prg1 prg1.c ./lib1.so

gdb apparently has some code to fall back to default unwind info if no FDE is found,
as b 'f0001@plt'; c; bt; works and stepi; bt; works too, but the next stepi; bt; already gives
#0  0x0000000000494f73 in f0001@plt ()
#1  0x0000000000000be7 in ?? ()
#2  0x00000000004b03d8 in main () at prg1.c:29
instead of
#0  0x0000000000494f73 in f0001@plt ()
#1  0x00000000004b03d8 in main () at prg1.c:29
But when doing b 'f0002@plt'; c; then kill -USR1 `pidof prg1` from another terminal
and then continuing, the backtrace shown stops in the plt and doesn't go back.

I wonder whether ld couldn't synthesize unwind info for the .plt section
(perhaps with some option), or alternatively whether it couldn't just provide
hidden __PLT_START__/__PLT_END__ or similar symbols, so that the unwind info
could be written in the glibc crtfiles linked into the binary.

For x86_64 which has .plt like:
Disassembly of section .plt:

00000035e7a1ec50 <calloc@plt-0x10>:
  35e7a1ec50:   ff 35 9a 33 37 00       pushq  0x37339a(%rip)        # 35e7d91ff0 <_GLOBAL_OFFSET_TABLE_+0x8>
  35e7a1ec56:   ff 25 9c 33 37 00       jmpq   *0x37339c(%rip)        # 35e7d91ff8 <_GLOBAL_OFFSET_TABLE_+0x10>
  35e7a1ec5c:   0f 1f 40 00             nopl   0x0(%rax)

00000035e7a1ec60 <calloc@plt>:
  35e7a1ec60:   ff 25 9a 33 37 00       jmpq   *0x37339a(%rip)        # 35e7d92000 <_GLOBAL_OFFSET_TABLE_+0x18>
  35e7a1ec66:   68 00 00 00 00          pushq  $0x0
  35e7a1ec6b:   e9 e0 ff ff ff          jmpq   35e7a1ec50 <data.9331+0x35e7a1ebc8>

00000035e7a1ec70 <realloc@plt>:
  35e7a1ec70:   ff 25 92 33 37 00       jmpq   *0x373392(%rip)        # 35e7d92008 <_GLOBAL_OFFSET_TABLE_+0x20>
  35e7a1ec76:   68 01 00 00 00          pushq  $0x1
  35e7a1ec7b:   e9 d0 ff ff ff          jmpq   35e7a1ec50 <data.9331+0x35e7a1ebc8>
...
I think the unwind info could look roughly like the following (untested)
if we raised .plt alignment to 16 bytes instead of 4,
i.e. have explicit unwind info for the first 16 bytes
and then for the rest compute the CFA with the expression
%rsp + 8 + ((%rip & 15) >= 11) * 8

00000000 00000014 00000000 CIE
  Version:               1
  Augmentation:          "zR"
  Code alignment factor: 1
  Data alignment factor: -8
  Return address column: 16
  Augmentation data:     1b

  DW_CFA_def_cfa: r7 (rsp) ofs 8
  DW_CFA_offset: r16 (rip) at cfa-8
  DW_CFA_nop
  DW_CFA_nop

00000018 00000024 0000001c FDE cie=00000000 pc=__PLT_START__..__PLT_END__
  DW_CFA_def_cfa_offset: 16
  DW_CFA_advance_loc: 6 to __PLT_START__+6
  DW_CFA_def_cfa_offset: 24
  DW_CFA_advance_loc: 10 to __PLT_START__+16
  DW_CFA_def_cfa_expression: (DW_OP_breg7 (rsp): 8; DW_OP_breg16 (rip): 0; DW_OP_lit15; DW_OP_and; DW_OP_lit11; DW_OP_ge; DW_OP_lit3; DW_OP_shl; DW_OP_plus;)
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop
  DW_CFA_nop
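
To make the CFA expression above concrete, here is a minimal sketch in C
that evaluates it the way a DWARF stack machine would.  The enum, struct
and function names are made up for this illustration and are not taken
from any unwinder; only the sequence of operations comes from the FDE
above, and like that FDE it is meant for PLT slots after the first
16 bytes of .plt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Operations used by the CFA expression.  These enumerators are local
   to this sketch; they are not the real DWARF opcode encodings.  */
enum cfa_op { OP_BREG_RSP, OP_BREG_RIP, OP_LIT, OP_AND, OP_GE, OP_SHL, OP_PLUS };

struct cfa_insn
{
  enum cfa_op op;
  int64_t arg;          /* Offset for OP_BREG_*, value for OP_LIT.  */
};

/* DW_OP_breg7 8; DW_OP_breg16 0; DW_OP_lit15; DW_OP_and; DW_OP_lit11;
   DW_OP_ge; DW_OP_lit3; DW_OP_shl; DW_OP_plus.  */
static const struct cfa_insn plt_cfa_prog[] = {
  { OP_BREG_RSP, 8 }, { OP_BREG_RIP, 0 }, { OP_LIT, 15 }, { OP_AND, 0 },
  { OP_LIT, 11 }, { OP_GE, 0 }, { OP_LIT, 3 }, { OP_SHL, 0 }, { OP_PLUS, 0 }
};

/* Run PROG on a small value stack and return the single result.  */
uint64_t
eval_cfa (const struct cfa_insn *prog, size_t n, uint64_t rsp, uint64_t rip)
{
  uint64_t stack[8];
  size_t depth = 0;

  for (size_t i = 0; i < n; i++)
    {
      uint64_t v;
      if (prog[i].op == OP_BREG_RSP)
        v = rsp + (uint64_t) prog[i].arg;
      else if (prog[i].op == OP_BREG_RIP)
        v = rip + (uint64_t) prog[i].arg;
      else if (prog[i].op == OP_LIT)
        v = (uint64_t) prog[i].arg;
      else
        {
          /* Binary operators pop the top two entries: A is the former
             second entry, B the former top.  */
          assert (depth >= 2);
          uint64_t b = stack[--depth];
          uint64_t a = stack[--depth];
          switch (prog[i].op)
            {
            case OP_AND:  v = a & b; break;
            case OP_GE:   v = (int64_t) a >= (int64_t) b; break;
            case OP_SHL:  v = a << b; break;
            default:      v = a + b; break;     /* OP_PLUS */
            }
        }
      assert (depth < 8);
      stack[depth++] = v;
    }
  assert (depth == 1);
  return stack[0];
}

/* CFA for a PC inside a 16-byte-aligned PLT slot.  */
uint64_t
plt_cfa_aligned (uint64_t rsp, uint64_t rip)
{
  return eval_cfa (plt_cfa_prog, sizeof plt_cfa_prog / sizeof plt_cfa_prog[0],
                   rsp, rip);
}
```

With the calloc@plt addresses from the disassembly, a PC of 0x35e7a1ec60
or 0x35e7a1ec66 (before the pushq has executed) gives %rsp + 8, and
0x35e7a1ec6b (after the pushq) gives %rsp + 16.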

If .plt is kept just 4-byte aligned it would be slightly harder.
If we could rely on %rip being just xxx@plt, xxx@plt+6 or xxx@plt+11,
we could just dereference the byte at %rip and decide based on it,
something like
%rsp + 8 + (*(char*)%rip == 0xe9) * 8

  DW_CFA_def_cfa_expression: (DW_OP_breg7 (rsp): 8; DW_OP_breg16 (rip): 0; DW_OP_deref_size: 1; DW_OP_const1u: 233; DW_OP_eq; DW_OP_lit3; DW_OP_shl; DW_OP_plus;)

but can we rely on it (i.e. won't something give us %rip-1 instead)?
I think at least for the libgcc unwinder it shouldn't, because the
only way to see a PLT slot is through an async signal handler that stops
inside of the PLT, and the sigreturn pad should be using the S augmentation
(signal frame), so the unwinder uses the interrupted PC as is.
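
As a rough illustration of the opcode-based variant, the sketch below
applies the same test to the bytes of the calloc@plt slot from the
disassembly above.  The names plt_slot and plt_cfa_by_opcode are
hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Bytes of the calloc@plt slot, copied from the disassembly.  */
const unsigned char plt_slot[16] = {
  0xff, 0x25, 0x9a, 0x33, 0x37, 0x00,   /* +0:  jmpq *0x37339a(%rip) */
  0x68, 0x00, 0x00, 0x00, 0x00,         /* +6:  pushq $0x0 */
  0xe9, 0xe0, 0xff, 0xff, 0xff          /* +11: jmpq to start of .plt */
};

/* %rsp + 8 + (*(unsigned char *) %rip == 0xe9) * 8, i.e. one extra
   word once the PC has reached the final jmpq of the slot.  */
uint64_t
plt_cfa_by_opcode (uint64_t rsp, const unsigned char *pc)
{
  return rsp + 8 + ((uint64_t) (*pc == 0xe9) << 3);
}
```

For pc at plt_slot + 0 and plt_slot + 6 this yields %rsp + 8, and for
plt_slot + 11 it yields %rsp + 16.  If an unwinder handed in %rip-1 for
the last case (plt_slot + 10, a 0x00 byte), the result would be
%rsp + 8, one word short, which is exactly the concern raised above.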

Anyway, I think hardcoding this in the linker would be problematic, as
we couldn't tweak it, so providing some special hidden symbols around
the .plt section and letting glibc crtfiles provide the unwind info sounds
like the best option to me.
What do you think?  For other architectures the unwind info would of course
need to be different.  i?86 32-bit would look quite similar if we use the
insn decoding version, as there the only jump after one push (disregarding
the first 16 bytes of the .plt section, which would be explicit) also has the
0xe9 opcode; just the register numbers would be different, with offset 4
instead of 8 and shl by 2 instead of 3.
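
For comparison, a sketch of what the 32-bit computation described above
would amount to in C.  The function name is hypothetical; the 4-byte
word size and the shift by 2 are taken from the paragraph above, and in
the DWARF expression itself the registers would be the i386 numbers for
%esp and %eip (4 and 8, as far as I know) rather than 7 and 16.

```c
#include <stdint.h>

/* %esp + 4 + (insn_byte == 0xe9) * 4: one return address word, plus
   one more word once the push in the PLT slot has executed.  */
uint32_t
plt_cfa_i386 (uint32_t esp, unsigned char insn_byte)
{
  return esp + 4 + ((uint32_t) (insn_byte == 0xe9) << 2);
}
```

Here insn_byte is the byte at %eip: 0xff for the indirect jmp at the
start of a slot, 0x68 for the push, and 0xe9 for the final jmp.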

BTW, the unwind info for _dl_runtime_resolve on x86_64 is wrong on
the first instruction as well:
(gdb) bt
#0  0x00000035e7613840 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#1  0x00000035e78202a8 in _r_debug ()
#2  0x0000000000000be7 in ?? ()
#3  0x00000000004b03d8 in main () at prg1.c:29
(gdb) stepi
0x00000035e7613844 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
(gdb) bt
#0  0x00000035e7613844 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#1  0x00000000004b03d8 in main () at prg1.c:29

Untested fix below:

2011-06-10  Jakub Jelinek  <jakub@redhat.com>

	* sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve): Fix CFA
	offset on the first insn.

--- libc/sysdeps/x86_64/dl-trampoline.S.jj	2009-08-29 03:34:52.000000000 +0200
+++ libc/sysdeps/x86_64/dl-trampoline.S	2011-06-10 09:34:16.000000000 +0200
@@ -1,5 +1,5 @@
 /* PLT trampolines.  x86-64 version.
-   Copyright (C) 2004, 2005, 2007, 2009 Free Software Foundation, Inc.
+   Copyright (C) 2004, 2005, 2007, 2009, 2011 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -27,8 +27,9 @@
 	.align 16
 	cfi_startproc
 _dl_runtime_resolve:
+	cfi_adjust_cfa_offset(16) # Incorporate PLT
 	subq $56,%rsp
-	cfi_adjust_cfa_offset(72) # Incorporate PLT
+	cfi_adjust_cfa_offset(56)
 	movq %rax,(%rsp)	# Preserve registers otherwise clobbered.
 	movq %rcx, 8(%rsp)
 	movq %rdx, 16(%rsp)

	Jakub

