This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: as: How to determine the section of symbols


On Tue, Jul 16, 2019 at 10:43:27AM +0200, Peter Zijlstra wrote:
> On Mon, Jul 15, 2019 at 01:10:42PM -0700, H.J. Lu wrote:
> > On Mon, Jul 15, 2019 at 7:48 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > Hi all,
> > >
> > > I'm trying to 'optimize' the Linux Kernel x86 jump_label support. That
> > > is; jump_label or static_branch() is a Linux Kernel construct built upon
> > > GCC asm goto and provides for self-modifying code based branches.
> > > Regular execution will only see unconditional branches or nops.
> > >
> > > Currently, on x86, we patch between jmp.d32 or nop5 and this works well.
> > >
> > > The quest is to also allow usage of jmp.d8 (and the matching nop2) where
> > > the displacement allows for this.
> > >
> > > The below patch is the last of a series that implements this and
> > > contains all the relevant bits to this discussion, and is subtly broken.
> > >
> > > The problem is that the labels GCC hands to the asm goto () can be in
> > > different sections (namely .text and .text.unlikely), and the GAS manual
> > > sayeth:
> > >
> > >  [ https://sourceware.org/binutils/docs-2.27/as/Infix-Ops.html#Infix-Ops ]
> > >
> > >  - Subtraction. If the right argument is absolute, the result has the
> > >    section of the left argument. If both arguments are in the same
> > >    section, the result is absolute. You may not subtract arguments from
> > >    different sections.
> > >
> > > Funnily this does not result in a compile/assemble time error :-(, it
> > > seems to emit a NOP5 but then at runtime explodes because the actual
> > > displacement (after linking etc..) ends up fitting in a d8 and then the
> > > actual code and the expected code don't match up at code patching time
> > > and we BUG.
> > >
> > > If I were to be able to reliably detect this section mismatch I could
> > > encode it in the JUMP_TABLE_ENTRY (__jump_table section).
> > >
> > > Any clues on how I can (best) fix this; even if it involves writing a
> > > GAS patch that'd be fine, we can have this functionality depend on a
> > > binutils version.
> > >
> > 
> > .d8 is only a hint.  Is it possible to use the new ".nops SIZE" directive,
> > where SIZE can be an expression?
> 
> The problem appears to be constructing an expression that yields the
> exact same semantics as jmp. Given that GCC might provide us with a
> label into another section, we cannot (per the above as documentation)
> compute a displacement. Or even detect this case.

Also, 'funnily' when you add:

	".long disp"

to emit the calculated displacement, you do get an assembly error, but
the (indirect) usage in .skip doesn't trigger this.

This also inhibits emitting the actual jmp instruction the same way we
emit the nop case, which greatly complicates storing the size in our
jump entry table.

With this patchlet on top of the tree I pointed to earlier:

diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index 663ec7a1f19f..21e0b74d8d5f 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -37,6 +37,9 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran
 		".set is_byte, -res \n\t"
 		".set is_long, -(~res) \n\t"
 
+		".byte 0xe9\n\t"
+		".long disp - 3\n\t"
+
 #ifdef CONFIG_X86_64
 		".skip is_byte, 0x66 \n\t"
 		".skip is_byte, 0x90 \n\t"

(using x86_64-defconfig, gcc (Debian 8.3.0-6) 8.3.0)

$ make O=defconfig-build/ drivers/usb/host/xhci.o
...
/tmp/user/0/cc4sG7aW.s: Assembler messages:
/tmp/user/0/cc4sG7aW.s: Error: invalid operands (.text.unlikely and .text sections) for `-' when setting `disp'

But without that patchlet on top it builds just fine:


Relocation section '.rela__jump_table' at offset 0xec40 contains 63 entries:
Offset          Info           Type           Sym. Value    Sym. Name + Addend
...
000000000130  000200000002 R_X86_64_PC32     0000000000000000 .text + 6284
000000000134  001b00000002 R_X86_64_PC32     0000000000000000 .text.unlikely + b6f
000000000138  013900000018 R_X86_64_PC64     0000000000000000 __tracepoint_xhci_addr + 8
...


.text

0000000000005cb0 <xhci_setup_device>:
...
6284:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
6289:       e9 00 00 00 00          jmpq   628e <xhci_setup_device+0x5de>
		628a: R_X86_64_PC32     .text.unlikely+0xb8d


.text.unlikely

0000000000000b0f <xhci_setup_device.cold.98>:
...
 b6f:   4d 8b 65 08             mov    0x8(%r13),%r12
 b73:   65 8b 05 00 00 00 00    mov    %gs:0x0(%rip),%eax        # b7a <xhci_setup_device.cold.98+0x6b>
		 b76: R_X86_64_PC32      cpu_number-0x4


So even though:

	".set disp, %l[l_yes] - (1b + 2) \n\t"
	".set res, (disp >> 31) == (disp >> 7) \n\t"
	".set is_byte, -res \n\t"
	".set is_long, -(~res) \n\t"

is strictly undefined behaviour per the as documentation, we end up
with is_byte=0 and is_long=1 and emit:

	".skip is_long, 0x0f \n\t"
	".skip is_long, 0x1f \n\t"
	".skip is_long, 0x44 \n\t"
	".skip is_long, 0x00 \n\t"
	".skip is_long, 0x00 \n\t"

as observed in the objdump output, but we cannot, per the assembler
error earlier, use 'disp'.
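
For anyone following the arithmetic in the .set lines, here is a rough C
model of how res turns into is_byte and is_long. It relies on the
documented as behaviour that a true comparison evaluates to -1 and a false
one to 0. The helper names (gas_bool, res_for, is_byte, is_long as
functions) are hypothetical, and a plain range check stands in for the
shift comparison, because this sketch does not try to model how as
evaluates '>>':

```c
#include <assert.h>
#include <stdint.h>

/* as comparison operators yield -1 for true and 0 for false. */
static int64_t gas_bool(int cond) { return cond ? -1 : 0; }

/*
 * Hypothetical stand-in for the ".set res, ..." line: true when disp
 * would fit in a signed 8-bit field. The real expression compares two
 * shifted values; that part is NOT modelled here.
 */
static int64_t res_for(int64_t disp)
{
	return gas_bool(disp >= -128 && disp <= 127);
}

/* ".set is_byte, -res":    -(-1) == 1,  -(0) == 0 */
static int64_t is_byte(int64_t res) { return -res; }

/* ".set is_long, -(~res)": -(~-1) == 0, -(~0) == 1 */
static int64_t is_long(int64_t res) { return -(~res); }
```

Exactly one of the two selectors ends up as 1, which is what lets them be
used directly as .skip repeat counts.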

Colour me confused and frustrated.


Anyway, this seems to actually work:

	".nops (2*is_byte) + (5*is_long)\n\t"

and for the single case (above) I checked it emits the same nop5.
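
As a sanity check on the sizes only, the following C sketch models the
.skip sequence as "append count copies of the fill byte" and compares the
resulting length with the .nops size expression. The function names
(skip_fill, emit_nop, nops_size) are hypothetical, and it says nothing
about which byte pattern .nops itself picks:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of ".skip count, fill": append count copies of fill at pos. */
static size_t skip_fill(uint8_t *buf, size_t pos, int64_t count, uint8_t fill)
{
	for (int64_t i = 0; i < count; i++)
		buf[pos++] = fill;
	return pos;
}

/* Emit the nop the way the .skip lines do; returns the number of bytes. */
static size_t emit_nop(uint8_t *buf, int64_t is_byte, int64_t is_long)
{
	size_t pos = 0;

	pos = skip_fill(buf, pos, is_byte, 0x66);	/* nop2: 66 90 */
	pos = skip_fill(buf, pos, is_byte, 0x90);
	pos = skip_fill(buf, pos, is_long, 0x0f);	/* nop5: 0f 1f 44 00 00 */
	pos = skip_fill(buf, pos, is_long, 0x1f);
	pos = skip_fill(buf, pos, is_long, 0x44);
	pos = skip_fill(buf, pos, is_long, 0x00);
	pos = skip_fill(buf, pos, is_long, 0x00);
	return pos;
}

/* Size argument of ".nops (2*is_byte) + (5*is_long)". */
static int64_t nops_size(int64_t is_byte, int64_t is_long)
{
	return 2 * is_byte + 5 * is_long;
}
```

With exactly one selector set, both forms reserve the same number of
bytes: 2 for the byte case and 5 for the long case.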

