[PATCH][x86_64] Convert indirect call via GOT to direct when possible
Sriraman Tallam
tmsriram@google.com
Wed Jun 8 22:22:00 GMT 2016
On Mon, Jun 6, 2016 at 1:50 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Tue, May 31, 2016 at 11:02 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>> On Sat, May 28, 2016 at 10:44 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Fri, May 27, 2016 at 3:14 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>> On Fri, May 20, 2016 at 1:32 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Fri, May 20, 2016 at 1:27 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> GCC has the option -fno-plt, which converts all extern calls to indirect
>>>>>> calls via the GOT to prevent the linker from generating any PLT stubs.
>>>>>> However, if the function ends up defined in the executable, this patch
>>>>>> will convert those indirect calls/jumps to direct ones. Since the indirect
>>>>>> calls are one byte longer, an extra nop is needed at the beginning.
>>>>>>
>>>>>> Here is a simple example:
>>>>>>
>>>>>> main.c
>>>>>> ---------
>>>>>> extern int foo();
>>>>>> int main() {
>>>>>> return foo();
>>>>>> }
>>>>>>
>>>>>> deffoo.c
>>>>>> -----------
>>>>>> int foo() {
>>>>>> return 0;
>>>>>> }
>>>>>>
>>>>>> $ gcc -fno-plt main.c deffoo.c
>>>>>> $ objdump -d a.out
>>>>>>
>>>>>> 0000000000400626 <main>:
>>>>>> ...
>>>>>> 40062a: ff 15 28 14 00 00 callq *0x1428(%rip) # 401a58 <_DYNAMIC+0x1d8>
>>>>>>
>>>>>> The call is indirect even though foo is defined in the executable.
>>>>>>
>>>>>> With this patch,
>>>>>> 0000000000400606 <main>:
>>>>>> ....
>>>>>> 40060a: 90 nop
>>>>>> 40060b: e8 03 00 00 00 callq 400613 <foo>
>>>>>>
>>>>>> The call is now direct with an extra nop.
>>>>>>
>>>>>>
>>>>>
>>>>> Please try ld, which uses 0x67 prefix (addr32) instead of nop.
>>>>> Also for
>>>>>
>>>>> jmp *foo@GOTPCREL(%rip)
>>>>>
>>>>> ld converts it to
>>>>>
>>>>> jmp foo
>>>>> nop
>>>>
>>>> I have modified the patch to keep it consistent with what ld produces.
>>>>
>>>> Please take another look.
>>>>
>>>> * x86_64.cc (can_convert_callq_to_direct): New function.
>>>> Target_x86_64<size>::Scan::global: Check if an indirect call via
>>>> GOT can be converted to direct.
>>>> Target_x86_64<size>::Relocate::relocate: Change any indirect call
>>>> via GOT that can be converted.
>>>> * testsuite/Makefile.am (x86_64_indirect_call_to_direct.sh): New test.
>>>> * testsuite/Makefile.in: Regenerate.
>>>> * testsuite/x86_64_indirect_call_to_direct1.s: New file.
>>>> * testsuite/x86_64_indirect_jump_to_direct1.s: New file.
>>>>
>>>
>>> Do you need to check R_X86_64_REX_GOTPCRELX for branch?
>>
>> OK, I changed the patch to not check for this and refactored it a bit.
>
> Ping. Is this patch OK now?
Ping. H.J. / Cary, is this good to go now?
* x86_64.cc (can_convert_callq_to_direct): New function.
Target_x86_64<size>::Scan::global: Check if an indirect call via
GOT can be converted to direct.
Target_x86_64<size>::Relocate::relocate: Change any indirect call
via GOT that can be converted.
* testsuite/Makefile.am (x86_64_indirect_call_to_direct.sh): New test.
* testsuite/Makefile.in: Regenerate.
* testsuite/x86_64_indirect_call_to_direct1.s: New file.
* testsuite/x86_64_indirect_jump_to_direct1.s: New file.
Thanks
Sri
>
>
> * x86_64.cc (can_convert_callq_to_direct): New function.
> Target_x86_64<size>::Scan::global: Check if an indirect call via
> GOT can be converted to direct.
> Target_x86_64<size>::Relocate::relocate: Change any indirect call
> via GOT that can be converted.
> * testsuite/Makefile.am (x86_64_indirect_call_to_direct.sh): New test.
> * testsuite/Makefile.in: Regenerate.
> * testsuite/x86_64_indirect_call_to_direct1.s: New file.
> * testsuite/x86_64_indirect_jump_to_direct1.s: New file.
>
> Patch attached.
>
> Thanks
> Sri
>
>>
>> Thanks
>> Sri
>>
>>>
>>> --
>>> H.J.
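
The byte-level rewrite discussed in this thread can be sketched in isolation. The following is only a minimal illustration, not the code from the patch: relax_got_branch, insn, insn_addr and target are hypothetical names, and the function works on a bare 6-byte buffer instead of gold's relocation machinery. It assumes the buffer starts at the 0xff opcode byte of `call *foo@GOTPCREL(%rip)` (ff 15) or `jmp *foo@GOTPCREL(%rip)` (ff 25) and that the caller has already decided the symbol is eligible for conversion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only; it is not part of gold.
// insn points at the 0xff opcode byte of a 6-byte `call/jmp *disp32(%rip)`,
// insn_addr is the address of that byte, and target is the address of the
// function. Returns true if the six bytes were rewritten in place.
bool relax_got_branch(unsigned char* insn, uint64_t insn_addr, uint64_t target)
{
  assert(insn != nullptr);

  // Only `ff 15` (indirect call) and `ff 25` (indirect jmp) are handled.
  if (insn[0] != 0xff || (insn[1] != 0x15 && insn[1] != 0x25))
    return false;
  const bool is_call = (insn[1] == 0x15);

  // Offset of the rel32 field inside the 6 bytes after the rewrite:
  //   call: 67 e8 <rel32>     -> rel32 starts at byte 2
  //   jmp:  e9 <rel32> 90     -> rel32 starts at byte 1
  const unsigned rel_off = is_call ? 2 : 1;

  // A direct branch is relative to the end of the branch instruction itself,
  // which for the jmp form is one byte before the trailing nop ends.
  const uint64_t next = insn_addr + rel_off + 4;
  const int64_t disp = static_cast<int64_t>(target - next);
  if (disp < INT32_MIN || disp > INT32_MAX)
    return false;  // target is out of reach of a 32-bit displacement

  if (is_call)
    {
      insn[0] = 0x67;  // addr32 prefix, used as one byte of padding
      insn[1] = 0xe8;  // direct call
    }
  else
    {
      insn[0] = 0xe9;  // direct jmp
      insn[5] = 0x90;  // nop fills the byte left over at the end
    }

  // Store the displacement in little-endian byte order.
  const uint32_t rel = static_cast<uint32_t>(disp);
  for (unsigned i = 0; i < 4; ++i)
    insn[rel_off + i] = static_cast<unsigned char>((rel >> (8 * i)) & 0xff);
  return true;
}
```

With a call at 0x40060a and foo at 0x400613, the sketch yields 67 e8 03 00 00 00, because the direct call ends at 0x400610. For a jmp at the same address it yields e9 04 00 00 00 90, because the jmp itself ends one byte earlier at 0x40060f. This one-byte difference is why the patch applies the relocation at &view[-1] with address - 1 in the jmp case.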
-------------- next part --------------
* x86_64.cc (can_convert_callq_to_direct): New function.
Target_x86_64<size>::Scan::global: Check if an indirect call via
GOT can be converted to direct.
Target_x86_64<size>::Relocate::relocate: Change any indirect call
via GOT that can be converted.
* testsuite/Makefile.am (x86_64_indirect_call_to_direct.sh): New test.
* testsuite/Makefile.in: Regenerate.
* testsuite/x86_64_indirect_call_to_direct1.s: New file.
* testsuite/x86_64_indirect_jump_to_direct1.s: New file.
diff --git a/gold/testsuite/Makefile.am b/gold/testsuite/Makefile.am
index 01cae9f..f5cc0db 100644
--- a/gold/testsuite/Makefile.am
+++ b/gold/testsuite/Makefile.am
@@ -1096,6 +1096,25 @@ x86_64_mov_to_lea13.stdout: x86_64_mov_to_lea13
x86_64_mov_to_lea14.stdout: x86_64_mov_to_lea14
$(TEST_OBJDUMP) -dw $< > $@
+check_SCRIPTS += x86_64_indirect_call_to_direct.sh
+check_DATA += x86_64_indirect_call_to_direct1.stdout \
+ x86_64_indirect_jump_to_direct1.stdout
+MOSTLYCLEANFILES += x86_64_indirect_call_to_direct1 \
+ x86_64_indirect_jump_to_direct1
+
+x86_64_indirect_call_to_direct1.o: x86_64_indirect_call_to_direct1.s
+ $(TEST_AS) --64 -mrelax-relocations=yes -o $@ $<
+x86_64_indirect_call_to_direct1: x86_64_indirect_call_to_direct1.o gcctestdir/ld
+ gcctestdir/ld -o $@ $<
+x86_64_indirect_call_to_direct1.stdout: x86_64_indirect_call_to_direct1
+ $(TEST_OBJDUMP) -dw $< > $@
+x86_64_indirect_jump_to_direct1.o: x86_64_indirect_jump_to_direct1.s
+ $(TEST_AS) --64 -mrelax-relocations=yes -o $@ $<
+x86_64_indirect_jump_to_direct1: x86_64_indirect_jump_to_direct1.o gcctestdir/ld
+ gcctestdir/ld -o $@ $<
+x86_64_indirect_jump_to_direct1.stdout: x86_64_indirect_jump_to_direct1
+ $(TEST_OBJDUMP) -dw $< > $@
+
check_SCRIPTS += x86_64_overflow_pc32.sh
check_DATA += x86_64_overflow_pc32.err
MOSTLYCLEANFILES += x86_64_overflow_pc32.err
diff --git a/gold/testsuite/x86_64_indirect_call_to_direct.sh b/gold/testsuite/x86_64_indirect_call_to_direct.sh
index e69de29..d54d024 100755
--- a/gold/testsuite/x86_64_indirect_call_to_direct.sh
+++ b/gold/testsuite/x86_64_indirect_call_to_direct.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+
+# x86_64_indirect_call_to_direct.sh -- a test for indirect call(jump) to direct
+# conversion.
+
+# Copyright (C) 2016 onwards Free Software Foundation, Inc.
+# Written by Sriraman Tallam <tmsriram@google.com>
+
+# This file is part of gold.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston,
+# MA 02110-1301, USA.
+
+set -e
+
+grep -q "callq[ ]\+[a-f0-9]\+ <foo>" x86_64_indirect_call_to_direct1.stdout
+grep -q "jmpq[ ]\+[a-f0-9]\+ <foo>" x86_64_indirect_jump_to_direct1.stdout
diff --git a/gold/testsuite/x86_64_indirect_call_to_direct1.s b/gold/testsuite/x86_64_indirect_call_to_direct1.s
index e69de29..5ca2e38 100644
--- a/gold/testsuite/x86_64_indirect_call_to_direct1.s
+++ b/gold/testsuite/x86_64_indirect_call_to_direct1.s
@@ -0,0 +1,12 @@
+ .text
+ .globl foo
+ .type foo, @function
+foo:
+ ret
+ .size foo, .-foo
+ .globl main
+ .type main, @function
+main:
+ call *foo@GOTPCREL(%rip)
+ ret
+ .size main, .-main
diff --git a/gold/testsuite/x86_64_indirect_jump_to_direct1.s b/gold/testsuite/x86_64_indirect_jump_to_direct1.s
index e69de29..b817e34 100644
--- a/gold/testsuite/x86_64_indirect_jump_to_direct1.s
+++ b/gold/testsuite/x86_64_indirect_jump_to_direct1.s
@@ -0,0 +1,11 @@
+ .text
+ .globl foo
+ .type foo, @function
+foo:
+ ret
+ .size foo, .-foo
+ .globl main
+ .type main, @function
+main:
+ jmp *foo@GOTPCREL(%rip)
+ .size main, .-main
diff --git a/gold/x86_64.cc b/gold/x86_64.cc
index 81126ef..d774d5b 100644
--- a/gold/x86_64.cc
+++ b/gold/x86_64.cc
@@ -891,6 +891,22 @@ class Target_x86_64 : public Sized_target<size, false>
&& strcmp(gsym->name(), "_DYNAMIC") != 0);
}
+ // Convert
+ // callq *foo@GOTPCRELX(%rip) to
+ // addr32 callq foo
+ // and jmpq *foo@GOTPCRELX(%rip) to
+ // jmpq foo
+ // nop
+ static bool
+ can_convert_callq_to_direct(const Symbol* gsym)
+ {
+ gold_assert(gsym != NULL);
+ return (gsym->type() == elfcpp::STT_FUNC
+ && !gsym->is_undefined()
+ && !gsym->is_from_dynobj()
+ && !gsym->is_preemptible());
+ }
+
// Adjust TLS relocation type based on the options and whether this
// is a local symbol.
static tls::Tls_optimization
@@ -2931,17 +2947,34 @@ Target_x86_64<size>::Scan::global(Symbol_table* symtab,
// If we convert this from
// mov foo@GOTPCREL(%rip), %reg
// to lea foo(%rip), %reg.
+ // OR
+ // if we convert
+ // (callq|jmpq) *foo@GOTPCRELX(%rip) to
+ // (callq|jmpq) foo
// in Relocate::relocate, then there is nothing to do here.
- if ((r_type == elfcpp::R_X86_64_GOTPCREL
- || r_type == elfcpp::R_X86_64_GOTPCRELX
- || r_type == elfcpp::R_X86_64_REX_GOTPCRELX)
- && reloc.get_r_offset() >= 2
- && Target_x86_64<size>::can_convert_mov_to_lea(gsym))
+ bool do_convert_mov_to_lea
+ = ((r_type == elfcpp::R_X86_64_GOTPCREL
+ || r_type == elfcpp::R_X86_64_GOTPCRELX
+ || r_type == elfcpp::R_X86_64_REX_GOTPCRELX)
+ && reloc.get_r_offset() >= 2
+ && Target_x86_64<size>::can_convert_mov_to_lea(gsym));
+ bool do_convert_callq_to_direct
+ = (r_type == elfcpp::R_X86_64_GOTPCRELX
+ && reloc.get_r_offset() >= 2
+ && Target_x86_64<size>::can_convert_callq_to_direct(gsym));
+ if (do_convert_mov_to_lea || do_convert_callq_to_direct)
{
section_size_type stype;
const unsigned char* view = object->section_contents(data_shndx,
&stype, true);
- if (view[reloc.get_r_offset() - 2] == 0x8b)
+ if (do_convert_mov_to_lea
+ && view[reloc.get_r_offset() - 2] == 0x8b)
+ break;
+
+ if (do_convert_callq_to_direct
+ && view[reloc.get_r_offset() - 2] == 0xff
+ && (view[reloc.get_r_offset() - 1] == 0x15
+ || view[reloc.get_r_offset() - 1] == 0x25))
break;
}
@@ -3634,6 +3667,45 @@ Target_x86_64<size>::Relocate::relocate(
view[-2] = 0x8d;
Reloc_funcs::pcrela32(view, object, psymval, addend, address);
}
+ // Convert
+ // callq *foo@GOTPCRELX(%rip) to
+ // addr32 callq foo
+ // and jmpq *foo@GOTPCRELX(%rip) to
+ // jmpq foo
+ // nop
+ else if (r_type == elfcpp::R_X86_64_GOTPCRELX
+ && rela.get_r_offset() >= 2
+ && view[-2] == 0xff
+ && (view[-1] == 0x15 || view[-1] == 0x25)
+ && (gsym != NULL
+ && Target_x86_64<size>::can_convert_callq_to_direct(gsym)))
+ {
+ if (view[-1] == 0x15)
+ {
+ // Convert callq *foo@GOTPCRELX(%rip) to addr32 callq.
+ // The addr32 prefix is 0x67 and the opcode of direct callq is 0xe8.
+ view[-2] = 0x67;
+ view[-1] = 0xe8;
+ // Convert GOTPCRELX to 32-bit pc relative reloc.
+ Reloc_funcs::pcrela32(view, object, psymval, addend, address);
+ }
+ else
+ {
+ // Convert jmpq *foo@GOTPCRELX(%rip) to
+ // jmpq foo
+ // nop
+ // The opcode of direct jmpq is 0xe9.
+ view[-2] = 0xe9;
+ // The opcode of nop is 0x90.
+ view[3] = 0x90;
+ // Convert GOTPCRELX to 32-bit pc relative reloc. jmpq is rip
+ // relative and since the instruction following the jmpq is now
+ // the nop, offset the address by 1 byte. The start of the
+ // relocation also moves ahead by 1 byte.
+ Reloc_funcs::pcrela32(&view[-1], object, psymval, addend,
+ address - 1);
+ }
+ }
else
{
if (gsym != NULL)