This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][x86_64] Convert indirect call via GOT to direct when possible


On Fri, Jun 10, 2016 at 1:06 PM, Cary Coutant <ccoutant@gmail.com> wrote:
>>>> * x86_64.cc (can_convert_callq_to_direct): New function.
>>>> Target_x86_64<size>::Scan::global: Check if an indirect call via
>>>> GOT can be converted to direct.
>>>> Target_x86_64<size>::Relocate::relocate: Change any indirect call
>>>> via GOT that can be converted.
>>>> * testsuite/Makefile.am (x86_64_indirect_call_to_direct.sh): New test.
>>>> * testsuite/Makefile.in: Regenerate.
>>>> * testsuite/x86_64_indirect_call_to_direct1.s: New file.
>>>> * testsuite/x86_64_indirect_jump_to_direct1.s: New file.
>>>>
>>>
>>> Do you need to check R_X86_64_REX_GOTPCRELX for branch?
>>
>> Ok, patch changed to not check for this and refactored a bit.
>
> Thanks for the refactoring -- I was about to suggest that. There's
> still too much logic outside the can_convert_* functions that is
> duplicated in Scan::global() and in Relocate::relocate(), all because
> we don't want to fetch the section contents without doing a few
> preliminary checks. I'd like to move most of that logic into the
> can_convert_* functions, so let's start with a Lazy_view class, which
> will fetch the section contents only when needed:
>
> template<int size>
> class Lazy_view
> {
>  public:
>   Lazy_view(Sized_relobj_file<size, false>* object, unsigned int data_shndx)
>     : object_(object), data_shndx_(data_shndx), view_(NULL), view_size_(0)
>   { }
>
>   inline unsigned char
>   operator[](size_t offset)
>   {
>     if (this->view_ == NULL)
>       this->view_ = this->object_->section_contents(this->data_shndx_,
>                                                     &this->view_size_,
>                                                     true);
>     if (offset >= this->view_size_)
>       return 0;
>     return this->view_[offset];
>   }
>
>  private:
>   Sized_relobj_file<size, false>* object_;
>   unsigned int data_shndx_;
>   const unsigned char* view_;
>   section_size_type view_size_;
> };
>
> Now we can move (almost) all of that external logic into the
> can_convert_* routines, leaving us with this in Scan::global():
>
>         Lazy_view<size> view(object, data_shndx);
>         size_t r_offset = reloc.get_r_offset();
>         if (r_offset >= 2
>             && Target_x86_64<size>::can_convert_mov_to_lea(gsym, r_type,
>                                                            r_offset, &view))
>           break;
>         if (r_offset >= 2
>             && Target_x86_64<size>::can_convert_callq_to_direct(gsym, r_type,
>
> r_offset, &view))
>           break;
>
> ... and this in Relocate::relocate():
>
>         if ((gsym == NULL
>              && rela.get_r_offset() >= 2
>              && view[-2] == 0x8b
>              && !psymval->is_ifunc_symbol())
>             || (gsym != NULL
>                 && rela.get_r_offset() >= 2
>                 && Target_x86_64<size>::can_convert_mov_to_lea(gsym, r_type,
>                                                                0, &view)))
>           {
>             ...
>           }
>         else if (gsym != NULL
>                  && rela.get_r_offset() >= 2
>                  && Target_x86_64<size>::can_convert_callq_to_direct(gsym,
>                                                                      r_type,
>                                                                      0, &view))
>           {
>             if (view[-1] == 0x15)
>               {
>                 ...
>               }
>             else
>               {
>                 ...
>               }
>           }
>
> Folding all the external logic into the can_convert_* functions, and
> refactoring a bit gives this:
>
>   template<class View_type>
>   static inline bool
>   can_convert_mov_to_lea(const Symbol* gsym, unsigned int r_type,
>                          size_t r_offset, View_type* view)
>   {
>     gold_assert(gsym != NULL);
>     // We cannot do the conversion unless it's one of these relocations.
>     if (r_type != elfcpp::R_X86_64_GOTPCREL
>         && r_type != elfcpp::R_X86_64_GOTPCRELX
>         && r_type != elfcpp::R_X86_64_REX_GOTPCRELX)
>       return false;
>     // We cannot convert references to IFUNC symbols, or to symbols that
>     // are not local to the current module.
>     if (gsym->type() == elfcpp::STT_GNU_IFUNC
>         || gsym->is_undefined ()
>         || gsym->is_from_dynobj()
>         || gsym->is_preemptible())
>       return false;
>     if (parameters->options().shared()
>         && (gsym->visibility() == elfcpp::STV_DEFAULT
>             || gsym->visibility() == elfcpp::STV_PROTECTED)
>         && !parameters->options().Bsymbolic())
>       return false;
>     // We cannot convert references to the _DYNAMIC symbol.
>     if (strcmp(gsym->name(), "_DYNAMIC") == 0)
>       return false;
>     // Check for a MOV opcode.
>     return (*view)[r_offset - 2] == 0x8b;
>   }
>
>   template<class View_type>
>   static inline bool
>   can_convert_callq_to_direct(const Symbol* gsym, unsigned int r_type,
>                               size_t r_offset, View_type* view)
>   {
>     gold_assert(gsym != NULL);
>     // We cannot do the conversion unless it's a GOTPCRELX relocation.
>     if (r_type != elfcpp::R_X86_64_GOTPCRELX)
>       return false;
>     // We cannot convert references to IFUNC symbols, or to symbols that
>     // are not local to the current module.
>     if (gsym->type() == elfcpp::STT_GNU_IFUNC
>         || gsym->is_undefined ()
>         || gsym->is_from_dynobj()
>         || gsym->is_preemptible())
>       return false;
>     // Check for a CALLQ or JMPQ opcode.
>     return ((*view)[r_offset - 2] == 0xff
>             && ((*view)[r_offset - 1] == 0x15
>                 || (*view)[r_offset - 1] == 0x25));
>   }
>
> A couple of observations, though:
>
> 1. Sri, in your patch, you just test for sym type == STT_FUNC. Isn't
> it sufficient to test for sym type != STT_GNU_IFUNC (as in the
> convert-to-lea case)? I don't think it really matters -- if we see a
> jump to an STT_OBJECT or STT_NOTYPE symbol, why isn't the
> transformation just as valid?
>
> 2. HJ, given an R_X86_64_GOTPCRELX relocation, is it still necessary
> to check the opcode during Scan::global()? Doesn't the relocation
> guarantee that it's an appropriate instruction for the transformation?
> I think in both cases, we could skip fetching the section contents if
> we have this relocation.
>
> 3. This piece of can_convert_mov_to_lea has me a bit puzzled:
>
>     if (parameters->options().shared()
>         && (gsym->visibility() == elfcpp::STV_DEFAULT
>             || gsym->visibility() == elfcpp::STV_PROTECTED)
>         && !parameters->options().Bsymbolic())
>       return false;
>
> We've already tested for is_preemptible, which should take care of all
> of these cases. If I remove this code, however, I get a couple of
> failures that I need to investigate. At the very least, I suspect this
> part of the logic can be simplified. I want to investigate this
> further before committing the whole thing.

This case is needed if the shared object specifies "protected"
visibility using an asm directive instead of the attribute.  In this
case, the compiler prefers to use the GOT.  Here is the relevant code
from the test that fails if this condition is removed:

// The function f2 is used to test that the executable can see the
// same function address for a protected function in the executable
// and in the shared library.  We can't use the visibility attribute
// here, becaues that may cause gcc to generate a PC relative reloc;
// we need it to get the value from the GOT.  I'm not sure this is
// really useful, given that it doesn't work with the visibility
// attribute.  This test exists here mainly because the glibc
// testsuite has the same test, and we want to make sure that gold
// passes the glibc testsuite.

extern "C" int f2();
asm(".protected f2");


Like I said earlier, the condition can be simplified and I have done that.


I am attaching the patch after making all the changes mentioned.
Please take a look.

* x86_64.cc (Lazy_view): New class.
(can_convert_mov_to_lea): Templatize function.  Make the function
check for appropriate relocation types and use the view parameter
to get section contents.
(can_convert_callq_to_direct): New function.
(Target_x86_64<size>::Scan::global): Refactor.
(Target_x86_64<size>::Relocate::relocate): Refactor. Change any indirect
call via GOT that can be converted.
* testsuite/Makefile.am (x86_64_indirect_call_to_direct.sh): New test.
* testsuite/Makefile.in: Regenerate.
* testsuite/x86_64_indirect_call_to_direct1.s: New file.
* testsuite/x86_64_indirect_jump_to_direct1.s: New file.


Thanks
Sri



>
> -cary

Attachment: convert_indirect_call_patch.txt
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]