This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: [PATCH] PR gold/17640
- From: Cary Coutant <ccoutant at google dot com>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>
- Cc: Ilya Tocar <tocarip dot intel at gmail dot com>, Binutils <binutils at sourceware dot org>
- Date: Tue, 10 Mar 2015 09:31:00 -0700
- Subject: Re: [PATCH] PR gold/17640
- Authentication-results: sourceware.org; auth=none
- References: <20150219142707 dot GA507 at msticlxl7 dot ims dot intel dot com> <CAMe9rOp+6-mKKqE4h1jUvt-wBWVU0YK62yJPgrN0LVuU4JhxRw at mail dot gmail dot com> <CAHACq4poMBVYcg=nS01tPsLuNi=BdtL4gSTz5Q1-auGBX=zA-Q at mail dot gmail dot com> <20150226104626 dot GA16554 at msticlxl7 dot ims dot intel dot com> <CAHACq4ojyujhxYSsMaF_jSVWUg0=YgNawR6=6XYdZie1PMhXYQ at mail dot gmail dot com> <20150227142003 dot GA122934 at msticlxl7 dot ims dot intel dot com> <CAMe9rOrzXOu+CkT_A7XZLJ1MK_nqnMk69LjgXUSQ-FnS1fKsEQ at mail dot gmail dot com> <CAMe9rOq-uSd64h6eeogR3SywtQipMupN9Ng9-EvPLF+3bp7Exw at mail dot gmail dot com> <20150305104052 dot GA16361 at msticlxl7 dot ims dot intel dot com> <CAMe9rOqRQ+HMS12+hn9Nz0pnb95tUDmbAU2QjK9G1Rc-q=woCA at mail dot gmail dot com> <20150310112745 dot GA104717 at msticlxl7 dot ims dot intel dot com> <CAMe9rOoG6NOVLYxZBJ5KRuS9_47=ptsd=Ft9iA9HfN2Asw0UeQ at mail dot gmail dot com>
As an aside, why is this bug filed for 32-bit only? HJ, didn't you
implement this optimization for both 32-bit and 64-bit?
-cary
On Tue, Mar 10, 2015 at 6:47 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Mar 10, 2015 at 4:27 AM, Ilya Tocar <tocarip.intel@gmail.com> wrote:
>>> >> >>> Please write a ChangeLog entry.
>>> >> >>>
>>> >> >> Done.
>>> >> >>
>>> >> >>> > + // If the relocation symbol isn't IFUNC,
>>> >> >>> > + // and is local, then we will convert
>>> >> >>> > + // mov foo@GOT(%reg), %reg
>>> >> >>> > + // to
>>> >> >>> > + // lea foo@GOTOFF(%reg), %reg
>>> >> >>> > + // in Relocate::relocate
>>> >> >>> > + if (view[reloc.get_r_offset() - 2] == 0x8b
>>> >> >>>
>>> >> >>> You also need to check that reloc.get_r_offset() >= 2. If that's
>>> >> >>> false, or if the symbol is an IFUNC symbol, you could avoid getting
>>> >> >>> the section contents.
>>> >> >>>
>>> >> >> Done.
>>> >> >>
>>> >> >>> > + // If we convert this from
>>> >> >>> > + // mov foo@GOT(%reg), %reg
>>> >> >>> > + // to
>>> >> >>> > + // lea foo@GOTOFF(%reg), %reg
>>> >> >>> > + // in Relocate::relocate, then there is nothing to do here.
>>> >> >>> > + // Avoid converting _DYNAMIC, because it's address may be used.
>>> >> >>> > + if (view[reloc.get_r_offset() - 2] == 0x8b
>>> >> >>> > + && gsym->type() != elfcpp::STT_GNU_IFUNC
>>> >> >>> > + && !gsym->is_undefined()
>>> >> >>> > + && !gsym->is_from_dynobj()
>>> >> >>> > + && strcmp(gsym->name(), "_DYNAMIC"))
>>> >> >>> > + break;
>>> >> >>>
>>> >> >>> Same here.
>>> >> >>>
>>> >> >>> s/it's/its/
>>> >> >>>
>>> >> >> Fixed.
>>> >> >>
>>> >> >>> Could you explain more about the exception for _DYNAMIC? If its
>>> >> >>> address is used by some other relocation, won't we end up creating the
>>> >> >>> GOT entry anyway when we process that relocation? And if it's not,
>>> >> >>> isn't it OK to omit the GOT entry?
>>> >> >>>
>>> >> >> Comment in bfd/elf32-i386.c (elf_i386_convert_mov_to_lea:2714) says:
>>> >> >>
>>> >> >> We also avoid optimizing _DYNAMIC since ld.so may use its link-time
>>> >> >> address.
>>> >> >>
>>> >> >> I've checked mov1 tests in ld/testsuite/ld-i386/
>>> >> >> and without this check we optimize it (incorrectly) away.
>>> >> >>
>>> >> >>> > + // If the relocation symbol isn't IFUNC,
>>> >> >>> > + // and is local, then we convert
>>> >> >>> > + // mov foo@GOT(%reg), %reg
>>> >> >>> > + // to
>>> >> >>> > + // lea foo@GOTOFF(%reg), %reg
>>> >> >>> > + if (view[-2] == 0x8b
>>> >> >>>
>>> >> >>> Again, you need to check that r_offset is in range. See calls to
>>> >> >>> tls::check_tls() later in the source file.
>>> >> >>>
>>> >> >> Check added.
>>> >> >>
>>> >> >>> > + && ((gsym == NULL && !psymval->is_ifunc_symbol())
>>> >> >>> > + || (gsym != NULL
>>> >> >>> > + && gsym->type() != elfcpp::STT_GNU_IFUNC
>>> >> >>> > + && !gsym->is_from_dynobj()
>>> >> >>> > + && !gsym->is_undefined())))
>>> >> >>>
>>> >> >>> What about _DYNAMIC? You need to make sure you make the same decision
>>> >> >>> here that you made in Scan::local or Scan::global.
>>> >> >> Why? What's wrong with optimizing access into lea, but leaving GOT
>>> >> >> entry?
>>> >> >
>>> >> > You should add a testcase for _DYNAMIC, as in
>>> >> >
>>> >> You should also add testcases for PIE, -Bsymbolic with DSO
>>> >> and normal executable. You also need testcases with
>>> >> relocation against global symbol with and without visibility,.
>>> >>
>>> >> Your patch incorrectly converts mov to lea with global symbol
>>> >> defined within DSO when building DSO.
>>> >>
>>> > Good catch!
>>> > Fixed. I've also added tests for _DYNAMIC, pie, Bsymbolic.
>>> > Ok for trunk?
>>> >
>>> > ---
>>> > +
>>> > + // If we convert this from
>>> > + // mov foo@GOT(%reg), %reg
>>> > + // to
>>> > + // lea foo@GOTOFF(%reg), %reg
>>> > + // in Relocate::relocate, then there is nothing to do here.
>>> > + // Avoid converting _DYNAMIC, because its address may be used.
>>> > + if (reloc.get_r_offset() >= 2
>>> > + && gsym->type() != elfcpp::STT_GNU_IFUNC
>>> > + && !gsym->is_undefined()
>>> > + && !gsym->is_from_dynobj()
>>> > + && !gsym->is_preemptible()
>>> > + && (!parameters->options().shared()
>>> > + || gsym->visibility() != elfcpp::STV_DEFAULT
>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>
>>> Protected symbol means that it can't be pre-emptied. It
>>> doesn't mean its address won't be external. This is true
>>> for pointer to protected function. With copy relocation,
>>> address of protected data defined in the shared library may
>>> also be external. We only know that for sure at run-time.
>>>
>>> See:
>>>
>>> https://sourceware.org/ml/binutils/2015-03/msg00036.html
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65248
>>>
>>> > + || parameters->options().Bsymbolic())
>>> > + && strcmp(gsym->name(), "_DYNAMIC"))
>>> >
>>>
>>>
>> You are right, fixed.
>> Ok for trunk?
>>
>> ---
>> gold/ChangeLog | 16 +++++
>> gold/i386.cc | 124 ++++++++++++++++++++++++++++----------
>> gold/testsuite/Makefile.am | 59 ++++++++++++++++++
>> gold/testsuite/i386_mov_to_lea.sh | 37 ++++++++++++
>> gold/testsuite/i386_mov_to_lea1.s | 11 ++++
>> gold/testsuite/i386_mov_to_lea2.s | 10 +++
>> gold/testsuite/i386_mov_to_lea3.s | 4 ++
>> gold/testsuite/i386_mov_to_lea4.s | 12 ++++
>> gold/testsuite/i386_mov_to_lea5.s | 12 ++++
>> 9 files changed, 253 insertions(+), 32 deletions(-)
>> create mode 100755 gold/testsuite/i386_mov_to_lea.sh
>> create mode 100644 gold/testsuite/i386_mov_to_lea1.s
>> create mode 100644 gold/testsuite/i386_mov_to_lea2.s
>> create mode 100644 gold/testsuite/i386_mov_to_lea3.s
>> create mode 100644 gold/testsuite/i386_mov_to_lea4.s
>> create mode 100644 gold/testsuite/i386_mov_to_lea5.s
>>
>> diff --git a/gold/ChangeLog b/gold/ChangeLog
>> index fe6a56b..7a8ff72 100644
>> --- a/gold/ChangeLog
>> +++ b/gold/ChangeLog
>> @@ -1,3 +1,19 @@
>> +2015-03-05 Ilya Tocar <ilya.tocar@intel.com>
>> +
>> + PR gold/17640
>> + * i386.cc (Target_i386::Scan::local): Don't create GOT entry, when we
>> + can convert GOT to GOTOFF.
>> + (Target_i386::Scan::global): Ditto.
>> + (Target_i386::Relocate::relocate): Convert mov foo@GOT(%reg), %reg to
>> + lea foo@GOTOFF(%reg), %reg if possible.
>> + * testsuite/Makefile.am (i386_mov_to_lea): New test.
>> + * testsuite/i386_mov_to_lea1.s: New.
>> + * testsuite/i386_mov_to_lea2.s: Ditto.
>> + * testsuite/i386_mov_to_lea3.s: Ditto.
>> + * testsuite/i386_mov_to_lea4.s: Ditto.
>> + * testsuite/i386_mov_to_lea5.s: Ditto.
>> + * testsuite/i386_mov_to_lea.sh: Ditto.
>> +
>> 2015-03-04 Cary Coutant <ccoutant@google.com>
>>
>> * parameters.cc (Parameters::set_target_once): Call
>> diff --git a/gold/i386.cc b/gold/i386.cc
>> index 24f4103..868e250 100644
>> --- a/gold/i386.cc
>> +++ b/gold/i386.cc
>> @@ -1835,8 +1835,28 @@ Target_i386::Scan::local(Symbol_table* symtab,
>>
>> case elfcpp::R_386_GOT32:
>> {
>> - // The symbol requires a GOT entry.
>> +
> ^^^^^^^^^^^^^^^^^^^^^ Extra blank line.
>> + // We need GOT section.
>> Output_data_got<32, false>* got = target->got_section(symtab, layout);
>> +
>> + // If the relocation symbol isn't IFUNC,
>> + // and is local, then we will convert
>> + // mov foo@GOT(%reg), %reg
>> + // to
>> + // lea foo@GOTOFF(%reg), %reg
>> + // in Relocate::relocate
>> + if (reloc.get_r_offset() >= 2
>> + && lsym.get_st_type() != elfcpp::STT_GNU_IFUNC)
>> + {
>> + section_size_type stype;
>> + const unsigned char* view = object->section_contents(data_shndx,
>> + &stype, true);
>> + if (view[reloc.get_r_offset() - 2] == 0x8b)
>> + break;
>> +
>> + }
>> +
>> + // Otherwise, the symbol requires a GOT entry.
>> unsigned int r_sym = elfcpp::elf_r_sym<32>(reloc.get_r_info());
>>
>> // For a STT_GNU_IFUNC symbol we want the PLT offset. That
>> @@ -2229,8 +2249,34 @@ Target_i386::Scan::global(Symbol_table* symtab,
>>
>> case elfcpp::R_386_GOT32:
>> {
>> +
> ^^^^^^^^^^^^^^^^^^^^^^^^ Extra blank line.
>> // The symbol requires a GOT entry.
>> Output_data_got<32, false>* got = target->got_section(symtab, layout);
>> +
>> + // If we convert this from
>> + // mov foo@GOT(%reg), %reg
>> + // to
>> + // lea foo@GOTOFF(%reg), %reg
>> + // in Relocate::relocate, then there is nothing to do here.
>> + // Avoid converting _DYNAMIC, because its address may be used.
>> + if (reloc.get_r_offset() >= 2
>> + && gsym->type() != elfcpp::STT_GNU_IFUNC
>> + && !gsym->is_undefined()
>> + && !gsym->is_from_dynobj()
>> + && !gsym->is_preemptible()
>> + && (!parameters->options().shared()
>> + || (gsym->visibility() != elfcpp::STV_DEFAULT
>> + && gsym->visibility() != elfcpp::STV_PROTECTED)
>> + || parameters->options().Bsymbolic())
>> + && strcmp(gsym->name(), "_DYNAMIC"))
>> + {
>> + section_size_type stype;
>> + const unsigned char* view = object->section_contents(data_shndx,
>> + &stype, true);
>> + if (view[reloc.get_r_offset() - 2] == 0x8b)
>> + break;
>> + }
>> +
>> if (gsym->final_value_is_known())
>> {
>> // For a STT_GNU_IFUNC symbol we want the PLT address.
>> @@ -2732,35 +2778,6 @@ Target_i386::Relocate::relocate(const Relocate_info<32, false>* relinfo,
>> }
>> }
>>
>> - // Get the GOT offset if needed.
>> - // The GOT pointer points to the end of the GOT section.
>> - // We need to subtract the size of the GOT section to get
>> - // the actual offset to use in the relocation.
>> - bool have_got_offset = false;
>> - unsigned int got_offset = 0;
>> - switch (r_type)
>> - {
>> - case elfcpp::R_386_GOT32:
>> - if (gsym != NULL)
>> - {
>> - gold_assert(gsym->has_got_offset(GOT_TYPE_STANDARD));
>> - got_offset = (gsym->got_offset(GOT_TYPE_STANDARD)
>> - - target->got_size());
>> - }
>> - else
>> - {
>> - unsigned int r_sym = elfcpp::elf_r_sym<32>(rel.get_r_info());
>> - gold_assert(object->local_has_got_offset(r_sym, GOT_TYPE_STANDARD));
>> - got_offset = (object->local_got_offset(r_sym, GOT_TYPE_STANDARD)
>> - - target->got_size());
>> - }
>> - have_got_offset = true;
>> - break;
>> -
>> - default:
>> - break;
>> - }
>> -
>> switch (r_type)
>> {
>> case elfcpp::R_386_NONE:
>> @@ -2809,8 +2826,51 @@ Target_i386::Relocate::relocate(const Relocate_info<32, false>* relinfo,
>> break;
>>
>> case elfcpp::R_386_GOT32:
>> - gold_assert(have_got_offset);
>> - Relocate_functions<32, false>::rel32(view, got_offset);
>> + // If the relocation symbol isn't IFUNC,
>> + // and is local, then we convert
>> + // mov foo@GOT(%reg), %reg
>> + // to
>> + // lea foo@GOTOFF(%reg), %reg
>> + if (rel.get_r_offset() >= 2
>> + && view[-2] == 0x8b
>> + && ((gsym == NULL && !psymval->is_ifunc_symbol())
>> + || (gsym != NULL
>> + && gsym->type() != elfcpp::STT_GNU_IFUNC
>> + && !gsym->is_from_dynobj()
>> + && !gsym->is_undefined ()
>> + && (!parameters->options().shared()
>> + || (gsym->visibility() != elfcpp::STV_DEFAULT
>> + && gsym->visibility() != elfcpp::STV_PROTECTED)
>> + || parameters->options().Bsymbolic())
>> + && !gsym->is_preemptible())))
>> + {
>> + view[-2] = 0x8d;
>> + elfcpp::Elf_types<32>::Elf_Addr value;
>> + value = (psymval->value(object, 0)
>> + - target->got_plt_section()->address());
>> + Relocate_functions<32, false>::rel32(view, value);
>> + }
>> + else
>> + {
>> + // The GOT pointer points to the end of the GOT section.
>> + // We need to subtract the size of the GOT section to get
>> + // the actual offset to use in the relocation.
>> + unsigned int got_offset = 0;
>> + if (gsym != NULL)
>> + {
>> + gold_assert(gsym->has_got_offset(GOT_TYPE_STANDARD));
>> + got_offset = (gsym->got_offset(GOT_TYPE_STANDARD)
>> + - target->got_size());
>> + }
>> + else
>> + {
>> + unsigned int r_sym = elfcpp::elf_r_sym<32>(rel.get_r_info());
>> + gold_assert(object->local_has_got_offset(r_sym, GOT_TYPE_STANDARD));
>> + got_offset = (object->local_got_offset(r_sym, GOT_TYPE_STANDARD)
>> + - target->got_size());
>> + }
>> + Relocate_functions<32, false>::rel32(view, got_offset);
>> + }
>> break;
>>
>> case elfcpp::R_386_GOTOFF:
>
> Otherwise it looks good to me. But it needs Cary's approval.
>
> Thanks.
>
> --
> H.J.