Mips target in gold - revision 2 - part 3

Cary Coutant ccoutant@google.com
Tue Mar 11 02:17:00 GMT 2014


> Attached is the third part of the patch that implements Mips target in gold.  This part contains second part of the mips.cc file.

Here are my review comments on mips.cc. Again, I apologize for taking
so long. The other parts of this patch (elfcpp, configure, Makefile,
etc.) look fine. I may go ahead and commit the elfcpp part of the
patch separately.

-cary


>   // Discard MIPS16 stub secions that are not needed.
>   void
>   discard_mips16_stub_sections(Symbol_table* symtab)

This function is so long that your class will be more readable if
you move the function definition outside the class body.  (You can
declare it 'inline' if you really want it to be inlined, but it
doesn't seem necessary for this function.)

>   {
>     for (typename Mips16_stubs_int_map::const_iterator
>          it = this->mips16_stub_sections_.begin();
>          it != this->mips16_stub_sections_.end(); ++it)
>       {
>         Mips16_stub_section<size, big_endian>* stub_section = it->second;
>         if (!stub_section->is_target_found())
>           {
>             gold_error(_("no relocation found in mips16 stub section '%s'"),
>                        stub_section->object()->section_name(stub_section->shndx()).c_str());
>           }

Line too long.

>   // Whether the object is a PIC object.
>   bool is_pic_;
>   // Whether the object uses N32 ABI.
>   bool is_n32_;
>   // Whether the object uses N64 ABI.
>   bool is_n64_;

These can be bitfields:

  // Whether the object is a PIC object.
  bool is_pic_ : 1;
  // Whether the object uses N32 ABI.
  bool is_n32_ : 1;
  // Whether the object uses N64 ABI.
  bool is_n64_ : 1;

>   static const uint32_t lazy_stub_normal_1[4];
>   static const uint32_t lazy_stub_normal_1_n64[4];
>   static const uint32_t lazy_stub_normal_2[4];
>   static const uint32_t lazy_stub_normal_2_n64[4];
>   static const uint32_t lazy_stub_big[5];
>   static const uint32_t lazy_stub_big_n64[5];

Everywhere else, you've left the array bounds unspecified.
Unless there's a compelling reason for the bounds here, I'd
suggest being consistent with the rest of your static const
array declarations.

> // Mips_copy_relocs class.  This class is almost identical to the class
> // Copy_relocs from copy-relocs.h.  The only difference is method
> // Copy_reloc_entry::emit.

This is an awful lot of duplication in order to replace just the
one method. Here's what I suggest instead:

1. In copy-relocs.h, remove Copy_reloc_entry::emit, make
   Copy_reloc_entry protected instead of private, and make
   its member fields public (i.e., just make it a struct).

2. Make Copy_relocs::make_copy_reloc a protected member.

3. In copy-relocs.cc, remove Copy_reloc_entry::emit and
   inline it into Copy_relocs::emit:

template<int sh_type, int size, bool big_endian>
void
Copy_relocs<sh_type, size, big_endian>::emit(
    Output_data_reloc<sh_type, true, size, big_endian>* reloc_section)
{
  for (typename Copy_reloc_entries::iterator p = this->entries_.begin();
       p != this->entries_.end();
       ++p)
    {
      // If the symbol is no longer defined in a dynamic object, then we
      // emitted a COPY relocation, and we do not want to emit this
      // dynamic relocation.
      if (p->sym_->is_from_dynobj())
reloc_section->add_global_generic(p->sym_, p->reloc_type_,
 p->output_section_, p->relobj_,
 p->shndx_, p->address_,
 p->addend_);
    }

  // We no longer need the saved information.
  this->entries_.clear();
}

4. Declare the Mips_copy_relocs class as a subclass of
   Copy_relocs, adding just one new method: emit_mips.
   Let the subclass inherit everything else as is.

5. Inline everything from Mips_copy_relocs::Copy_reloc_entry::emit
   into Mips_copy_relocs::emit_mips.

6. In Target_mips::do_finalize_sections, call
   copy_relocs_.emit_mips() instead of copy_relocs_.emit().

>   Mips_copy_relocs(unsigned int copy_reloc_type)
>     : copy_reloc_type_(copy_reloc_type), dynbss_(NULL), entries_()
>   { }

>       copy_relocs_(elfcpp::R_MIPS_COPY), copy_relocsa_(elfcpp::R_MIPS_COPY),

Since you now have your own Mips_copy_relocs subclass, you don't
need the copy_reloc_type parameter in the constructor; instead,
just hardcode elfcpp::R_MIPS_COPY in the initializer for the
superclass.

>   bool
>   relocation_needs_la25_stub(Mips_relobj<size, big_endian>* object,
>                              unsigned int r_type, bool target_is_16_bit_code)

This looks like it could be a free function. Just make it "static
inline" outside the class.

>   bool
>   local_pic_function(Mips_symbol<size>* sym)

Likewise.

>   static inline bool
>   hi16_reloc(int r_type)
...
>   static inline bool
>   micromips_reloc(unsigned int r_type)

Likewise.

>   void
>   reorder_dyn_symbols(std::vector<Symbol*>* dyn_symbols,
>                       std::vector<Symbol*>* non_got_symbols,
>                       std::vector<Symbol*>* got_symbols) const

Likewise.

>   unsigned int
>   do_set_dynsym_indexes(std::vector<Symbol*>* dyn_symbols, unsigned int index,
> std::vector<Symbol*>* syms, Stringpool* dynpool,
> Versions* versions, Symbol_table* symtab) const

Please move the body of this function outside the class.

>   bool
>   use_32bit_micromips_instructions() const
>   { return insn32_; }

Please use this->insn32_.

>   typename std::list<got16_addend<size, big_endian> > got16_addends_;

Do you realy want to use a linked list here? Wouldn't a
vector be better?

>   //   big-endian file, the result is the same; in a little-endian
>   //   file, the two 16-bit halves of the 32 bit value are swapped.

Do you mean that the bytes within each are swapped, but the
16-bit words are stored in order, or that the 16-bit words are
swapped (i.e., the second written before the first)? I'm guessing
that you mean that the byte-swapping takes place within each
16-bit word, and the order of the two words is unaffected.

>   //   To put it in MIPS ABI terms, the relocation field is T-targ26-16,
>   //   defined as
>   //
>   //   big-endian:
>   //   +--------+----------------------+
>   //   |        |                      |
>   //   |        |    targ26-16         |
>   //   |31    26|25                   0|
>   //   +--------+----------------------+
>   //
>   //   little-endian:
>   //   +----------+------+-------------+
>   //   |          |      |             |
>   //   |  sub1    |      |     sub2    |
>   //   |0        9|10  15|16         31|
>   //   +----------+--------------------+
>   //   where targ26-16 is sub1 followed by sub2 (i.e., the addend field A is
>   //   ((sub1 << 16) | sub2)).

I can't figure out what this diagram is saying. In particular,
why are the bits numbered right-to-left for big endian and
left-to-right for little endian? Most processors I'm familiar
with use the same bit numbering convention for both big- and
little-endian, but I certainly wouldn't expect to see this.

>   std::vector<bool> empty_vector(loccount, false);
>   std::vector<bool> empty_vector2(loccount, false);
>   this->local_symbol_is_mips16_.swap(empty_vector);
>   this->local_symbol_is_micromips_.swap(empty_vector2);

It's cleaner and no less efficient to just call resize() on the
original vector. The "swap" trick is only necessary when you need
to free any previously-allocated space.

>   std::vector<bool> empty_vector(this->shnum(), false);
>   std::vector<bool> empty_vector2(this->shnum(), false);
>   std::vector<bool> empty_vector3(this->shnum(), false);
>   this->section_is_mips16_fn_stub_.swap(empty_vector);
>   this->section_is_mips16_call_stub_.swap(empty_vector2);
>   this->section_is_mips16_call_fp_stub_.swap(empty_vector3);

Likewise.

>   //gold_assert(static_cast<section_size_type>(pov - oview) == oview_size);

Was this commented out while debugging? If you can't leave it
enabled, please add a TODO noting what the problem is. At the
least, you should make sure you didn't overrun the oview, and if
you underran it, you should fill with zeroes.

>   // We always allocate 20 bytes for every stub, because final dynsym count is
>   // not known in method do_finalize_sections.  There are 4 unused bytes per
>   // stub if final dynsym count is less than 0x10000.
>   // TODO(sasa): Can we strip unused bytes during the relaxation?
>   unsigned int used = this->symbols_.size() * (this->stub_max_size()
>                                                - (big_stub ? 0 : 4));
>   unsigned int unused = big_stub ? 0 : this->symbols_.size() * 4;
>   gold_assert(static_cast<section_size_type>(used + unused) == oview_size);

I'd recommend filling the unused space with zeroes.

>   Valtype gprmask = 0, cprmask1 = 0, cprmask2 = 0, cprmask3 = 0, cprmask4 = 0;

Please declare these on separate lines.

>       layout->add_output_section_data(".got", elfcpp::SHT_PROGBITS,
>                                       (elfcpp::SHF_ALLOC | elfcpp::SHF_WRITE |
>                                       elfcpp::SHF_MIPS_GPREL),
>                                       this->got_, ORDER_DATA, false);
> ...
>       // If there is no .got section, gp should be based on .sdata.
>       // TODO(sasa): If there are both .got and .sdata sections, they must be
>       // together, with .got comming first.

If you want .got and .sdata together, shouldn't you define .got
with ORDER_SMALL_DATA? (To guarantee that it precedes .sdata, you
may need to add ORDER_SMALL_DATA_FIRST to layout.h.)

Typo: "comming".

>   // Don't emit input .reginfo sections to output .reginfo.
>   for (Input_objects::Relobj_iterator p = input_objects->relobj_begin();
>        p != input_objects->relobj_end();
>        ++p)
>     {
>       Mips_relobj<size, big_endian>* object =
>         Mips_relobj<size, big_endian>::as_mips_relobj(*p);
>       for (unsigned int i = 1; i < object->shnum(); ++i)
>         {
>           if (object->section_type(i) == elfcpp::SHT_MIPS_REGINFO)
>             {
>               object->set_output_section(i, NULL);
>               break;
>             }
>         }
>     }

It looks to me like Layout::include_section (in layout.cc) ought
to call a target hook in the default case of the switch on
get_sh_type() to determine whether or not to include the section.
It seems wasteful to loop over all the sections of all the objects
here.

>   Target::do_finalize_sections(layout, input_objects, symtab);

I don't think there's any need for this. The default
implementation of this method is empty. (Fine if you want to
leave it in on principle, but I doubt that method will ever
have a non-empty default implementation -- anything
target-independent would be in Layout::finalize.)

>   Output_section* dynsym = this->layout_->find_output_section(".dynsym");
>   gold_assert(dynsym != NULL);

This does a slow (linear) search for the .dynsym section each
time do_dynamic_tag_custom_value() is called, and it's only
needed for two of the three tags you've defined. You can just
call Layout::dynsym_section() to get the pointer without a search.
(It even guarantees a non-NULL result.)

>         // Find all readable PT_LOAD segments.
>         segment_list.push_back(this->layout_->find_output_segment(
>             elfcpp::PT_LOAD, elfcpp::PF_R, elfcpp::PF_W | elfcpp::PF_X));
>         segment_list.push_back(this->layout_->find_output_segment(
>             elfcpp::PT_LOAD, elfcpp::PF_R | elfcpp::PF_X, elfcpp::PF_W));
>         segment_list.push_back(this->layout_->find_output_segment(
>             elfcpp::PT_LOAD, elfcpp::PF_R | elfcpp::PF_W, elfcpp::PF_X));
>         segment_list.push_back(this->layout_->find_output_segment(
>             elfcpp::PT_LOAD, elfcpp::PF_R | elfcpp::PF_W | elfcpp::PF_X, 0));
>
>         unsigned int base_addr = -1U;
>         for (Segment_list::const_iterator p = segment_list.begin();
>              p != segment_list.end();
>              ++p)
>           {
>             if (*p != NULL && (*p)->vaddr() < base_addr)
>               base_addr = (*p)->vaddr();
>           }

At this point, the segment list has been sorted into final order,
so you should just be able to call:

  Output_segment* seg =
      this->layout_->find_output_segment(elfcpp::PT_LOAD, elfcpp::PF_R, 0);
  gold_assert(seg != NULL);
  return seg->vaddr();

>     if (target->call_lo16_reloc(r_type)
>         || target->got_lo16_reloc(r_type)
>         || target->got_disp_reloc(r_type))
>       {
>         // We may need a local GOT entry for this relocation.  We
>         // don't count R_MIPS_GOT_PAGE because we can estimate the
>         // maximum number of pages needed by looking at the size of
>         // the segment.  Similar comments apply to R_MIPS*_GOT16 and
>         // R_MIPS*_CALL16.  We don't count R_MIPS_GOT_HI16, or
>         // R_MIPS_CALL_HI16 because these are always followed by an
>         // R_MIPS_GOT_LO16 or R_MIPS_CALL_LO16.
>         Mips_output_data_got<size, big_endian>* got =
>           target->got_section(symtab, layout);
>         unsigned int r_sym = elfcpp::elf_r_sym<size>(r_info);
>         got->record_local_got_symbol(mips_obj, r_sym, r_addend, r_type, -1U);
>       }

This block is indented too far.

>             // In some cases GCC dead code elimination removes the LO16 but
>             // keeps the corresponding HI16.  This is strictly speaking a
>             // violation of the ABI but not immediately harmful.

If I understand the code correctly, in this case the HI16 reloc
will remain on the got16_addends_ list forever. Is there a point
where you know it's safe to remove it? Can you clear the list
at the end of each section?

>             else
>               // FIXME: TLS optimization not supported yet.
>               gold_unreachable();

(also a couple of locations just below this one)

I think this would look better with braces, since it's more than
one line. (The "goto fail;" bug has sensitized me to this
recently.)



More information about the Binutils mailing list