Mips target in gold - part 3

Cary Coutant ccoutant@google.com
Sat Dec 21 01:43:00 GMT 2013


Here's the first batch of comments on mips.cc.

-cary


> // The types of GOT entries needed for this platform.
> // These values are exposed to the ABI in an incremental link.
> // Do not renumber existing values without changing the version
> // number of the .gnu_incremental_inputs section.
> enum Got_type
> {
>   GOT_TYPE_STANDARD = 0,      // GOT entry for a regular symbol
>   GOT_TYPE_TLS_OFFSET = 1,    // GOT entry for TLS offset
>   GOT_TYPE_TLS_PAIR = 2,      // GOT entry for TLS module/offset pair
>
>   // GOT entries for multi-got. We support up to 1024 gots in multi-got links.
>   GOT_TYPE_STANDARD_MULTIGOT = 3,
>   GOT_TYPE_TLS_OFFSET_MULTIGOT = GOT_TYPE_STANDARD_MULTIGOT + 1024,
>   GOT_TYPE_TLS_PAIR_MULTIGOT = GOT_TYPE_TLS_OFFSET_MULTIGOT + 1024
> };

For incremental linking, there's an implicit assumption that
got_type will be in the range 0..126. In the incremental info,
the got_type is stored as a single byte, and 0x7f has a special
meaning, and the 0x80 bit indicates a local GOT entry.

If you want to support incremental linking, we'll need to rework
the way got_type is stored in the incremental info sections.
Until then, you should leave Target::got_entry_count,
Target::plt_entry_count, etc., unimplemented.


>   bool
>   is_for_local_symbol() const
>   { return this->object_ != NULL && this->symndx_ != -1U; }
>
>   bool
>   is_for_global_symbol() const
>   { return this->object_ != NULL && this->symndx_ == -1U; }

It doesn't appear to me that this->object_ can ever be NULL. If
that's the case, then I'd drop the test here, add an assert in
object(), and call this->object() instead of referencing
this->object_ directly.


>   size_t
>   hash() const
>   {
>     uintptr_t object_id = reinterpret_cast<uintptr_t>(this->object_);
>     uintptr_t sym_id = reinterpret_cast<uintptr_t>(this->d.sym);
>     return (this->symndx_
>             + ((this->tls_type_ == GOT_TLS_LDM) << 18)
>             + (this->tls_type_ == GOT_TLS_LDM
>                ? 0
>                : this->symndx_ != -1U
>                    ? (object_id + this->d.addend)
>                    : sym_id));
>   }

This return expression is hard to read, thanks in part to the
nested ternary operators. At the least, the inner ternary
operator needs parentheses and the ? and : should be lined up
with the 't' in "this".

I think this would be more readable by breaking it down,
something like this:

    if (this->tls_type_ == GOT_TLS_LDM)
      return this->symndx_ + (1 << 18);
    if (this->symndx_ != -1U)
      {
        uintptr_t object_id = reinterpret_cast<uintptr_t>(this->object_);
        return this->symndx_ + object_id + this->d.addend;
      }
    else
      {
        uintptr_t sym_id = reinterpret_cast<uintptr_t>(this->d.sym);
        return sym_id - 1; // Does subtracting 1 here improve the hash?
      }

And I'd suggest a similar restructuring for equals().


>   // The input object that needs the GOT entry.
>   Sized_relobj_file<size, big_endian>*
>   object() const
>   { return this->object_; }

As suggested above, I'd add an assert here that this->object_ !=
NULL.


>   // The index of the symbol if we have a local symbol; -1 otherwise.
>   unsigned int
>   symndx() const
>   { return this->symndx_; }

I looked at the uses of this routine, and it looks like they all
expect the returned value to be a valid symbol index. (I might
have missed something, though.) Any client of this class should
be using is_for_local_symbol() to test whether symndx() should be
called, so I'd suggest adding an assert that this->symndx_ != -1U
here, and adjusting the comment. (You do have similar assertions
in addend() and sym().)


>   bool
>   tls_entry() const
>   { return this->tls_type_ != GOT_TLS_NONE; }

Should be "is_tls_entry"?


>   union
>   {
>     // If object != NULL && symndx != -1, the addend of the relocation
>     // that should be added to the symbol value.
>     Mips_address addend;
>     // If object != NULL && symndx == -1, the global symbol
>     // corresponding to this GOT entry.  The symbol's entry
>     // is in the local area if mips_sym->global_got_area is GGA_NONE,
>     // otherwise it is in the global area.
>     Mips_symbol<size>* sym;
>   } d;

The comments here beg the question of what happens when object ==
NULL, but as far as I can tell, that shouldn't be the case.


>   unsigned int shndx_;

This member should have a comment.


>   operator()(Mips_got_entry<size, big_endian> *entry) const
>   operator()(Mips_got_entry<size, big_endian> *e1,
>              Mips_got_entry<size, big_endian> *e2) const
>   Got_page_range *next;
(and elsewhere)

For C++ code, the '*' goes with the type, not the variable name.


> struct Got_page_entry_eq
> {
>   bool
>   operator()(Got_page_entry *entry1, Got_page_entry *entry2) const
>   {
>     uintptr_t object_id1 = reinterpret_cast<uintptr_t>(entry1->object);
>     uintptr_t object_id2 = reinterpret_cast<uintptr_t>(entry2->object);
>     return object_id1 == object_id2 && entry1->symndx == entry2->symndx;
>   }
> };

I don't see why you need to use a cast here. Why not just compare
entry1->object == entry2->object?


>   Mips_got_info()
>     : global_gotno_(0), reloc_only_gotno_(0), local_gotno_(0),
>       page_gotno_(0), tls_gotno_(0), tls_ldm_offset_(-1U),
>       got_entries_(256, Mips_got_entry_hash<size, big_endian>(),
>                         Mips_got_entry_eq<size, big_endian>()),

I don't think you need to pass hash and equals functors to the
initializer for got_entries_. Those are already declared in the
typedef for Got_entry_set. "got_entries_(256)" should work fine.


>   // Reserve space for a GOT entry containing the value of symbol
>   // SYMNDX in input object OBJECT, plus ADDEND.
>   void
>   record_local_got_symbol(Sized_relobj_file<size, big_endian>* object,
>                           unsigned int symndx,
>                           typename elfcpp::Elf_types<size>::Elf_Addr addend,
>                           unsigned int r_type,
>                           unsigned int shndx);
>   void
>   record_global_got_symbol(Mips_symbol<size>* mips_sym,
>                            Sized_relobj_file<size, big_endian>* object,
>                            unsigned int r_type, bool reloc, bool for_call);
>   void
>   record_got_entry(Sized_relobj_file<size, big_endian> *object,
>                    Mips_got_entry<size, big_endian> *entry);
>   Mips_got_info<size, big_endian>*
>   get_got_info(const Sized_relobj_file<size, big_endian> *object, bool create);
>   void
>   add_local_entries(Target_mips<size, big_endian> *target, Layout* layout);
>   void
>   add_page_entries(Target_mips<size, big_endian> *target, Layout* layout);
>   void
>   add_global_entries(Target_mips<size, big_endian> *target, Layout* layout,
>                      unsigned int non_reloc_only_global_gotno);
>   void
>   add_reloc_only_entries(Mips_output_data_got<size, big_endian> *got);
>   void
>   add_tls_entries(Target_mips<size, big_endian> *target, Layout* layout);
>   void
>   lay_out_got(Target_mips<size, big_endian> *target, Layout* layout,
>               Symbol_table* symtab, const Input_objects* input_objects);
>   void
>   count_got_symbols(Symbol_table* symtab);
>   unsigned int
>   get_got_page_offset(unsigned int value, unsigned char *got_view);

Please add blank lines between declarations, and add a comment
for each function.


>   // Set whether we need the fn_stub.
>   void
>   set_need_fn_stub(bool value)
>   { this->need_fn_stub_ = value; }
>
>   void
>   set_has_nonpic_branches(bool value)
>   { this->has_nonpic_branches_ = value; }

You never call these functions with a false parameter -- why not
just have them set the flag to true? (Consistent with
set_has_la25_stub, set_has_static_relocs, set_no_lazy_stub, etc.)


>   Mips16_stub_section(Object* object, unsigned int shndx)
>     : object_(object), shndx_(shndx), r_sym_(0), gsym_(NULL),
>       is_fn_stub_(false), is_call_stub_(false), is_call_fp_stub_(false)
>   {
>     const char *section_name = object->section_name(shndx).c_str();
>     is_fn_stub_ = is_prefix_of(".mips16.fn", section_name);
>     is_call_stub_ = is_prefix_of(".mips16.call.", section_name);
>     is_call_fp_stub_ = is_prefix_of(".mips16.call.fp.", section_name);
>   }

Calling object->section_name() is a slow operation -- we don't keep
the section names in the Object anywhere, so it has to go read the ELF file
every time. Since you have the name available when you invoke the
constructor, it would be better to pass the name you already fetched once.
Even better, don't you have special section types or flags that you can
test instead of doing a string compare on the section name? If not, it
might be better to install a target hook in the Add_symbols pass (where
the section names are readily available) that could flag the sections
that have special meaning so that you don't need to look at the section
name later, when we have to go re-read them from the ELF file.


>   // Find target symbol for this stub.
>   void
>   find_target_from_relocs()
>   {
>     // Trust the first R_MIPS_NONE relocation, if any.
>     std::list<mips16_stub_reloc*>::iterator it;
>     mips16_stub_reloc* reloc = NULL;
>     for (it = this->relocs_.begin(); it != this->relocs_.end(); ++it)
>       {
>         if ((*it)->r_type == elfcpp::R_MIPS_NONE)
>           {
>             reloc = *it;
>             break;
>           }
>       }
>
>     // Otherwise trust the first relocation, whatever its kind.
>     if (reloc == NULL)
>       {
>         if (this->relocs_.size() > 0)
>           reloc = *this->relocs_.begin();
>         else
>           gold_error(_("no relocation found in mips16 stub section '%s'"),
>                      this->object_->section_name(this->shndx_).c_str());
>       }
>
>     if (reloc->is_local())
>       this->r_sym_ = reloc->r_sym;
>     else
>       this->gsym_ = reloc->gsym;
>   }
>
>   void
>   add_stub_reloc(mips16_stub_reloc* stub_reloc)
>   { this->relocs_.push_back(stub_reloc); }

It looks like the only reason to push relocations onto
this->relocs_ is for find_target_from_relocs, which is going to
prefer the first R_MIPS_NONE relocation, but failing that, will
use the first relocation. So once you've seen an R_MIPS_NONE
relocation, you immediately know what the target symbol is, and
you can avoid keeping a list. If you see another kind of
relocation first, you can tentatively set the target symbol from
that, but keep a flag set that says you're still looking for an
R_MIPS_NONE relocation. I assume there will be LOTS of these stub
sections, and possibly more than one relocation per section, so
this ought to save a lot of memory by not having to save all
those relocations.


>   Mips_relobj(const std::string& name, Input_file* input_file, off_t offset,
>              const typename elfcpp::Ehdr<size, big_endian>& ehdr)
>     : Sized_relobj_file<size, big_endian>(name, input_file, offset, ehdr),
>       processor_specific_flags_(0), gp_(0), got_info_(NULL)

These data members should be explicitly initialized in the
constructor:

  std::vector<bool> local_symbol_is_mips16_;
  std::vector<bool> local_symbol_is_micromips_;
  std::map<unsigned int, Mips16_stub_section*> mips16_stub_sections_;
  std::set<unsigned int> local_non_16bit_calls_;
  std::set<unsigned int> local_16bit_calls_;
  std::map<unsigned int, Mips16_stub_section*> local_mips16_fn_stubs_;
  std::map<unsigned int, Mips16_stub_section*> local_mips16_call_stubs_;


>   // The master GOT information.
>   Mips_got_info<size, big_endian> *got_info;

This should be named "got_info_".


>   // Secondary GOT fixups.
>   std::vector<Static_reloc> secondary_got_relocs_;

This data member is not initialized in the constructor.

>   // Symbols that have stubs.
>   Unordered_set<Symbol*> symbols_;

Likewise.

>   // The reloc section.
>   Reloc_section* rel_;

Likewise.

There are more -- please check all your constructors. (I'm sure
there are plenty of examples throughout gold where we don't
initialize the private data members in the constructor, but I
think it's good practice, and I try to fix them as I find them.)


> // The ""'s around str ensure str is a string literal, so sizeof works.
> #define strprefix(var, str)   (strncmp(var, str, sizeof("" str "") - 1) == 0)
>
> // Return true if this symbol should be added to the dynamic symbol
> // table.  This function is identical to Symbol::should_add_dynsym_entry.
>
> static bool
> should_add_dynsym_entry(Symbol* sym, Symbol_table* symtab)

Why are you duplicating the code here? If you're counting on
inlining for performance (and you have profile data to show it's
needed), it would probably be better to identify the fast path
and put it inline in the Symbol class definition, calling out to
an out-of-line function for the slow path (and parts that can't
go into the header without including more header files directly
into <symtab.h>).

Otherwise, if someone adds a fix to Symbol::should_add_dynsym_entry,
they're going to need to remember to make the same fix to your copy.
(I have a bug report, in fact, where I think the bug is in this
function!)

[Note to self: Fix this function to use is_prefix_of instead of
strprefix.]



More information about the Binutils mailing list