gold: addend issues in icf.cc:get_section_contents

Roland McGrath mcgrathr@google.com
Tue Sep 22 00:01:00 GMT 2015


I have a case where trunk gold crashes.  It's a huge case (linking Google
Chrome) and I don't know off hand how to go about reducing it.  But I've
isolated the crash to some code where things look all kinds of fishy to me.
I can describe in detail how things are going wrong, but I don't really
have any idea what would constitute things going right in this code.

The crash happens inside get_section_contents in icf.cc.  The scenario is
that 'str_contents' (meant to be a pointer into section contents) has been
set to a wildly invalid pointer, and then we get to:

                      // Use the entsize to determine the length.
                      buffer.append(reinterpret_cast<const
                                                     char*>(str_contents),
                                    entsize);

so the crash happens inside memcpy inside std::string::append.

Basically, all of the code inside:

              // This reloc points to a merge section.  Hash the
              // contents of this section.
              if ((secn_flags & elfcpp::SHF_MERGE) != 0
		  && parameters->target().can_icf_inline_merge_sections())
                {

looks kind of fishy to me.

It does various things to calculate 'offset', and then does:

                  section_size_type secn_len;
                  const unsigned char* str_contents =
                  (it_v->first)->section_contents(it_v->second,
                                                  &secn_len,
                                                  false) + offset;

So clearly both 'offset', and 'offset + N' where N is however many bytes
the code actually fetches from 'str_contents' must be within 'secn_len'.
But I see a woeful lack of any code that would give me confidence this
invariant is being maintained.

As a first sanity check, I inserted:
                     gold_assert(offset < (long long) secn_len);
right after the ->section_contents call quoted above.  That alone would
catch my crasher case (more on that below).  But I wanted to do a little
more sanity checking and tried:
                    {
+                     gold_assert(secn_len >= entsize);
+                     gold_assert((long long) (secn_len - entsize) >= offset);
                      // Use the entsize to determine the length.
                      buffer.append(reinterpret_cast<const
                                                     char*>(str_contents),
                                    entsize);
                    }
The two gold_assert's there are my additions.  However, the second assert
fires in cases other than my crasher.  Unless I'm thoroughly confused,
those cases are wrong too even though they don't crash.  It should never be
OK to access past str_contents-offset+secn_len, right?

My case is an i386 link (on an x86_64 host), so addresses and addends and
everything are 32-bit quantities.  In the case that crashes, the value of
'reloc_addend_value' extracted from the section contents is 0xfffffffc.
This should be interpreted as -4, but:
                          reloc_addend_value =
                            read_from_pointer<32>(reloc_addend_ptr);
extracts it as 32 bits and then zero-extends it to 64 bits.  Naively, I
want to just sign-extend here.  But that doesn't actually make any sense in
the context of how this value is used.  Nearby, for the RELA case we have:

                  unsigned long long addend = it_a->second;
                  // Ignoring the addend when it is a negative value.  See the
                  // comments in Merged_symbol_value::Value in object.h.
                  if (addend < 0xffffff00)
                    offset = offset + addend;

I had trouble finding the comment it refers to until I realized that it's
actually in Merged_symbol_value::value (lower case), the method, not
Merged_symbol_value::Value (capitalized), which is a type.  I can almost
follow the logic in that comment, though not quite.  But anyway, it seems
pretty clear that the same logic for treating addend values should apply in
both RELA and REL cases.  So I change:
+                 if (reloc_addend_value < 0xffffff00)
                    offset = offset + reloc_addend_value;
and at least the crash goes away.  But I have no idea if things are
actually coming out correct now.

If that is the "correct" fix, then I think it would certainly be cleaner to
separate this logic from the collection of the addend value.  Collecting
the addend value is done differently for RELA vs REL, but the logic applied
to that value is the same either way.  But I don't know enough of the
context here to be confident about how to refactor that correctly.  (Is
the '*it_addend_size == 0' case for RELA, or something else?  Is there ever
a case when 'addend' (aka 'it_a->second') is nonzero and '*it_addend_size'
is also nonzero?)

To what little extent I understand the context here, I think things being
wrong in this code is most likely to just cause ICF to miss a folding
opportunity.  So it might not really be noticed otherwise.  This makes me
think this whole function needs a careful audit by someone who thoroughly
understands its context (which I do not).

What is most troubling to me is that there appears to be no input
validation being done here at all.  Even if an addend value is not huge
(e.g. because it's really negative, as in my crasher) then you can't just
blithely assume that it's a valid offset within the section from which to
read 'entsize' bytes of data (or worse, a null-terminated string for the
SHF_STRINGS case).

Is there someone who groks this code and can tackle this mess?


Thanks,
Roland



More information about the Binutils mailing list