[PATCH] BFD: Prevent writing the MIPS _gp_disp symbol into symbol tables

Maciej W. Rozycki macro@mips.com
Tue Feb 27 14:20:00 GMT 2018

Hi Simon,

 Apologies for the delay, I have been unloading the pile of various issues 
discovered in the course of verifying the recent binutils 2.30 release.

> >  The thing is we have been doing this since forever and there were no
> > issues so far, so obviously anything you observe must be a corner case.
> > E.g. I've had a quick look at a shared library I built back in 2001 and it
> > does have `_gp_disp' in its dynamic symbol table, as an absolute symbol
> > set to the canonical gp value.  And it is no different with binaries built
> > nowadays.  So before we move forward, can you please post an actual test
> > case which reproduces the problem?
> Here is a reproduction script for the problem:
> $ cat test.s
>   .global foo
>   .text
> foo:
>   lui    $t0, %hi(_gp_disp)
>   addi   $t0, $t0, %lo(_gp_disp)
> $ cat test.ver
> LLD_1.0.0 { global: foo; };
> $ mips-mti-linux-gnu-as test.s -o test.o
> $ mips-mti-linux-gnu-ld -shared test.o -o libtest.so --version-script test.ver
> $ mips-mti-linux-gnu-readelf -sV libtest.so
> Symbol table '.dynsym' contains 11 entries:
>    Num:    Value  Size Type    Bind   Vis      Ndx Name
>      0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
>      1: 00000370     0 SECTION LOCAL  DEFAULT    9
>      2: 00010380     0 NOTYPE  GLOBAL DEFAULT   10 _fdata
>      3: 00018370     0 SECTION GLOBAL DEFAULT  ABS _gp_disp
> [...]
> Version symbols section '.gnu.version' contains 11 entries:
>  Addr: 0000000000000318  Offset: 0x000318  Link: 5 (.dynsym)
>   000:   0 (*local*)       0 (*local*)       1 (*global*)      0 (*local*)
> [...]
> Please note that the .gnu.version's entry related to the _gp_disp
> symbol contains zero. Zero means "The symbol is local, not available
> outside the object", while _gp_disp has GLOBAL binding. LLD does not
> like that difference.

 Thank you for the test case.  I think it will be good if it is included 
with the commit as a part of the LD test suite.  I think the test should 
negatively match the presence of `_gp_disp' in the output of `readelf -s'.  

 Will you be able to work on this as an exercise or shall I look into it 

> Probably the better solution would be to fix _gp_disp entry in the
> .gnu.version and write 1 there. From the other side, I thought that
> removing the _gp_disp from symbols table was not too bad idea because
> this symbol is useless in a linked file.
> I can implement both fixes or one of them.

 The MIPS psABI is clear on `_gp_disp' use[1]:

"The symbol name _gp_disp is reserved.  Only R_MIPS_HI16 and R_MIPS_LO16 
relocations are permitted with _gp_disp.  These relocation entries must 
appear consecutively in the relocation section and they must reference 
consecutive relocation area addresses."

so obviously a dynamic symbol is useless as you cannot refer to one with 
R_MIPS_HI16 or R_MIPS_LO16 and you cannot refer to a symbol other than 
with a relocation.

 I was concerned about possible IRIX use as IRIX seems to often diverge 
slightly from the published psABI.  However as it turns out the issue of 
`_gp_disp' being unnecessarily exported has been already discussed: 
<https://sourceware.org/ml/binutils/2005-07/msg00496.html> and that has 
cleared my concern.  A suggestion has also been made in the course of that 
discussion to discard this symbol in the final link.

 I also considered whether the symbol could be used for anything directly 
by the user in GDB.  E.g. in theory, observing the symbol being absolute:

(gdb) print &_gp - &_gp_disp

should calculate the base address.  However `_gp_disp' due to being 
defined, oddly enough, as a section symbol, is not exported as a 
user-accessible expression by GDB.  So even such a marginal use case is 
not possible.

 So I think your change is good to go in once the technical details, 
discussed below, have been sorted out and you'll have confirmed that the 
copyright assignment has as well.  Please note that may have to work it 
around in LLD as well, to cope with existing binaries.

 For the record this special handling for `_gp_disp' has been added with:

commit 5b3b9ff61d3a2a6cdd90fcec1a61d38699f3c608
Author: Ian Lance Taylor <ian@airs.com>
Date:   Thu Jan 11 21:06:42 1996 +0000

	* elf32-mips.c: Extensive changes for a start at dynamic linking
	support, from Kazumoto Kojima <kkojima@info.kanagawa-u.ac.jp>.

which for practical purposes means it's been there since forever.

 Detailed change review follows.

> BFD: Prevent writing the MIPS _gp_disp symbol into symbol tables

 Don't repeat the change heading in the description.

> The _gp_disp is a magic symbol, always implicitly defined by the linker.
> It does not make a sense to write it into symbol tables for output files.
> Moreover, now if the linker gets a version script, the _gp_disp symbol
> gets zero version definition index. This symbol is global while the zero
> index means unversioned local symbol. That confuses some tools like for
> example the LLD linker when they get such files as inputs.

 Please include a quote from the relevant part of the ELF gABI in your 
updated description to back up the zero index interpretation, so that 
people do not have to chase it.

 Also please use two spaces after full stops as per the GNU Coding 

"Please put two spaces after the end of a sentence in your comments, so 
that the Emacs sentence commands will work."

> 2018-01-25  Simon Atanasyan  <simon.atanasyan@mips.com>

 You have:

From: Simon Atanasyan <simon@atanasyan.com>

in the headers however, so which authorship attribution is right for this 

 Whatever is used in the `From:' mail header will be used for authorship 
attribution in the commit.  If you want to use a different one, then 
please use the first line of the change description (e-mail body) to 
specify the correct attribution.  The syntax is the same as with the 
`From:' mail header, e.g.:

From: Simon Atanasyan <simon.atanasyan@mips.com>

GIT handles this automatically.  If unsure, then try feeding your composed 
e-mail to upstream GIT head with `git am' and see if the result is as you 
expect before posting.  The address you are going to record the authorship 
with has to match the copyright assignment.

> bfd/
> 	* elf32-mips.c: (elf32_mips_fixup_symbol ): New function.

 No space before the closing parenthesis.

> 	(elf_backend_fixup_symbol): New macro.
> 	* elfxx-mips.c: (_bfd_mips_elf_link_output_symbol_hook): Discard
> 	the _gp_disp symbol in case of O32 ABI.
> testsuite/ld
> 	* ld-mips-elf/mips16-pic-2.ad: Update test case expectations.

 We don't use separate ChangeLog files for test suites anymore (and BTW 
the path is ld/testsuite/), so please update accordingly:

	* testsuite/ld-mips-elf/mips16-pic-2.ad: [...]

Likewise throughout.

 Also his will land in a separate ChangeLog file, so please be more 
specific in referring to the BFD change, e.g.:

	* testsuite/ld-mips-elf/mips16-pic-2.ad: Update for _gp_disp 
	symbol removal.

will do.

> diff --git a/bfd/elf32-mips.c b/bfd/elf32-mips.c
> index fa0cc15aba..5f5b08136f 100644
> --- a/bfd/elf32-mips.c
> +++ b/bfd/elf32-mips.c
> @@ -2414,6 +2414,22 @@ elf32_mips_write_core_note (bfd *abfd, char *buf, int
> *bufsiz, int note_type,
>        }
>      }
>  }
> +
> +/* Remove magic _gp_disp symbol from the dynamic symbol table.  */
> +
> +static bfd_boolean
> +elf32_mips_fixup_symbol (struct bfd_link_info *info,
> +			    struct elf_link_hash_entry *h)
> +{
> +  const char *name = h->root.root.string;

 Please add a new line after the variable declaration block.

> +  if (h->dynindx != -1 && strcmp(name, "_gp_disp") == 0)

 Space before the opening parenthesis in a function call.

> +    {
> +      h->dynindx = -1;
> +      _bfd_elf_strtab_delref (elf_hash_table (info)->dynstr,
> +			      h->dynstr_index);
> +    }
> +  return TRUE;
> +}

 Why do you actually need this part in addition to the other one?  It 
looks to me like the change to the `elf_backend_link_output_symbol_hook' 
handler should do by itself what you require.

> @@ -2598,6 +2614,9 @@ static const struct ecoff_debug_swap
> mips_elf32_ecoff_debug_swap = {
>  #undef elf_backend_write_core_note
>  #define elf_backend_write_core_note	elf32_mips_write_core_note
> +#undef elf_backend_fixup_symbol
> +#define elf_backend_fixup_symbol elf32_mips_fixup_symbol

 Please use a tab to align the RHS of the definition to column 40, as with 
other definitions.

 Also (assuming that we do want to keep it), as per the discussion above, 
we want it across all targets, so group the definition with ones ahead of 
the first "elf32-target.h" inclusion.  Do not separate the definition from 
the other ones with a new line; group them all together as elsewhere in 
this file.

> @@ -2620,6 +2639,7 @@ static const struct ecoff_debug_swap
> mips_elf32_ecoff_debug_swap = {
>  #define elf32_bed				elf32_fbsd_tradbed
>  #undef elf_backend_write_core_note
> +#undef elf_backend_fixup_symbol

 This has to go naturally then.

> diff --git a/bfd/elfxx-mips.c b/bfd/elfxx-mips.c
> index 285401367d..f387ca47db 100644
> --- a/bfd/elfxx-mips.c
> +++ b/bfd/elfxx-mips.c
> @@ -7741,9 +7741,16 @@ _bfd_mips_elf_add_symbol_hook (bfd *abfd, struct
> bfd_link_info *info,
>  int
>  _bfd_mips_elf_link_output_symbol_hook
>    (struct bfd_link_info *info ATTRIBUTE_UNUSED,
> -   const char *name ATTRIBUTE_UNUSED, Elf_Internal_Sym *sym,
> -   asection *input_sec, struct elf_link_hash_entry *h ATTRIBUTE_UNUSED)
> +   const char *name, Elf_Internal_Sym *sym, asection *input_sec,
> +   struct elf_link_hash_entry *h ATTRIBUTE_UNUSED)
>  {
> +  /* Discard the _gp_disp symbol.  */
> +  if (! NEWABI_P (info->output_bfd)
> +      && ! bfd_link_relocatable (info)
> +      && name
> +      && strcmp (name, "_gp_disp") == 0)
> +    return 2;

 Please don't update this handler like this.  Instead define a new handler 
in bfd/elf32-mips.c and tail-call `_bfd_mips_elf_link_output_symbol_hook' 
after the newly added case has been processed.

 Also since we will not be outputting `_gp_disp' anymore, please discard 
code in `mips_elf_output_extsym' and `_bfd_mips_elf_finish_dynamic_symbol' 
that arranges for this symbol to become section/absolute in the output 
symbol table as this will now become dead code.

 Also you didn't write how your change was tested.  Please always include 
such information with patch submissions.  This will typically require 
running regression tests at the very least.

 Please resubmit with the issues discussed above addressed.



    Supplement, 3rd Edition", Section "Relocation Types", pp. 4-20

[2] "The GNU Coding Standards", Section 5.2 "Commenting Your Work"

More information about the Binutils mailing list