[Mips] Using DT tags for handling local ifuncs

Tue Dec 24 09:02:00 GMT 2013

Jack Carter <Jack.Carter@imgtec.com> writes:
>>>> But it was an either-or choice. :-)  Does it include entry 0 or not?
>>>> If yes, it's B.  If no, it's A.
>>>>
>>>>> here is the quote from the pre-sgi System V
>>>>> Application Binary Interface Mips Processor Supplement:
>>>>>
>>>>> Global Offset Table (5-9, second paragraph)
>>>>> "The global offset table is split into two logically separate
>>>>> subtables: local and externals. Local entries reside in the first
>>>>> part of the global offset table. The value of the dynamic tag
>>>>> DT_MIPS_LOCAL_GOTNO holds the number of local global offset table
>>>>> entries."
>>>>
>>>> To me this suggests B if taken at face value.
>>>
>>> No, the reality is that there should be a pointer to the beginning of
>>> the local got region and DT_MIPS_LOCAL_GOTNO represent its size.
>> 
>> Well, for delimiting an area we can either use "start and size" or
>> "start and end".  Since DT_MIPS_LOCAL_GOTNO is effectively the end
>> of the local area -- despite the "NO" -- we can keep backward
>> compatibility by seeing it as an end rather than a size.
>> 
>> But I agree completely about having an explicit start for the local area.
>> That's what the new tag I was suggesting was.  So going back to the new
>> GOT region, I was really thinking about the current:
>> 
>>     +------------------+
>>     | reserved entries |
>>     +------------------+
>>     |   local entries  |
>>     +------------------+  <-- T2
>>     |  global entries  |
>>     +------------------+
>> 
>> becoming (with a new name for the new region):
>> 
>>     +------------------+
>>     | reserved entries |
>>     +------------------+
>>     | general GOT data |
>>     +------------------+  <-- T1
>>     |   local entries  |
>>     +------------------+  <-- T2
>>     |  global entries  |
>>     +------------------+
>> 
>> It's entirely up to the static linker what goes in the new region.
>> In our case it would be R_MIPS_IRELATIVE-relocated entries, but it could
>> be anything really (including .lit4, .lit8, or whatever).  I.e. this region
>> would be handled like GOTs are on other targets.
>> 
>> T2 is currently called DT_MIPS_LOCAL_GOTNO, but if we had:
>> 
>> T1: DT_MIPS_LOCAL_GOTIDX
>> T2: DT_MIPS_GLOBAL_GOTIDX
>> 
>> with:
>> 
>> #define DT_MIPS_GLOBAL_GOTIDX DT_MIPS_LOCAL_GOTNO
>> 
>> then would it be more acceptable namewise?  We could throw in a GOTIDX
>> tag for the new region too for completeness.
>> 
>
> I like this description

OK, thanks, sounds like a plan then.
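
For concreteness, here is a minimal C sketch of how a tool could map a
GOT index to one of the regions in the second diagram. It is only an
illustration under stated assumptions: the enum, the struct and the
function name are hypothetical, and the count of reserved entries is
assumed to be known by some other means, since no tag for it is
discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the four regions in the diagram. */
enum got_region { GOT_RESERVED, GOT_GENERAL, GOT_LOCAL, GOT_GLOBAL };

/* Hypothetical summary of the values needed to delimit the regions. */
struct got_layout {
  size_t reserved;      /* number of reserved entries (assumed known) */
  size_t local_gotidx;  /* T1: the proposed DT_MIPS_LOCAL_GOTIDX */
  size_t global_gotidx; /* T2: DT_MIPS_GLOBAL_GOTIDX, i.e. DT_MIPS_LOCAL_GOTNO */
};

/* Return the region that GOT entry IDX falls in. */
enum got_region
classify_got_index (const struct got_layout *l, size_t idx)
{
  /* The boundaries must be ordered as in the diagram. */
  assert (l->reserved <= l->local_gotidx);
  assert (l->local_gotidx <= l->global_gotidx);

  if (idx < l->reserved)
    return GOT_RESERVED;
  if (idx < l->local_gotidx)
    return GOT_GENERAL;   /* e.g. R_MIPS_IRELATIVE-relocated entries */
  if (idx < l->global_gotidx)
    return GOT_LOCAL;
  return GOT_GLOBAL;
}
```

With T1 == reserved the general region is empty and the layout
degenerates to the current one, which is the backward-compatibility
property being relied on.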

>>>>> For entertainment sake here is the comment in my private elf dumper
>>>>> I wrote back then:
>>>>>
>>>>> /**
>>>>>       @internal
>>>>>
>>>>>       Function:	mips_print_got
>>>>>
>>>>>       MIPS has 2 different GOT table variants that are
>>>>>       pretty much the same except one depends on symbol
>>>>>       table to got table symmetry for runtime fixup purposes
>>>>>       and the other uses runtime relocations.
>>>>>       
>>>>>       If there is multigot there will be entries in the first dynamic
>>>>>       section
>>>>>       of type DT_MIPS_AUX_DYNAMIC which point to the other
>>>>>       dynamic sections which in turn point to and describe their
>>>>>       associated gots.
>>>>>       
>>>>>       DT_MIPS_LOCAL_GOTNO       Starting point for DEFAULT symbols
>>>>>       DT_MIPS_GOTSYM            Index into dsymtab matching DT_MIPS_LOCAL_GOTNO
>>>>>       DT_MIPS_HIPAGENO          Number of page table entries.
>>>>>       DT_MIPS_LOCALPAGE_GOTIDX  Starting point for a local got page table
>>>>>       DT_MIPS_LOCAL_GOTIDX      Starting point for local full addresses
>>>>>       DT_MIPS_HIDDEN_GOTIDX     Starting point for HIDDEN symbols
>>>>>       DT_MIPS_PROTECTED_GOTIDX  Starting point for PROTECTED symbols
>>>>>
>>>>>       If DT_MIPS_LOCAL_GOTIDX == DT_HIDDEN_GOT_IDX ||
>>>>>       	    	    	       DT_PROTECTED_GOT_IDX ||
>>>>> 			       DT_MIPS_LOCAL_GOTNO
>>>>>       then there are no local entries. Local in this sense
>>>>>       means addresses that may or may not have associated
>>>>>       entries in the symbol table or relocation table. If
>>>>>       they are present in the symbol table they will be marked
>>>>>       as STO_INTERNAL and must not be referenced outside of the
>>>>>       defining dso/a.out in any form.
>>>>>
>>>>>       If DT_HIDDEN_GOT_IDX == DT_PROTECTED_GOT_IDX ||
>>>>>       	    	    	    DT_MIPS_LOCAL_GOTNO
>>>>>       then there are no hidden entries. Hidden symbols
>>>>>       are those that are marked STO_HIDDEN in the dynamic
>>>>>       symbol table and are accessible from outside the defining
>>>>>       dso only non-symbolically such as through pointers.
>>>>>
>>>>>
>>>>>       If DT_PROTECTED_GOT_IDX == DT_MIPS_LOCAL_GOTNO
>>>>>       then there are no protected entries. Protected symbols
>>>>>       are those that are marked STO_PROTECTED in the dynamic
>>>>>       symbol table and are accessible from the outside, but
>>>>>       cannot be preempted during runtime loading and thus are
>>>>>       "protected".
>>>>>       
>>>>>       @return  void.
>>>>>    */
>>>>>
>>>>> Note, for multigot this resulted in multiple dynamic sections, dynsyms and
>>>>> relocation fixups for the got entries.
>>>>
>>>> Did it also result in multiple relocation tables, one for each .dynamic
>>>> section?  Or was there still a single .rel.dyn table?
>>>>
>>>> If just a single .rel.dyn table, did all relocations in the table use
>>>> the primary GOT's DT_MIPS_GOTSYM as the local/global threshold?  If so,
>>>> did that mean that there was no specific limit to the number of distinct
>>>> global symbols that could be stored in GOT entries (thanks to multigot),
>>>> but that there was a limit of 16k (or 8k for n64) global symbols that
>>>> could be used in relocations?  (Sorry for the barrage of questions --
>>>> the downside of doing this by email.)
>>>>
>>>
>>> I may not understand the question, but will try to answer.
>>> Let's pretend we had a case where the linker broke up a dso it was making
>>> into having 3 gp-relative regions (multigot). Each region would have its own
>>> .dynamic table pointing to its own unique dynsym, got, sdata, sbss, etc. By
>>> basic ELF format definition, if any of these sections need relocations they
>>> will have their own unique relocation sections.
>>>
>>> I know, .dynrel has sort of stretched this definition, but we keep to
>>> the current rule by having the dynamic table for the individual got
>>> describe where its relocations are and how they are distributed.
>>>
>>> The limit on symbol indexes is preserved because we are only looking at
>>> a sub-region.
>>>
>>> I guess the key is that each got/gp-rel region has its own individual
>>> .dynsym that describes its microcosm independent of the others. The
>>> main .dynamic section points to all the extra .dynamic sections
>>> through DT tags.
>> 
>> I was more thinking about a DSO containing something like:
>> 
>> 	.data
>> 	.macro	doit
>> 	.word	foo\@
>> 	.endm
>> 	.rept	20000
>> 	doit
>> 	.endr
>> 
>> i.e.:
>> 
>> 	.data
>> 	.word   foo0
>>          ...
>> 	.word   foo19999
>> 
>> where we have 20000 R_MIPS_REL32s against various foos and therefore need
>> 20000 GOT entries.
>> 
>> Is this allowed on its own, without explicit GOT references to the foos?
>> If it is allowed, do you create 2 GOTs to handle it, so that each GOT is
>> still within the 64k limit?  If so, do the two .dynamic sections both
>> have their own .rel.dyn sections, each containing the R_MIPS_REL32s for
>> the symbols in the associated GOT?
>
> Based on my stated view of the world, which may well be skewed: These
> symbols would not end up in the gp-relative GOT. If there is a table
> you need to look them up on, it would be different from the
> gp_relative one and not clutter the gp-rel equation.
>
> That said, I have no idea how sgi dealt with the above. I should, but
> don't remember.  Possibly they stuffed them into the GOT as well. I
> know we didn't have 2 main GOTs so I would have to expect they took up
> space in the one GOT (sounds religious).

I think it's a fairly fundamental point though.  If you're saying that
every GOT needs to be <= 64k in size then you either need to have
two .dynamics, two GOTs, two .dynsyms and two .rel.dyns for the above,
or you need to redefine R_MIPS_REL32 so that it isn't tied to
DT_MIPS_GOTSYM (and hence to the GOT).  The former seems unnecessarily
complicated and the latter loses a nice optimisation.
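
To make the "tied to DT_MIPS_GOTSYM" point concrete, here is a rough C
sketch of how an R_MIPS_REL32 can be resolved under the existing
scheme, as I understand it. The struct and function names are
hypothetical and this is not the code of any particular dynamic
linker. The optimisation is the last branch: a relocation against a
global symbol just reuses the GOT entry that has already been
resolved, with no separate symbol lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one module's dynamic information. */
struct mips_dyn {
  const uint32_t *got;   /* start of the GOT */
  uint32_t local_gotno;  /* DT_MIPS_LOCAL_GOTNO */
  uint32_t gotsym;       /* DT_MIPS_GOTSYM */
  uint32_t load_bias;    /* run-time address minus link-time address */
};

/* Return the value to add to the in-place addend of an R_MIPS_REL32
   against dynamic symbol SYMIDX.  ST_VALUE is that symbol's st_value
   and is only used for symbols below DT_MIPS_GOTSYM. */
uint32_t
rel32_adjustment (const struct mips_dyn *d, uint32_t symidx,
                  uint32_t st_value)
{
  assert (d != NULL && d->got != NULL);

  if (symidx == 0)
    return d->load_bias;              /* no symbol: plain relative fixup */
  if (symidx < d->gotsym)
    return st_value + d->load_bias;   /* locally-defined symbol */

  /* Global symbol: its GOT entry sits at a fixed distance from its
     .dynsym index, so the resolved address is read straight from it. */
  return d->got[d->local_gotno + (symidx - d->gotsym)];
}
```

The last line is also why every symbol used in such a relocation needs
an entry in the GOT described by the same .dynamic section.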

> It may well be that they didn't worry about the extra entries because
> the compiler would just produce gp-relative references for externs.

Pointers like the above occur in things like initialisers and vtables
though.  I.e. the .word example was an example of static data rather
than symbols being accessed by instructions.  The compiler has no chance
to make them gp-relative, at least not unless it generates .init code to
initialise them at runtime.

>> Does the answer change if, in addition to the above, there are also
>> explicit GOT references to each foo, as in:
>> 
>> a.s:
>>          lw	$4,%got(foo0)($gp)
>>          ...
>>          lw	$4,%got(foo9999)($gp)
>> 
>> b.s:
>>          lw	$4,%got(foo10000)($gp)
>>          ...
>>          lw	$4,%got(foo19999)($gp)
>> 
>> so that the symbols fall naturally into two GOTs?  Would b.s's GOT
>> then have the .data relocations for foo10000 and above and a.s's GOT
>> have the .data relocations for the rest?
>
> The static linker should only use one GOT entry for a given symbol for the
> case where we don't have multi-got. There is no need to duplicate.

But you can't have 20000 symbols in a single 64k GOT, so linking a.s and b.s
would force multiple GOTs.  The question then was whether the R_MIPS_REL32s
for the .words would be split over two .rel.dyn sections, with the
R_MIPS_REL32s in each .rel.dyn section using the DT_MIPS_GOTSYM and GOT
from their respective .dynamic sections.
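
The arithmetic behind that, as a small C sketch. The helper names are
hypothetical, and the sketch ignores reserved and local entries, so
the real capacity for global symbols is slightly lower:

```c
#include <assert.h>
#include <stddef.h>

/* Size of the window reachable with a 16-bit offset from $gp. */
#define GP_WINDOW_BYTES 65536u

/* Maximum number of entries in one 64k GOT.  ENTRY_SIZE is 4 for
   32-bit addresses and 8 for n64. */
size_t
max_got_entries (size_t entry_size)
{
  assert (entry_size == 4 || entry_size == 8);
  return GP_WINDOW_BYTES / entry_size;
}

/* Number of 64k GOTs needed to hold NSYMS distinct symbols. */
size_t
gots_needed (size_t nsyms, size_t entry_size)
{
  size_t cap = max_got_entries (entry_size);
  return (nsyms + cap - 1) / cap;   /* round up */
}
```

max_got_entries (4) is 16384 and max_got_entries (8) is 8192, which
are the 16k and 8k limits mentioned earlier, and gots_needed (20000, 4)
is 2.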

>> If we did have multiple .rel.dyn sections, then:
>> 
>
> But if we did, yes, this would be a mess with the current
> setup. Instead of multiple .dynrel sections, I would deal with it with
> DT tags and use the same .dynrel section.
>
> This would need more thought.

But could you guess how it worked with the original SGI scheme?

Defining more tags for this seems unnecessarily complex to me.
Why not just let the primary GOT continue to grow beyond 64k,
like it does now?  With the new region I'm suggesting it shouldn't
matter where the GOT ends, since you can put things like .sdata,
.sbss, .lit4 and .lit8 in the new region instead.

>>>> If there were multiple .rel.dyn tables, each tied to their own
>>>> .dynamic sections, how would we sort them so that all IRELATIVE
>>>> relocations in an object are applied after all non-IRELATIVE ones?
>> 
>> ...this would become a concern.
>
> Actually, after some thought, it happens in the same order as with a single GOT.
> Go through everything in the same order as now. If IFUNC is last then each
> gp-region's GOT would get processed at the same time after all the other info
> is processed.

It isn't a question of when GOT ifunc entries would be processed but when
ifunc addresses elsewhere would be processed.  Things like .word references
in .data, as above.  There would need to be a dynamic relocation against
the .word.  And if there are multiple .rel.dyn sections, the question is
how to order things so that all the ifunc relocations get processed last.
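
To illustrate the constraint (not a proposal that anyone above agreed
to), here is a C sketch of one way a loader could satisfy it, assuming
it can see every table before applying any relocation. All names are
hypothetical, including the relocation number, and "applying" a
relocation is reduced to recording its id so that the order is
visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical relocation number standing in for R_MIPS_IRELATIVE. */
#define R_HYP_IRELATIVE 1000u

struct rel { unsigned type; unsigned id; };
struct rel_table { const struct rel *rels; size_t count; };

/* Walk all NTABS tables twice: the first pass takes every
   non-IRELATIVE relocation, the second pass takes the IRELATIVE ones.
   The ids are stored in OUT in processing order; returns the number
   of relocations processed. */
size_t
order_relocs (const struct rel_table *tabs, size_t ntabs,
              unsigned *out, size_t out_cap)
{
  size_t n = 0;

  for (int pass = 0; pass < 2; pass++)
    for (size_t t = 0; t < ntabs; t++)
      for (size_t i = 0; i < tabs[t].count; i++)
        {
          int is_irel = tabs[t].rels[i].type == R_HYP_IRELATIVE;
          if (is_irel == pass)
            {
              assert (n < out_cap);
              out[n++] = tabs[t].rels[i].id;
            }
        }
  return n;
}
```

The point is that the split has to be made across all tables at once.
Processing each table to completion in turn would run the first
table's ifunc resolvers before the second table's ordinary relocations
had been applied.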

Thanks,
Richard