[Mips] Using DT tags for handling local ifuncs
Richard Sandiford
rdsandiford@googlemail.com
Thu Dec 12 09:47:00 GMT 2013
"Maciej W. Rozycki" <macro@codesourcery.com> writes:
>> Unless this last argument below can convince you or at least give you pause
>> to consider "implicit, explicit, default" ordering, I will start on
>> "explicit, implicit, default".
>>
>> I don't think it is the "right" thing to do, but what the heck, what is
>> right and wrong anyway? What you are proposing should be workable. I just
>> need to get it correctly written up before shipping.
>
> Regrettably I still haven't had the time to absorb all the details of
> this stuff, but I think I've ingested enough to ask one question: given
> that explicit dynamic relocs will be used anyway, does this new chunk of
> run-time relocatable data have to be a part of the GOT as defined by the
> traditional SVR4 MIPS psABI in the first place? How about we leave the
> current definitions of the DT_PLTGOT and DT_MIPS_LOCAL_GOTNO dynamic tags
> and the .got special section intact?
>
> I gather all that is needed is that ifunc pointers are reachable with
> gp-relative addressing (so that the same standard calling sequence can be
> used, either the SVR4 PIC or the non-PIC PLT type, regardless of whether
> calling an ifunc or an ordinary function), so grouping them in a section
> called .igot.plt and then either prepending or appending to .got should
> do; with a linker script even. Of course the static linker will have to
> ensure that all the pointers in the combined sections are in range from
> $gp (and the same with secondary $gp values in the multi-GOT case).

I don't follow the comment about calling convention, sorry. The problem
here is what to do with:

    lw $4,%got_disp(foo)($28)

in cases where foo is an ifunc that binds locally. We need some way
of putting it in the GOT and having an IRELATIVE relocation against it.

I think you're suggesting that we allow the ABI-defined GOT to start at
something other than $gp - 0x7ff0, so that explicitly-relocated data
could go first. I think that would be more disruptive in some ways,
since the 0x7ff0 offset is hard-coded into glibc. The resolver for
lazy-binding stubs subtracts 0x7ff0 from the incoming $gp to get the
start of the ABI-defined GOT and then gets the link map from entry 1
(assuming that the GNU extension is in use).
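
As a rough illustration of the lookup described above, here is a minimal C
sketch of how a resolver could recover the ABI-defined GOT and the link map
from the incoming $gp. The helper names and macros are hypothetical, not
glibc's actual identifiers, and the sketch assumes that the GNU extension
marks GOT entry 1 by setting its top bit.

```c
#include <stdint.h>

/* Distance from the start of the ABI-defined GOT to $gp. */
#define GP_TO_GOT_OFFSET 0x7ff0u

/* Assumption: the GNU extension flags GOT entry 1 by setting its top bit. */
#define GNU_GOT1_FLAG ((uintptr_t)1 << (sizeof(uintptr_t) * 8 - 1))

/* Hypothetical helper: start of the ABI-defined GOT for a given $gp. */
static inline uintptr_t got_base_from_gp(uintptr_t gp)
{
    return gp - GP_TO_GOT_OFFSET;
}

/* Hypothetical helper: link map stored in GOT entry 1, or 0 if the
   GNU flag bit is not set. */
static inline uintptr_t link_map_from_got(const uintptr_t *got)
{
    uintptr_t entry1 = got[1];

    if ((entry1 & GNU_GOT1_FLAG) == 0)
        return 0;
    return entry1 & ~GNU_GOT1_FLAG;
}
```

With the constant baked in like this, a GOT that started anywhere other
than $gp - 0x7ff0 would make the lookup read the wrong entry.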

I suppose it'd be possible to adjust $gp in the stub so that $gp - 0x7ff0
is right on entry to the resolver. But that would be difficult to do
cleanly on n32 and n64, where $gp is call-saved. The resolver would
probably have to return to the stub, which in turn would mean that the
stub would need call-frame information.

> BTW, for loading 64-bit addresses I suggest using two temporaries (we've
> got plenty of them) for a sequence that is faster on superscalar
> processors, i.e. rather than:
>
> static const bfd_vma mips64_exec_iplt_entry[] =
> {
> 0x3c0f0000, /* lui $15, %highest(.got.iplt entry) */
> 0x65ef0000, /* daddiu $15, $15, %higher(.got.iplt entry) */
> 0x000f7c38, /* dsll $15,$15, 16 */
> 0x65ef0000, /* daddiu $15, $15, %hi(.got.iplt entry) */
> 0x000f7c38, /* dsll $15,$15, 16 */
> 0x01f90000, /* l[wd] $25, %lo(.got.iplt entry)($15) */
> 0x03200008, /* jr $25 */
> 0x00000000, /* nop */
> };
>
> use:
>
> static const bfd_vma mips64_exec_iplt_entry[] =
> {
> 0x3c0f0000, /* lui $15, %highest(.got.iplt entry) */
> 0x3c0e0000, /* lui $14, %hi(.got.iplt entry) */
> 0x25ef0000, /* addiu $15, $15, %higher(.got.iplt entry) */
> 0x000f783c, /* dsll32 $15, $15, 0x0 */
> 0x01ee782d, /* daddu $15, $15, $14 */
> 0xddf90000, /* ld $25, %lo(.got.iplt entry)($15) */
> 0x03200008, /* jr $25 */
> 0x00000000, /* nop */
> };
>
> (this also avoids a DADDIU erratum early R4000/R4400 chips had).

Yeah, I wondered about this when I first saw it too, but Jack optimized
the sequence based on the address, so that we would only have the full
thing if %highest really was needed. Since the usual base address is
0x120000000, I think the full sequence will in effect never be used.

I'm not opposed to having two n64 sequences, one for when %highest
is needed and one for when it isn't. It just doesn't seem like a
priority.
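
To make the point about %highest concrete, the following C sketch splits a
64-bit address into its four 16-bit pieces using the usual MIPS rounding
rules for %highest, %higher, %hi and %lo. The function names are made up
for illustration, and 0x120010a48 in the note below is only an example
address near the usual base, not one taken from a real link.

```c
#include <stdint.h>

/* Each operator yields a 16-bit field; the added constants compensate
   for the sign extension of the lower 16-bit fields. */
static inline uint16_t mips_lo(uint64_t addr)
{
    return (uint16_t)(addr & 0xffff);
}

static inline uint16_t mips_hi(uint64_t addr)
{
    return (uint16_t)(((addr + 0x8000ULL) >> 16) & 0xffff);
}

static inline uint16_t mips_higher(uint64_t addr)
{
    return (uint16_t)(((addr + 0x80008000ULL) >> 32) & 0xffff);
}

static inline uint16_t mips_highest(uint64_t addr)
{
    return (uint16_t)(((addr + 0x800080008000ULL) >> 48) & 0xffff);
}

/* Hypothetical predicate: is the full sequence with %highest needed? */
static inline int needs_highest(uint64_t addr)
{
    return mips_highest(addr) != 0;
}

/* Rebuild the address from sign-extended fields, as the load sequence does. */
static inline uint64_t reassemble(uint16_t highest, uint16_t higher,
                                  uint16_t hi, uint16_t lo)
{
    uint64_t value = (uint64_t)(int64_t)(int16_t)highest << 48;

    value += (uint64_t)(int64_t)(int16_t)higher << 32;
    value += (uint64_t)(int64_t)(int16_t)hi << 16;
    value += (uint64_t)(int64_t)(int16_t)lo;
    return value;
}
```

For 0x120010a48 the %highest field is zero, so needs_highest() returns 0
and a sequence without the %highest load would be enough.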

Thanks,
Richard