[x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX

Thu Jul 25 17:24:00 GMT 2013

On Wed, Jul 24, 2013 at 5:23 PM, Ian Lance Taylor <iant@google.com> wrote:
>> * The foo@plt pseudo-symbols that e.g. objdump will display are based on
>>   the BFD backend knowing the size of PLT entries.  Arguably this ought
>>   to look at sh_entsize of .plt instead of using baked-in knowledge, but
>>   it doesn't.
>
> This seems fixable.  Of course, we could also keep the PLT the same
> length by changing it.  The current PLT entries are
>
>     jmpq *GOT(sym)
>     pushq offset
>     jmpq plt0
>
> The linker or dynamic linker initializes *GOT(sym) to point to the
> second instruction in this sequence.  So we can keep the PLT at 16
> bytes by simply changing it to jump somewhere else.
>
>     bnd jmpq *GOT(sym)
>     .skip 9
>
> We have the linker or dynamic linker fill in *GOT(sym) to point to the
> second PLT table.  When the dynamic linker is involved, we use another
> DT tag to point to the second PLT.  The offsets are consistent: there
> is one entry in each PLT table, so the dynamic linker can compute the
> right value.  Then in the second PLT we have the sequence
>
>     pushq offset
>     bnd jmpq plt0
>
> That gives the dynamic linker the offset that it needs to update
> *GOT(sym) to point to the runtime symbol value.  So we get slightly
> worse instruction cache handling the first time a function is called,
> but after that we are the same as before.  And PLT entries are the
> same size as always so everything is simpler.
>
> The special DT tag will tell the dynamic linker to apply the special
> processing.  No attribute is needed to change behaviour.  The issue
> then is: a program linked in this way will not work with an old
> dynamic linker, because the old dynamic linker will not initialize
> GOT(sym) to the right value.  That is a problem for any scheme, so I
> think that is OK.  But if that is a concern, we could actually handle
> by generating two PLTs.  One conventional PLT, and another as I just
> outlined.  The linker branches to the new PLT, and initializes
> GOT(sym) to point to the old PLT.  The dynamic linker spots this
> because it recognizes the new DT tags, and cunningly rewrites the GOT
> to point to the new PLT.  Cost is an extra jump the first time a
> function is called when using the old dynamic linker.
>

I don't like the complexity.  I believe extending PLT entry to
32 byte works with the old ld.so.  If we are willing to have
mixed PLT entry, we merge 2 16-byte PLT entries into one
super 32-byte PLT entry so that we can have

jmpq   *name@GOTPCREL(%rip)
pushq  $index
jmpq   PLT0
bnd jmpq   *name@GOTPCREL(%rip)
pushq  $index
bnd jmpq   PLT0
nop paddings
jmpq   *name@GOTPCREL(%rip)
pushq  $index
jmpq   PLT0

We can also have new link-time relocations for branches with BND
prefix and only create the super PLT entries when needed.  Of course,.
unwind info may be incorrect for both approach if we don't find a way
to fix it.

--
H.J.