PowerPC64: why do we need .branch_lt for long branch thunks

Mon Jan 27 00:42:00 GMT 2020

I noticed that PowerPC32 uses

   lis 12, 512       # @ha
   addi 12, 12, 8200 # @l
   mtctr 12
   bctr

for -no-pie long branch thunks (jumping to a non-preemptible symbol with a distance>=0x2000000), and 

   mflr 0
   bcl 20, 31, .+4
   mflr 12
   addis 12, 12, 512 # @ha
   addi 12, 12, -24  # @l
   mtlr 0
   mtctr 12
   bctr

for -pie/-shared long branch thunks. On PowerPC64, why do we use the
"load an address from .branch_lt and jump" approach? Can we do something
similar to PowerPC32 and avoid .branch_lt?
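
For reference, here is a small sketch of how the immediates in the two thunks above fall out of the @ha/@l split. The target address 0x2002008 and the offset 0x1ffffe8 are hypothetical values chosen only because they reproduce the constants shown (512/8200 and 512/-24); the helper names are made up for this illustration.

```python
def sext16(v):
    """Interpret a 16-bit field as the signed immediate addi/addis sees."""
    return v - 0x10000 if v & 0x8000 else v

def ha(x):
    """@ha: high 16 bits, adjusted by 0x8000 because @l is sign-extended."""
    return ((x + 0x8000) >> 16) & 0xFFFF

def lo(x):
    """@l: low 16 bits."""
    return x & 0xFFFF

# -no-pie thunk: absolute address of the target (hypothetical value).
target = 0x2002008
nopie = (sext16(ha(target)), sext16(lo(target)))   # lis / addi immediates

# -pie/-shared thunk: offset from the address that bcl leaves in LR
# (hypothetical value).
offset = 0x1FFFFE8
pie = (sext16(ha(offset)), sext16(lo(offset)))     # addis / addi immediates

# addis contributes imm << 16, addi adds the signed low half.
rebuilt = (pie[0] << 16) + pie[1]
```

The point of the +0x8000 adjustment is that a negative @l (such as -24) is compensated by a high part that is one larger.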

I think a pair of @ha and @l is sufficient for most use cases. If the
offset is outside [-0x80008000,0x7fff8000), we can add a @highera.
(I don't know when @highesta will be needed but it is straightforward to
add it.)
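
The range quoted above can be checked with a few lines of arithmetic. This is only a sketch: it models addis+addi as two signed 16-bit immediates and ignores everything else about the thunk; the function names are invented for the illustration.

```python
def sext16(v):
    """Interpret a 16-bit field as a signed immediate."""
    return v - 0x10000 if v & 0x8000 else v

def reach_ha_lo():
    """Smallest and largest offsets reachable with addis (@ha) + addi (@l)."""
    imm_min, imm_max = -0x8000, 0x7FFF
    return (imm_min << 16) + imm_min, (imm_max << 16) + imm_max

def encodes(offset):
    """True if the @ha/@l pair alone reproduces the offset."""
    hi = sext16(((offset + 0x8000) >> 16) & 0xFFFF)
    low = sext16(offset & 0xFFFF)
    return (hi << 16) + low == offset

lowest, highest = reach_ha_lo()
```

Here `lowest` is -0x80008000 and `highest` is 0x7fff7fff, i.e. the half-open interval [-0x80008000,0x7fff8000). An offset outside it makes `encodes` return False, which is the case where a @highera part would be needed.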

(https://gcc.gnu.org/onlinedocs/gcc/RS_002f6000-and-PowerPC-Options.html
  does not say how large the text segment can be in the medium code model.)

I am not sure whether loading an address from .branch_lt can be faster
than @highesta+@highera+@ha+@l, but a long branch thunk should never be
in a performance-critical code path.