i386 relocation, _GLOBAL_OFFSET_TABLE_ -fPIC
John Reiser
jreiser@BitWagon.com
Mon Jun 11 11:05:00 GMT 2001
Hi,
For position-independent code on i386 (gcc -fPIC), the compiler
convention is to have register %ebx contain &_GLOBAL_OFFSET_TABLE_,
with ld building the GOT. It seems to me that a new relocation type
or two would enable smaller, faster code, saving 6 or 7 bytes in each
subroutine that sets up %ebx.
Current code uses
    call L10
L10:
    pop %ebx
    add $_GLOBAL_OFFSET_TABLE_ - L10, %ebx
or for i686,
    call L10
L11:                    # return address, which L10 leaves in %ebx
    add $_GLOBAL_OFFSET_TABLE_ - L11, %ebx
with an out-of-line
L10:
    mov (%esp),%ebx
    ret
Why not just
    call put_GLOBAL_OFFSET_TABLE_into_ebx    # pc-relative
where that target is built at load time, as
put_GLOBAL_OFFSET_TABLE_into_ebx:
    movl $_GLOBAL_OFFSET_TABLE_, %ebx
    ret
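For reference, a rough tally of where the 6 or 7 bytes come from, assuming the standard IA-32 encodings with 32-bit immediates and displacements. The operands rel32 and imm32 are placeholders, not symbols from the sequences above, and the listing is a size count rather than something meant to be assembled as-is:

```asm
# current i386 sequence, inline in each subroutine
call rel32              # e8 + 4-byte displacement    5 bytes
pop %ebx                # 5b                          1 byte
add $imm32, %ebx        # 81 c3 + 4-byte immediate    6 bytes
                        # total                      12 bytes

# current i686 sequence, inline (the mov/ret thunk is shared)
call rel32              # e8 + 4-byte displacement    5 bytes
add $imm32, %ebx        # 81 c3 + 4-byte immediate    6 bytes
                        # total                      11 bytes

# proposed sequence, inline (the movl/ret subroutine is shared)
call rel32              # e8 + 4-byte displacement    5 bytes
                        # saving: 12 - 5 = 7, or 11 - 5 = 6

# the shared subroutine itself, emitted once
movl $imm32, %ebx       # bb + 4-byte immediate       5 bytes
ret                     # c3                          1 byte
```

The saving applies once per subroutine that loads %ebx, while the 6-byte shared subroutine is paid for only once per module.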
The linker ld can place this subroutine in the GOT itself.
There is no performance penalty as long as code and data
are segregated onto separate 32-byte cache lines. If the
PT_LOAD for data does not have PROT_EXEC, or if you insist
that all instructions be free from PROT_WRITE, then put the
generated subroutine at the end of the PT_LOAD for text
(on its own cache line), and incur a one-page penalty
for the relocation of a text page. If even that is not
allowed, then revert to the current scheme.
There are other tricks, too, that could save 10%-15% in time,
and make %ebx available for general uses, even in [semi-] -fPIC.
So, what is the climate for new R_386_* relocation types?
--
John Reiser, jreiser@BitWagon.com