i386 relocation, _GLOBAL_OFFSET_TABLE_ -fPIC

John Reiser jreiser@BitWagon.com
Mon Jun 11 11:05:00 GMT 2001


Hi,

For position-independent code on i386 (gcc -fPIC), the compiler
convention is to have register %ebx contain &_GLOBAL_OFFSET_TABLE_,
with ld building the GOT.  It seems to me that a new relocation type
or two would enable smaller, faster code, saving 6 or 7 bytes per
subroutine.

Current code uses
	call L10
L10:
	pop %ebx
	add $_GLOBAL_OFFSET_TABLE_ - L10, %ebx

or for i686,
	call L10
L11:
	add $_GLOBAL_OFFSET_TABLE_ - L11, %ebx

with an out-of-line
L10:
	mov (%esp),%ebx
	ret
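
The byte counts behind the 6-or-7-byte estimate can be tallied from the
standard i386 encodings.  The sketch below is only an illustration in C;
the array and function names are made up for this example, and the zero
bytes stand in for the 32-bit fields that the assembler or linker fills in.

```c
#include <stddef.h>

/* i386 opcode encodings; zero bytes are placeholders for 32-bit fields. */
const unsigned char call_rel32[]    = {0xE8, 0, 0, 0, 0};       /* call L10           */
const unsigned char pop_ebx[]       = {0x5B};                   /* pop %ebx           */
const unsigned char add_imm32_ebx[] = {0x81, 0xC3, 0, 0, 0, 0}; /* add $imm32,%ebx    */
const unsigned char mov_esp_ebx[]   = {0x8B, 0x1C, 0x24};       /* mov (%esp),%ebx    */
const unsigned char ret_near[]      = {0xC3};                   /* ret                */

/* Inline bytes per subroutine for the call/pop/add sequence. */
size_t classic_inline_bytes(void)
{
    return sizeof call_rel32 + sizeof pop_ebx + sizeof add_imm32_ebx;
}

/* Inline bytes per subroutine for the i686 call/add sequence. */
size_t i686_inline_bytes(void)
{
    return sizeof call_rel32 + sizeof add_imm32_ebx;
}

/* Bytes in the out-of-line helper, shared by all callers. */
size_t i686_helper_bytes(void)
{
    return sizeof mov_esp_ebx + sizeof ret_near;
}

/* Bytes for a pc-relative call with nothing after it. */
size_t lone_call_bytes(void)
{
    return sizeof call_rel32;
}
```

The first sequence comes to 12 bytes inline and the i686 variant to 11
(plus a 4-byte helper shared by all callers).  A call by itself is 5
bytes, so a scheme that needs only the call would save 7 or 6 bytes
per subroutine.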


Why not just
	call put_GLOBAL_OFFSET_TABLE_into_ebx  # pc-relative

where that target is built at load time, as

put_GLOBAL_OFFSET_TABLE_into_ebx:
	movl $_GLOBAL_OFFSET_TABLE_, %ebx
	ret
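
To make the load-time step concrete, here is a rough sketch in C of the
six bytes that would have to end up at that target once the address of
the GOT is known.  The function name build_got_thunk and the example
address are hypothetical; nothing here is an existing ld or ld.so
interface, it only shows the encoding.

```c
#include <stdint.h>

enum { GOT_THUNK_LEN = 6 };  /* 5-byte movl + 1-byte ret */

/* Hypothetical helper: write "movl $got_addr,%ebx ; ret" into out[]. */
void build_got_thunk(uint8_t out[GOT_THUNK_LEN], uint32_t got_addr)
{
    out[0] = 0xBB;                               /* movl $imm32,%ebx (0xB8 + 3 for %ebx) */
    out[1] = (uint8_t)(got_addr & 0xFF);         /* imm32 is stored little-endian */
    out[2] = (uint8_t)((got_addr >> 8) & 0xFF);
    out[3] = (uint8_t)((got_addr >> 16) & 0xFF);
    out[4] = (uint8_t)((got_addr >> 24) & 0xFF);
    out[5] = 0xC3;                               /* ret */
}
```

Only the four immediate bytes depend on where the object is loaded, so
that is the one field a relocation would need to patch.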

The linker ld can put the subroutine into the GOT itself.
There is no performance penalty as long as code and data
are segregated onto separate 32-byte cache lines. If the
PT_LOAD for data does not have PROT_EXEC, or if you insist
that all instructions be free from PROT_WRITE, then put the
generated subroutine at the end of the PT_LOAD for text
(on its own cache line), and incur a one-page penalty
for the relocation of a text page.  If even that is not
allowed, then revert to the current scheme.
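
The "own cache line" condition above is just address arithmetic.  A small
sketch in C, assuming the 32-byte line size mentioned in the text; the
helper names are invented for illustration and the addresses in the note
below are arbitrary.

```c
#include <stdint.h>

enum { CACHE_LINE = 32 };  /* line size assumed in the text above */

/* Index of the cache line that contains addr. */
uint32_t cache_line_of(uint32_t addr)
{
    return addr / CACHE_LINE;
}

/* Round addr up to the next cache-line boundary (no change if aligned). */
uint32_t align_to_cache_line(uint32_t addr)
{
    return (addr + CACHE_LINE - 1) & ~(uint32_t)(CACHE_LINE - 1);
}

/* Nonzero if [a, a+a_len) and [b, b+b_len) touch a common cache line.
   Both lengths must be nonzero. */
int share_cache_line(uint32_t a, uint32_t a_len, uint32_t b, uint32_t b_len)
{
    uint32_t a_first = cache_line_of(a);
    uint32_t a_last  = cache_line_of(a + a_len - 1);
    uint32_t b_first = cache_line_of(b);
    uint32_t b_last  = cache_line_of(b + b_len - 1);
    return a_first <= b_last && b_first <= a_last;
}
```

For example, a 6-byte subroutine at 0x101A ends at 0x101F and does not
share a line with a data word at 0x1020, while the same subroutine at
0x101C would spill into that word's line.  Placing the subroutine at
align_to_cache_line(end_of_text) and padding the rest of its line
avoids the overlap.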

There are other tricks, too, that could save 10%-15% in time,
and make %ebx available for general uses, even in [semi-] -fPIC.
So, what is the climate for new R_386_* relocation types?

-- 
John Reiser, jreiser@BitWagon.com


