vtrelocs: large/modular C++ app speedup ...

Michael Meeks michael.meeks@novell.com
Wed Apr 2 11:31:00 GMT 2008

Hi guys,

	I spent a little time recently researching ways to reduce the number of
unique named relocations that must be processed at dlopen time for large
C++ libraries[1]. Apologies for spamming all 3 lists like this, but it
touches all 3 projects.

	Since almost all function relocations of this type are inside vtables,
I implemented a new way of relocating vtables. This is a new
'.suse.vtrelocs' section.

	As we inherit a class across a shared library boundary we construct new
vtables that are often extremely similar to their parents. However -
this similarity is not exposed - instead we fill the new vtable with
many unique named relocations, one per method. This generates lots
of .rel entries, and emits lots of external symbols; worse these symbols
tend to be duplicated across ~all libraries deriving from the base.
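
	To make the similarity concrete, here is a minimal sketch; the class
names are hypothetical and not taken from the patches. Derived overrides one
of three methods, so two of its vtable slots would simply repeat what the
parent's vtable already holds:

```cpp
#include <cassert>

// Hypothetical classes, for illustration only. Imagine Base is defined
// (out of line) in libbase.so and Derived lives in another library; the
// bodies are inline here only to keep the sketch self-contained.
struct Base {
    virtual ~Base() {}
    virtual int width() const  { return 1; }
    virtual int height() const { return 2; }
    virtual int depth() const  { return 3; }
};

// Only depth() is overridden. The vtable for Derived still carries a
// slot for width() and height(), and across a library boundary each of
// those inherited slots is filled via its own named relocation.
struct Derived : Base {
    int depth() const override { return 30; }
};

// Dispatches through the vtable of whatever object is passed in.
inline int call_depth(const Base &b) { return b.depth(); }

// Small self-check: inherited slots behave like the parent's,
// the overridden slot does not.
inline void demo() {
    Derived d;
    assert(d.width() == 1);
    assert(d.height() == 2);
    assert(call_depth(d) == 30);
}
```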

	Instead, a vtreloc section contains a (sorted) array of:

struct {
	void **src, **dest;
	int  copy_slot_bitmask;
} vtreloc_entries[] = { ... }

	The run-time cost of processing these is insignificant in comparison to
the cost of processing the remaining relocations, giving a pleasant
speed win.
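
	As a rough sketch of why processing an entry is cheap: the code below
assumes that bit i of copy_slot_bitmask means "copy slot i from src to dest",
and that entries are applied in array order. Both are assumptions made for
illustration, as are the names VtRelocEntry, apply_vtreloc and
apply_vtrelocs; the attached patches may well differ.

```cpp
#include <cassert>
#include <cstddef>

// Same shape as the entry described in the mail; the type name is
// hypothetical.
struct VtRelocEntry {
    void **src;             // assumed: slots of the already-relocated parent vtable
    void **dest;            // assumed: slots of the derived vtable to fill in
    int copy_slot_bitmask;  // assumed: bit i set => copy slot i
};

// Copy every slot whose bit is set from src to dest. No symbol lookup
// is involved, only pointer copies.
inline void apply_vtreloc(const VtRelocEntry &e) {
    unsigned mask = static_cast<unsigned>(e.copy_slot_bitmask);
    for (std::size_t slot = 0; mask != 0; ++slot, mask >>= 1) {
        if (mask & 1u)
            e.dest[slot] = e.src[slot];
    }
}

// Apply entries in array order (assumed to put parents before children).
inline void apply_vtrelocs(const VtRelocEntry *entries, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i)
        apply_vtreloc(entries[i]);
}
```

	With a mask of 0x5, for example, slots 0 and 2 are copied from the
parent and slot 1 (an overridden method) is left alone.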

	A brief slide-deck with the results of my research is here:


	and has a comparison against the current state of the art wrt. reducing
relocations: -Bsymbolic-functions [ in itself a substantial
optimisation ].

	The 3 prototype patches for discussion are attached. There are a number
of trivial hacks in there (of course) - eg. environment variables to
turn the feature on, leaving an empty .vtrelocs section in object files.

	The more interesting problems are:

	* glibc - the memory protection semantics need adjusting, since
	  we need to fix up relocations in 'init' order: shouldn't be
	  impossibly hard to fix but I just turn off protection ;-)
		+ subsequent dlopens can (I think) avoid touching
		  already relocated libraries they don't own, avoiding
		  this sort of problem.

	* gcc - the code to generate the vtreloc sections is <cough> 
	  written for comfort not speed. This is a fall-back from having
	  initially tried to integrate the work into 
	  build_vtbl_initializer & friends with some success, but rather
	  a tangling of the code.

	* vtreloc section design - the section should be read-only, and
	  probably refer by offset to .bss relocations that can be re-used
	  for implementing indirect calls via the parent vtable to virtual
	  functions. That should save relocs, but make each entry
	  slightly larger.

	Of course, apart from the run-time speed wins, some of the nicest
potential size wins come from breaking the ABI[2] & depending on the
vtrelocs to fix up vtables: e.g. hiding all thunks (implemented), or
potentially hiding all virtual function symbols & invoking them via
their parent vtable (not implemented).

	Wrt. testing, I can build & run an OO.o built with this - clearly not a
unit-test ;-) but perhaps helpful.

	Feedback much appreciated,



[1] - specifically OpenOffice.org ;-)
[2] - which while bad, can be done in isolated islands like OO.o.
 michael.meeks@novell.com  <><, Pseudo Engineer, itinerant idiot

-------------- next part --------------
A non-text attachment was scrubbed...
Name: suse-vtrelocs-binutils.diff
Type: text/x-patch
Size: 4095 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20080402/ab90cf60/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: suse-vtrelocs-gcc.diff
Type: text/x-patch
Size: 27550 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20080402/ab90cf60/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: suse-vtrelocs-glibc.diff
Type: text/x-patch
Size: 9918 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/binutils/attachments/20080402/ab90cf60/attachment-0002.bin>
