Retrieving function addresses in an object file

Tue Apr 6 14:59:00 GMT 2010

Hi Fredric,

> I am in the process of creating a simple, custom object file format

Note that your email really only talks about the file format for 
executables.  You should also consider the file format for object files, 
archives and maybe shared libraries if you are going to support them.

> The question is how to accomplish this with ld and gcc?

Essentially this is all going to be handled in the linker, with very 
little participation by gcc.  (Probably the only thing that gcc will 
have to do is to mark a given symbol as being a function entry point).

> It would be very convenient if one could do something like the following in the source code:
>
> /*pseudo code*/
> function_start function_2 = "offset from start of object file to start of function_2"

Presumably you are talking about the source code of the loader here, yes?

> Is it possible to achieve something like this?

Yes, but it is going to take some work on your part.  Here is how I 
imagine it would work:

   Application source code:

     int bar (void) { return 0; }
     int main (void) { return bar (); }

   Compiled code (in pseudo-assembler):

       .global bar
       .type bar, function
     bar:
        move #0, reg0
        rts

        .global main
        .type main, function
     main:
        call bar
        rts

   Note how the .type pseudo-op is used to indicate which symbols are 
function names.

   The assembler will convert the compiled code into an object file.  I 
would expect there to be a relocation generated for the CALL 
instruction.  This would tell the linker that at this point in the 
executable the program wants to call the function BAR.  The linker 
might either insert real instructions that jump into the loader along 
with a parameter that is the numeric index of BAR in the function 
lookup table, or else the relocation could be left in place and the 
loader could detect it at load-time and modify the CALL instruction to 
go directly to the BAR function.  (This depends upon whether you want 
the code to remain position independent whilst it is running or not).
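
   To make the second option a bit more concrete, here is a rough sketch 
in C of a loader patching CALL operands at load-time.  Everything in it 
is an assumption made for illustration: the record layouts (func_entry, 
call_reloc), the 32-bit absolute operand, the offsets and the load 
address 0x4000 are all made up, and a real format would need to match 
whatever your target's CALL instruction actually looks like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical records; names and layout are assumptions for illustration. */
struct func_entry { const char *name; uint32_t offset; };   /* offset from start of image */
struct call_reloc { uint32_t site; uint32_t func_index; };  /* site: offset of the CALL operand */

/* Overwrite each CALL operand with load_base + the target function's offset. */
static void apply_call_relocs(uint8_t *image, size_t image_size, uint32_t load_base,
                              const struct func_entry *funcs, size_t nfuncs,
                              const struct call_reloc *relocs, size_t nrelocs)
{
    for (size_t i = 0; i < nrelocs; i++) {
        assert(relocs[i].func_index < nfuncs);
        assert(relocs[i].site + sizeof(uint32_t) <= image_size);
        uint32_t target = load_base + funcs[relocs[i].func_index].offset;
        memcpy(image + relocs[i].site, &target, sizeof target);
    }
}

/* Demo with made-up numbers: bar at offset 0, main at offset 8,
   and the operand of main's CALL at offset 10. */
static uint32_t demo_patch(void)
{
    uint8_t image[16] = {0};
    const struct func_entry funcs[] = { { "bar", 0 }, { "main", 8 } };
    const struct call_reloc relocs[] = { { 10, 0 } };
    uint32_t patched;

    apply_call_relocs(image, sizeof image, 0x4000, funcs, 2, relocs, 1);
    memcpy(&patched, image + 10, sizeof patched);
    return patched;   /* load_base + offset of bar */
}
```

   Once the operands have been patched like this the code is tied to its 
load address, which is the trade-off mentioned above.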

Of course this leaves the problem of what to do with function pointers.
Essentially there are two possibilities.  If you want position 
independence whilst the code is running, then function pointers will 
have to actually be indices into the function lookup table.  Whereas if 
you only want position independence up to the point where execution 
starts then function pointers can be the real address of the functions, 
and the loader would have to patch every place where the address of a 
function is computed.
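
The first possibility could look roughly like this in C.  This is only a 
sketch under assumptions: the table, the index type, the helper names 
and the second function baz are hypothetical, and in a real system the 
table would be filled in by the loader rather than initialised 
statically.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(void);

/* Hypothetical indices that the linker would assign to each function. */
enum { FUNC_BAR = 0, FUNC_BAZ = 1, FUNC_COUNT };

static int bar(void) { return 0; }
static int baz(void) { return 7; }   /* hypothetical second function */

/* In a real loader this table would be populated at load time. */
static func_t function_table[FUNC_COUNT] = { bar, baz };

/* A position independent "function pointer" is just an index into the table. */
typedef unsigned func_index;

/* Every indirect call goes through the table, so the code itself never
   holds a real address. */
static int call_by_index(func_index idx)
{
    assert(idx < (func_index) FUNC_COUNT);
    return function_table[idx]();
}

/* Example of code that receives a "function pointer" and calls through it. */
static int call_twice_and_sum(func_index fp)
{
    return call_by_index(fp) + call_by_index(fp);
}
```

The cost is an extra table lookup on every indirect call, but only the 
table needs updating if the code moves.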

Cheers
   Nick