Library to assemble from memory

Daniel Diaz Daniel.Diaz@univ-paris1.fr
Mon Apr 29 10:36:00 GMT 2002


Hi everybody,

I'm new to this list, so please excuse me if I post an already answered
question (but I didn't find anything in the archives).

I'm the author of GNU Prolog and I use (g)as as the final back-end for my
native compilation scheme for Prolog programs. So, from a Prolog source
I generate a .s file which is assembled with gas.

The main problem is that all intermediate files are large, particularly
the asm file (e.g. when Prolog abstract instructions are inlined for
efficiency or when compiling databases), while the object finally
produced is not so big (2 MB for instance). It appears that a lot of
time is wasted in disk I/O. I'm reorganizing my compiler to avoid my
intermediate files (keeping them in memory instead), but I cannot do the
same for the .s file because gas is a command-line tool that accepts a
file. I would prefer to avoid playing with pipes for portability reasons
(under native win32, argh!) and I would prefer to generate the .o from
assembly source stored in memory (as a string for instance).

So it would be very useful to have a library to assemble from memory. In
order to speed up the parsing (tokenizer) it would be interesting to
have a way to inform gas that the input is "simple" (e.g. only contains
1 separator character between tokens, ...). The library should allow the
user to assemble by portions (e.g. the assembly code of a function,
later the asm code of another function, ...). This would allow the user
to carefully tune his memory consumption, freeing an already assembled
region (i.e. string in memory).
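
To illustrate why "simple" input could speed up tokenizing, here is a
minimal sketch in C. It is not gas code: the function name
split_simple_line and its signature are made up for this example, and it
assumes exactly one separator character between tokens, with no leading,
trailing or repeated separators.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fast path for "simple" input: splits `line` in place at
   each occurrence of the single separator character `sep`.
   Pointers to the tokens are stored in `tokens`; the return value is
   the number of tokens stored (at most `max_tokens`). */
size_t split_simple_line(char *line, char sep, char *tokens[],
                         size_t max_tokens)
{
    assert(line != NULL && tokens != NULL);

    size_t n = 0;
    char *p = line;
    while (n < max_tokens && *p != '\0') {
        tokens[n++] = p;              /* a token starts here */
        char *next = strchr(p, sep);  /* find the one separator after it */
        if (next == NULL)
            break;                    /* last token on the line */
        *next = '\0';                 /* terminate the token in place */
        p = next + 1;                 /* next token starts right after */
    }
    return n;
}
```

With such a guarantee the tokenizer never has to skip runs of blanks or
copy tokens: one strchr call per token is enough.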

I'm sure this would be VERY useful for many compiler developers.

We can think about an API; here is a proposal:

int gas_start_assembler(char *output_path_name, char *options);
   This prepares the assembler, opening output_path_name for writing.
   options contains the command-line options (machine-dependent and
   others) plus an option specifying that the input is simple.
   The function returns a descriptor which identifies this assembler,
   or -1 on error (+ errno). This descriptor makes it possible to use
   several "assemblers" in the same program (this means that all
   global variables should be stored in a malloc'ed region referenced
   by the descriptor).
   This facility could be turned off if only one assembler is enough;
   for the rest I suppose we have this facility (else remove the
   as_desc argument).

int gas_assemble(int as_desc, char *asm_data);
   This assembles the content of asm_data into the output file, using
   the assembler identified by as_desc.
   The return value is an error code.

int gas_stop_assembler(int as_desc);
   This finishes the assembly process associated with as_desc
   (closes the file, frees some memory regions, ...).
   The return value is an error code.
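
To make the proposal more concrete, here is a minimal sketch in C of how
the three calls could fit together. Everything in it is an assumption:
no such library exists, the "--simple-input" option name is invented,
the const qualifiers are my addition, and gas_assemble is only a
placeholder that counts the bytes it receives instead of assembling
them. It only shows the descriptor bookkeeping and the
assemble-by-portions usage pattern.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define GAS_MAX_DESC 8  /* arbitrary limit chosen for this sketch */

/* Hypothetical per-assembler state: what gas keeps in global variables
   would live in a malloc'ed block like this one. */
struct gas_state {
    FILE *out;          /* output file opened by gas_start_assembler */
    int simple_input;   /* set when the made-up "--simple-input" is given */
    size_t bytes_seen;  /* amount of source text received so far */
};

static struct gas_state *gas_table[GAS_MAX_DESC];

/* Maps a descriptor to its state, or NULL if the descriptor is invalid. */
static struct gas_state *gas_lookup(int as_desc)
{
    if (as_desc < 0 || as_desc >= GAS_MAX_DESC || gas_table[as_desc] == NULL)
        return NULL;
    assert(gas_table[as_desc]->out != NULL);  /* live states own a file */
    return gas_table[as_desc];
}

int gas_start_assembler(const char *output_path_name, const char *options)
{
    for (int d = 0; d < GAS_MAX_DESC; d++) {
        if (gas_table[d] != NULL)
            continue;                  /* descriptor already in use */
        struct gas_state *st = calloc(1, sizeof *st);
        if (st == NULL)
            return -1;
        st->out = fopen(output_path_name, "wb");
        if (st->out == NULL) {
            free(st);
            return -1;
        }
        st->simple_input = options != NULL
                           && strstr(options, "--simple-input") != NULL;
        gas_table[d] = st;
        return d;
    }
    return -1;  /* no free descriptor */
}

int gas_assemble(int as_desc, const char *asm_data)
{
    struct gas_state *st = gas_lookup(as_desc);
    if (st == NULL || asm_data == NULL)
        return 1;
    /* Placeholder: a real library would parse and assemble asm_data here. */
    st->bytes_seen += strlen(asm_data);
    return 0;
}

int gas_stop_assembler(int as_desc)
{
    struct gas_state *st = gas_lookup(as_desc);
    if (st == NULL)
        return 1;
    int rc = fclose(st->out) == 0 ? 0 : 1;
    free(st);
    gas_table[as_desc] = NULL;  /* the descriptor can be reused */
    return rc;
}

/* Caller side: assemble two functions piece by piece, freeing each
   source buffer as soon as it has been handed over. */
int emit_functions_demo(void)
{
    static const char *const parts[] = { "f:\n\tret\n", "g:\n\tret\n" };
    const char *path = "gas_sketch_demo.o";
    int d = gas_start_assembler(path, "--simple-input");
    if (d < 0)
        return -1;
    int failed = 0;
    for (size_t i = 0; i < sizeof parts / sizeof parts[0] && !failed; i++) {
        char *chunk = malloc(strlen(parts[i]) + 1);
        if (chunk == NULL) { failed = 1; break; }
        strcpy(chunk, parts[i]);
        failed = gas_assemble(d, chunk) != 0;
        free(chunk);  /* this region is no longer needed */
    }
    if (gas_stop_assembler(d) != 0)
        failed = 1;
    remove(path);  /* the stub writes nothing useful, so clean up */
    return failed ? -1 : 0;
}
```

The point of the descriptor table is that no state is shared between two
assemblers, so a compiler could keep several output files open at once.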

The library could be provided as a libgas.a and/or libgas.so.
  
Obviously, gas itself could be rewritten in a few lines using the
library...

Reactions?


