This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Proposal for a Guile binary file format


There has been talk about the efficiency of Guile.  A while ago I
posted benchmark figures which show that for a reasonably normal
program which uses integer arithmetic, only SCM (and, not shown in the
benchmark, MzScheme) beats Guile among the wellknown interpreters.

I didn't measure floating point arithmetic in Guile, but suspect it's
slow, since each double us stored in an individual malloc block,
meaning that *every* number, in-between results in computations and
everything, causes malloc and free calls.  It is possible to introduce
4-word cells so that doubles can be stored on the heap instead.  I
expect such a change to bring Guile floating point arithmetic up to
approximately the same speed as its integer arithmetic.

The message is that while Guile could be faster, it's not
inefficient.  (With exception of float arithmetic as explained above.)

I suspect (but don't know) that much of the startup time of Scwm
consists of parsing files and allocating objects.  Here's a proposal
for a Guile binary format which would speed up that part.

It can (I hope) be used at two levels:

1. Storing the internal representation for Guile source.
2. Storing memoized source.

The format is probably useful already for level 1.  Level 2 would gain
even more speed, but would probably introduce more dependencies.  (The
format spec would have to be extended somewhat to accomodate level 2.
I'm thinking of patch entries for ilocs, glocs etc.)

Unfortunately I won't be able to participate in a discussion around
the proposal, and will even less be able to implement it.  If there
are some explicit questions about unclear things, I will answer them,
but else it's kind of a "take it or leave it" thing.  I'm actually not
even sure that it would work.  It could be flawed.


Proposal for a Guile binary file format
=======================================

Last modified: 991117

A Guile binary file consists of an arbitrary number of module chunks.
The format of a module chunk is optimized for fast loading of a
module.  Normally one file corresponds to one module, but the format
is designed so that several modules can be stored in one file and that
a new Guile binary file can be constructed by simply concatenating two
others (just as is possible with Guile source files).

The format *is* architecture dependent, so sources files always have
to be recompiled on a new architecture.  Rationale: Since a major part
of Guile binaries consists of list structures the conversion between
big and little endian pointers would probably entail substantial
overhead.

Format
------

A module chunk consists of 8 blocks:

1. Header
2. Permanent heap block
3. Permanent malloc block
4. Standard heap block
5. Standard malloc block
6. Symbol table
7. Patch table
8. Entry point

The rationale for the "permanent" and "standard" classes is that
"permanent" memory can avoid repeated consing or calls to malloc while
"standard" memory can be GC:d.  "permanent" is good for source code
while "standard" is good for expressions.

Phases of load
--------------

Loading of a chunk has 6 phases:

I. Read header

This will give information of the sizes of the
rest of the blocks.

II. Read blocks 2 and 3

It's possible to allocate one single permament block of malloced
memort which will contain blocks 2 and 3.  This block is freed first
when the module is unloaded.  (Unloading of a module is not supported now.)

III. Read blocks 4-7

Blocks 4-7 will be read into a single temporary block of malloced
memory which will be freed after the load operation.  This is because
information in blocks 4 and 5 will be copied to other places and block
6 have meaning only during the link phase after the last read
operation.

IV. Intern symbols

Intern all symbols in the symbol table and store them in a vector.

V. Relocate

Relocate pointers and patch in symbols.  During this phase, contents
of block 4 will also be relocated into the real heap, and mallocs from
block 5 will be allocated and initialized.

VI. Initialize module

Evaluate the expression obtained by consing 'begin onto the list
pointed to by the entry point.

Format revisited
----------------

1. Header

struct scm_binfmt_header {
  unsigned long cookie;
  short version;		/* version and revision of the binary format */
  short revision;
  size_t perm_heap_size;
  size_t perm_malloc_size;
  size_t standard_heap_size;
  size_t standard_malloc_size;
  size_t symbol_table_size
  size_t patch_table_size;
};

2. Permanent heap block

The entire memory block containing chunk blocks 2-3 will be wrapped as
a smob and block 2 will be scanned and all non-immediates marked
during GC.

3. Permanent malloc block

Contains all malloced data of objects in block 2.  It is important
that every such piece of data is padded at the end so that each malloc
is word alligned.  Otherwise the patch process won't work correctly,
and some architectures would get into trouble.

This block will be the non-marked tail of the smob mentioned under 2.

4. Standard heap block

An array of heap cells which are copied to the real heap.  When
copying a cell, a "broken heart", that is the address of the new cell,
is stored in the CAR of the old cell so that the relocation process
can find it.

5. Standard malloc block

An array of malloc entries:

struct scm_malloc_entry {
  unsigned long size = SIZE;
  unsigned char data[SIZE];
};

When copying a malloc entry, a "broken heart", that is the real
address of the new memory block for this entry is stored at the start
of the malloc entry so that the relocation process can find it.

6. Symbol table

An array of malloc entries where the data is the symbol names.

7. Patch table

An array of path entries.  A patch entry is an unsigned long which is
simply a byte offset of the pointer to be patched.  It is relative to
block 2 or block 4.

The patching process is supposed to restore pointers to their "real"
value. For offsets into blocks 2-3 this means adding the base address
of block 2 to the pointer.  For offsets into blocks 4-5 it means
replacing the old pointer value with the broken heart pointed to by
the pointer.

For a symbol entry, the old pointer is an offset into the symbol
table.  Load phase IV will have replaced the symbol entry at that
location with the real pointer value.

The patch table has the following sections:

1. Header with 3 ints indicating number of entries in each of patch
   table sections 2-4.
2. Relocations of permanent memory
3. Relocations of standard memory
4. Symbols

8. Entry point

This is simply a patch entry of the same format as in the patch table.
(Probably a relocation entry for standard memory.)

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]