This is the mail archive of the guile@cygnus.com mailing list for the Guile project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Re: `undefine'

To: Jost Boekemeier <jostobfe@calvados.zrz.TU-Berlin.DE>
Subject: Re: `undefine'
From: Mikael Djurfeldt <mdj@nada.kth.se>
Date: 12 May 1999 21:08:48 +0200
Cc: guile@cygnus.com
Cc: mdj@nada.kth.se
References: <199905120936.JAA08749@linux.zrz.TU-Berlin.DE>
Jost Boekemeier <jostobfe@calvados.zrz.TU-Berlin.DE> writes:

> does someone know a scheme implementation that supports 
> `undefine'?  `undefine' and `environment-undefine' should remove
> a binding completely but I have no idea how this could be done.

This can't be fixed without the unmemoization protocol.  If you want
to understand why, it may be interesting to study how the evaluator
replaces references to bindings with ilocs and glocs.

> Mikael said that he thinks there is a proposal of an "unmemoization
> protocol" somewhere.  Where do I find this proposal?

Here (many things are not up-to-date):

\input texinfo
@c %**start of header
@setfilename internals.info
@settitle Spaces
@setchapternewpage off
@c %**end of header

@c Most of the operations here are defined as "Primitives" (in other
@c words, they're written in C), but I think some of them could be coded in
@c Scheme, if we wanted to, without much of a performance hit.  Rename
@c those which are not too important to provide at the C level "Functions."

[[allow some bindings to be exported read/write]]

[["name" spaces, so we can reload spaces and have the spaces which refer
to them automatically see things.  That makes "spaces by name" into a
coherent chapter, too. Modules proper are a subclass of named spaces,
that have other slots about syntax, etc.]]

[[export spaces consult local spaces to decide what to export?  Better
match with SCSH module system.]]

[[per-binding GLOC lists?  Resolve other memoization issues first.]]

[[caches in eval spaces]]

@node Spaces
@chapter Spaces

Guile represents source code modules using first-class objects called
@dfn{spaces}; a space object represents an environment in which one can
look up bindings for symbols, and possibly create new bindings.  (The
name comes from the term ``name space''.)  Guile's @code{eval} function
evaluates its first argument, an s-expression representing code to
evaluate, in the environment described by its second argument, a space.

Spaces map symbols onto bindings.  There is a natural correspondence
between identifiers in Scheme programs and Scheme symbol objects: an
identifier corresponds with the symbol of the same name.  To find the
binding for a top-level variable, Guile looks up the corresponding
symbol in a space object which represents the top-level environment.

Guile provides functions to inspect spaces, create, modify, and delete
bindings, and to combine spaces in various ways.  The user can easily
invoke these operations from both Scheme and C code.

Guile Scheme differs from Common Lisp and Emacs Lisp in that symbol
objects do not have values of their own.  Each space maps symbols onto
bindings; different spaces may bind a given symbol to different values;
a symbol may be bound in one space and unbound in another.  This means
that the question ``What is the value of this symbol?'' is not really
meaningful on its own; we need to know which space provides the context
for the question.

This chapter describes spaces, and the functions that operate on them.
It does not explain how to define and use modules; those topics are
covered in [[what???]].

@menu
* Simple Space Operations::     Inspecting and modifying a space's bindings.
* Combining Spaces::            Importing, exporting, and local bindings.
* Local Spaces::                Spaces that hold internal definitions.
* Import Spaces::               Spaces that combine bindings from others.
* Evaluation Spaces::           Spaces to evaluate code in.
* Export Spaces::               Restricting visible bindings.
* Reference Spaces::            Spaces which simply forward to other spaces.
* Finding the Source of a Binding::  Who bound this symbol?
* Modules and Spaces::          How modules are implemented in terms of spaces.
* Spaces at C level::           How to get at this stuff from C.
* Primitive Spaces::            How C code can easily create a space of
                                primitives for use by Scheme.
@end menu


@node Simple Space Operations
@section Simple Space Operations

Here are functions to create, modify, delete, and inspect a space
object's bindings.  We don't discuss where spaces come from here;
maybe when you're older.  @xref{Modules and Spaces}.


First, some type predicates:

@deftypefn {Scheme Procedure} {} space? @var{obj}
@deftypefnx {C Procedure} SCM scm_space_p (SCM @var{obj})
Return a true value iff @var{obj} is a space object.
@end deftypefn

@deffn Primitive space-parameters space
Return information about @var{space}.  The return value is a list whose
first element is a symbol indicating what kind of space @var{space}
is.  Remaining elements (if any) give further information.

For details on what @code{space-parameters} returns for a given kind of
space, see the documentation on that space type.  For the standard
Guile space types, see @ref{Local Spaces}, @ref{Import Spaces},
@ref{Evaluation Spaces}, and @ref{Export Spaces}.
@end deffn


The following functions inspect a space's bindings.  We refer to these
functions taken as a group as the ``binding inspection functions''.

@deffn Primitive space-bound? space symbol
If @var{symbol} is bound in @var{space}, return @code{(@var{binding})},
where @var{binding} is the value of @var{symbol} in @var{space}.  If
@var{symbol} is unbound, return @code{#f}.
@end deffn

@deffn Primitive space-ref space symbol
Return the value of @var{symbol} in @var{space}.  If @var{symbol} is
not bound in @var{space}, throw a @code{unbound-identifier} exception.
@end deffn

@deffn Primitive space-fold space func base
For each binding in @var{space} which maps some symbol @var{sym} to a
value @var{val}, apply @var{func} to the three arguments: @var{sym},
@var{val}, and the value returned by the last call to @var{func}.  Pass
@var{base} as @var{func}'s last argument the first time.

For example, here is a function which returns a space's bindings
as an association list:
@lisp
(define space->alist
  (lambda (space)
    (space-fold space
                (lambda (sym var rest) 
                  (cons (cons sym var) rest))
                '())))
@end lisp
@end deffn


The following functions, can change the value to which a space binds a
symbol, and possibly change whether or not the symbol is bound at all.
We refer to these functions taken as a group as the ``binding mutation
functions.''

Some kinds of spaces are immutable, and do not support these operations.
However, if a space does support them, a change to a binding takes
effect immediately in all spaces in which that binding is visible.

@deffn Primitive space-define! space symbol value
Bind @var{symbol} in @var{space} to a storage location initialized to
@var{value}.  If @var{symbol} is already bound in @var{space},
change its value to @var{value}.
@end deffn

@deffn Primitive space-set! space symbol value
Set @var{symbol}'s value in @var{space} to @var{value.}

If @var{symbol} is not bound in @var{space}, throw an
@code{unbound-identifier} exception.

If @var{symbol} has a read-only binding in @var{space}, throw a
@code{variable-read-only} exception.
@end deffn

@deffn Primitive space-undefine! space symbol
Make @var{symbol} unbound in @var{space}.  If @var{symbol} was already
unbound, this has no effect.
@end deffn


@node Combining Spaces
@section Combining Spaces

A module of Scheme code actually consists of several distinct name
spaces:
@itemize @bullet
@item
its @dfn{imported} bindings, which it receives from other spaces;
@item
its @dfn{local} bindings, made by the space's code itself; 
@item
its @dfn{evaluation} bindings, the combination of the above two; and
@item
its @dfn{exported} bindings, which are visible to the outside world.
@end itemize

The module's own code may refer to its own local bindings, as well
bindings imported from other modules.  A module usually exports some of
its local bindings, perhaps all of them.  A module may occasionally
re-export some of its imported bindings.  For a given module, the four
namespaces will all be distinct.

For example, consider the following code:
@lisp
(define-module (hash))
;; Return the hash value of KEY.
(define (hash-value key) ...)
;; Construct a new, empty hash table, and return it.
(define-public (make-hash) ...)
;; Associate KEY with VALUE in HASH.
(define-public (hash-set! HASH KEY VALUE) ...)
;; Return the value associated with KEY in HASH.
(define-public (hash-ref HASH KEY) ...)

(define-module (user-table) :use-module (hash))
;; Construct a new user table, and return it.
;; This binding is not visible outside the user-table module.
(define users (make-hash))
;; Add a new user.
(define-public (add-user username realname password)
  (hash-set! users username (list realname password)))
;; Return USERNAME's real name.
(define-public (user-realname username)
  (car (hash-ref users username)))
;; Return USERNAME's password.
(define-public (user-password username)
  (cadr (hash-ref users username)))
@end lisp

[[I don't mean to endorse the syntax of the current module system here.
I'm just using it to illustrate a point about the underlying
namespaces.]]

In the @code{user-table} space:
@itemize @bullet
@item
The imported bindings are @code{make-hash}, @code{hash-set!}, and
@code{hash-ref}, and the standard R4RS Scheme bindings.
@item
The local bindings are @code{users}, @code{add-user},
@code{user-realname}, and @code{user-password}.  The variable
@code{make-hash}, for example, does not have a local binding; its value
is imported.
@item
The evaluation bindings are all the imported bindings, as well as
all the local bindings.  Thus, the evaluation bindings include
@code{make-hash} (imported) and @code{users} (local), among others.
@item
The exported bindings are @code{add-user}, @code{user-realname},
and @code{user-password} --- only those definitions made with
@code{define-public}.  The binding of @code{users} is not exported.
@end itemize

All of these name spaces are interesting to inspect under different
circumstances.  However, it would be cumbersome to have a different set
of inspection functions for each namespace, perhaps named
@code{space-local-set!}, @code{space-exported-ref}, and so on.

So, Guile represents each of these namespaces with a separate space
object.  Thus, a Scheme module actually consists of four space objects:
a local space to carry the local bindings, an import space to combine
bindings from other modules, an evaluation space which combines the
import space and the local space and performs mutations by passing them
on to the local space, and an export space which selects a subset of the
evaluation space's bindings to show to the world.  Given a module
object, you can use its accessor functions to find the inner space
objects it encapsulates, and then use the ordinary space inspection and
mutation functions on them.

In a sense, this means that Guile uses the word "space" for two
different things: one, a collection of Scheme code which imports
bindings from other spaces, and exports some bindings to other spaces,
and two, an object representing a namespace.  We do not use two separate
terms, however, because the meaningful operations on the first kind of
space are the same as those on the second kind, and also because we
expect the distinction to be clear from context.  [[Not sure about this;
we could use 'spaces' for the objects, and 'spaces' for the Scheme-level
things; what do folks think?]]

Different space types support different operations.  For example, you
cannot apply @code{space-define!} to import or export spaces; that is
not really meaningful.  However, you can apply @code{space-define!} to
evaluation and local spaces.  Space types may support special
operations: for example, only export spaces support the
@code{space-export!} function.

Most of the time, when one deals with spaces, one deals with export
spaces, because they are all one needs to use the interface the space
wishes to provide to the world.  The other space types come into play
only when one is writing interpreters, command processors, and other
such reflective tools.

It is possible to add new space types in C code.  Shared libraries
which need to publish a set of Scheme primitives in an organized way may
construct and name a space object, but have no need for imports and
evaluation environments; they simply want to create a set of immutable
bindings.  See @ref{Spaces at C level} for details on how to do this.

To support efficient compilation, some spaces may be declared
immutable.  This means that, once the space has been loaded, the set of
imported spaces, the local bindings, and the set of exported bindings
may not be changed with the mutation functions described in this
chapter.  If an immutable space imports another immutable space, the
compiler can assume that the imported bindings will never change, and
perform more aggressive optimizations (inline primitives, etc).  At the
moment, Guile does not provide mechanisms for doing this, but we hope to
in the future.

[[The variety of space types seems gratuitous, until you realize that
each type corresponds to a namespace you do actually want to be able to
query.  And at that point, you have to choose between either having a
ton of accessors whose names specify what namespace you mean, or having
a bunch of different types, as I've done here.  All the code needs to be
there anyway; this arrangement just has a bit more structure.  I think.
Maybe I'll commit sepuku after trying to implement it.]]

[[There should be some well-defined way for a space's code to get its
space object.  The (current-space) function would have to be really
weird (inspecting its call site) to do this; but if it were a syntactic
form, that would be well-defined; it would evaluate to the space in
which it appears.]]

@node Local Spaces
@section Local Spaces

A @dfn{local space} is one that holds the top-level bindings created by
the space's code itself: those created with @code{define},
@code{define-public}, @code{define-syntax}, and so on.  A local space
is the simplest space type: it is essentially a hash table, mapping
symbols onto their values.  In Scheme code, all bindings spring from a
local space somewhere, and perhaps get exported and imported into other
spaces, where they are used.

Local spaces support the binding mutation functions; see @ref{Simple
Space Operations}.

@deffn Primitive make-local-space
Create a new local space, containing no bindings.  You can populate
this space using the @code{space-define!} function.
@end deffn

When applied to a local space, @code{space-parameters} function
returns a single-element list: @code{(local)}.


@node Import Spaces
@section Import Spaces

An @dfn{import space} is one that combines the exported bindings of a
list of spaces into a single namespace, for use by local code.  Import
spaces can apply a prefix bindings imported from a given space.
Import spaces check that the same symbol is not bound by more than one
imported space, and warn the user if the situation arises.

Import spaces do not support the binding mutation functions; see
@ref{Simple Space Operations}.  Attempting to mutate an import space
throws a @code{wrong-type-arg} exception.

@deffn Primitive make-import-space import-list
Create a new import space which combines the spaces specified in
@var{import-list}.  Each element of @var{import-list} has one of the
following forms:
@table @code
@item (@var{space})
Include bindings from @var{space} in the resulting namespace.
@item (@var{space} @var{prefix})
For each symbol @var{sym} bound in @var{space}, include its binding
under the symbol whose name is the concatenation of @var{sym}'s name and
@var{prefix}, a string.
@end table

Let @var{imp} be @code{(make-import-space @var{import-list})}.

A variable @var{sym} is bound in @var{imp} if either:
@itemize @bullet
@item
there exists some space @var{mod} such that @code{(@var{mod})} is
an element of @var{import-list}, and @var{sym} is bound in @var{mod}, or
@item
there exists some space @var{mod} such that @code{(@var{mod} prefix
@var{prefix})} is an element of @var{import-list}, @var{sym}'s name
starts with @var{prefix}, and the remaining suffix is bound in
@var{mod}.
@end itemize
If either of the above holds, @var{imp} shares the binding with the
appropriate @var{mod}.

If more than one element of @var{import-list} could provide a binding
for any identifier in @var{imp} under the above rules,
@code{make-import-space} throws an @code{import-conflict} exception.

[[Should there be a way to import only certain bindings from a space,
or is something like in-space (i.e. (define foo (in-space (blah)
foo))) sufficient?]]

If the set of bindings provided by any of the imported spaces changes,
the bindings visible in @var{imp} is immediately updated.

This function checks that the spaces mentioned in @var{import-list} are
indeed space objects, it does not check what type of spaces they are
(although they will typically be export spaces).  This allows one to
use the eval space's behavior with newly defined space types.
@end deffn

@deffn Primitive space-import-list space
Return a copy of the @var{import-list} for @var{space}.   If
@var{space} is not an import space, throw a @code{wrong-type-arg}
exception.
@end deffn

@deffn Primitive space-set-import-list! space import-list
Change @var{space}'s import list to @var{import-list}.  The new set of
bindings is immediately visible to all functions using @var{space}.  If
the import list is not consistent (according to the rules described for
@code{make-import-space}), then this function throws an
@code{import-conflict} exception.
@end deffn

When applied to an import space, the @code{space-parameters} function
returns a list of the form @code{(import @var{import-list})}.

[[perhaps we shouldn't require collision detection at this level;
instead, we should let the layer above take care of it...]]


@node Evaluation Spaces
@section Evaluation Spaces

An evaluation space combines the namespaces of a local space and an
import space to produce a namespace appropriate for evaluation spaces.
Given a symbol reference, an evaluation space searches the local space
first, then the import space.  An evaluation space passes binding
mutation operations to the local space.  Thus, an evaluation space is
appropriate as a context for evaluating Scheme code, @i{i.e.} as a
second argument to @var{eval}.

Evaluation spaces support the binding mutation functions; however,
for @code{space-set!}, if the local space does not have a binding for
the symbol but the import space does, then @code{space-set!} will
throw a @code{read-only-binding} exception.

@deffn Primitive make-eval-space local import
Create a new evaluation space, using @var{local} as its local space,
and @var{import} as its import space.

All binding inspection functions search for bindings in the local space
first, then the import space.  Local bindings may shadow imported
bindings without ill effect.

Binding mutation functions apply to the local space, with the exception
described above.

If the set of bindings made by @var{local} or @var{import} changes, the
changes are immediately visible in the returned space.

This function checks that @var{local} and @var{import} are space
objects, but does not check what type of spaces they are.  This allows
one to use the eval space's behavior with newly defined space types.
@end deffn

@deffn Primitive space-eval-local space
@deffnx Primitive space-eval-import space
Return the local or import space of @var{space}.  If @var{space} is
not an eval space, then throw a @code{wrong-type-arg} exception.
@end deffn

When applied to an eval space, the @code{space-parameters} function
returns a list of the form @code{(eval @var{local} @var{import})}.


@node Export Spaces
@section Export Spaces

An export space selects a subset of the bindings from an evaluation
space for publication to the outside world.

Export spaces do not support the binding mutation functions.
Attempting to mutate an export space throws a @code{wrong-type-arg}
exception.

@deffn Primitive make-export-space space export-list
Create a new export space, selecting bindings from @var{space} as
described by @var{export-list}.  The @var{export-list} must be a list of
symbols.

Let @var{exp} be the space returned by this function.

If @var{sym} is a symbol bound by @var{space}, and @var{sym} is an
element of @var{export-list}, then @var{sym} is bound in @var{exp}.  The
@var{export-list} may contain symbols not bound in @var{space}; this
will not cause an exception when @code{make-export-space} is called,
but references to such symbols in @var{exp} throw an
@code{unbound-identifier} exception.

If the set of bindings of @var{space} changes, and @var{export-list}
exposes any of the symbols whose bindings or binding states have
changed, then the new set of bindings is immediately visible in
@var{exp}.
@end deffn

@deffn Primitive space-set-exports! space export-list
Change the export list of @var{space} to @var{export-list}.  
@var{space} must be an export space.
@end deffn

@deffn Primitive space-exports space
Return the export list of @var{space}, which must be an export space.
@end deffn

@deffn Primitive space-add-export! space symbol
Add @var{symbol} to the export list of @var{space}.

This can certainly be implemented in terms of @code{space-set-exports!},
but we provide it as an optimization for @code{define-public}.
@end deffn

When applied to an export space, the @code{space-parameters} function
returns a list of the form @code{(export @var{export-list})}.


@node Reference Spaces
@section Reference Spaces

[[I've got some half-baked thoughts about allowing a space named (FOO)
to handle resolution of all names (FOO BAR ...).  Then we could have a
root space named (), which implements all the path searching semantics,
etc, and could be replaced for non-filesystem-based environments.  Also,
we could integrate other space systems into Guile's; for example (slib
mumble) could do an slib-style (require 'mumble) and pack that up as a
space.  If we try to find a space (FOO BAR ...), and there's a
subdirectory in the path called FOO, we could load a file called
FOO/guilemod.scm (think index.html), and the space it creates would be
used to resolve BAR, etc.  The root space could provide loading
FOO/BAR.scm as default behavior.]]

[[The REPL has a current space associated with it; that's what we pass
as the 2nd argument to @code{eval}.  (Single-argument eval will have to
go away, I think.  It's not in R5RS anyway.)  We'll start out evaluation
in a space called (repl); if we teach Guile how to accept RPC
connections (like wish does), they'll be called (net 1), (net 2), etc.
We could have a "cd"-like operation which sets the REPL's current space,
so we can go inside some space we've loaded and evaluate code in its
evaluation space.]]


@node Finding the Source of a Binding
@section Finding the Source of a Binding

For reflective applications, it can be helpful to find out how a given
symbol's binding found its way into a particular space.  The following
function allows one to discover that.

@deffn Primitive space-binding-source space symbol
Return a list of spaces through which the binding of @var{symbol} was
inherited.

All spaces, except for local spaces, simply combine namespaces
produced by other spaces according to some rule.  If @var{space}
provides @var{symbol}'s binding itself, this function returns
@code{(@var{space})}.  If @var{space} receives its binding for
@var{symbol} through some other space @var{mod2}, then this function
returns @code{(@var{space} . @var{rest})}, where @var{rest} is
@code{(space-binding-source @var{mod2} @var{symbol})}.

If @var{symbol} is unbound in @var{space}, this function throws an
@code{unbound-identifier} exception.
@end deffn


@node Modules and Spaces
@section Modules and Spaces

[[How do we implement modules, given spaces?]]


@node Spaces at C level
@section Spaces at C level

[[overview here]]

@menu
* The Space Smob::              The scheme type representing a space.
* Memoizing Bindings::          How to make binding lookup efficient.
@end menu

@node The Space Smob
@subsection The Space Smob

[[Basically, spaces are a smob, containing a pointer to private data, a
"observer list" (see caching protocol below), and a pointer to a
structure of function pointers (struct scm_mod_funs) to implement the
binding inspection and mutation functions, the @code{space-parameters}
function, and functions to implement the space name delegation stuff
described above.

This makes each space type's code really simple: it'll mostly just pass
things through to its subspaces as appropriate.  We can add new types
really easily; nobody has to know anything about anyone else.  We could
trivially allow people to do the stuff in Scheme, and get the nice
soft-space functionality we've got now.  The interpreter, however, only
goes through the dispatch functions, so it doesn't care who's
implementing things.]]

@example
struct scm_mod_funs
@{
  /* self-description */
  SCM (*parameters) (SCM space);
  SCM (*print) (SCM exp, SCM port, scm_print_state *pstate);

  /* binding inspection functions */
  SCM (*bound_p) (SCM space, SCM symbol);
  SCM (*ref) (SCM space, SCM symbol);
  SCM (*fold) (SCM space, SCM func, SCM base);

  /* binding mutation functions */
  SCM (*define_x) (SCM space, SCM symbol, SCM value);
  SCM (*set_x) (SCM space, SCM symbol, SCM value);
  SCM (*undefine_x) (SCM space, SCM symbol);

  /* tracing bindings */
  SCM (*source) (SCM space, SCM symbol);

  /* memoization protocol */
  SCM (*memoize) (SCM space, SCM symbol, SCM cell, int assignment_p);
  void (*update) (SCM space);
@}
@end example


@node Memoizing Bindings
@subsection Memoizing Bindings

[[What about unmemoizing?]]

This chapter describes how Guile's space system interacts with the
Guile interpreter, to keep references to top-level variables fast.  Only
programmers implementing new types of spaces should need to read this
section.

A space finds a symbol's binding either by looking up the symbol in a
hash table (as for a local space), or by delegating the search to other
spaces.  This means that the full symbol lookup process is too
expensive to perform on each variable reference.  Thus, Guile uses
memoization and caching to avoid repeating the search for a given
variable's binding, unless the space's configuration has changed.

All space bindings live in ordinary pairs; a pair used for this purpose
is called a @dfn{value cell}.  A value cell's @sc{car} holds the
variable's value, and its @sc{cdr} holds a pointer to the symbol naming
to the variable (for unmemoizing purposes).  @footnote{Older versions of
Guile had ``first-class'' variables, but those are not necessary or
useful in this space system.}

[[Actually, this is broken: we can't store the unmemoizing information
in the value cell.  The variable might be referenced under different
names; we can't put them all in the value cell's cdr.  Maybe the car
of a GLOC points to a pair whose car is the original symbol, and whose
cdr points to the value cell.  That's an extra indirection on each
memory reference, but I think we have to be able to reconstruct the
source exactly.  Maybe the spaces' GLOC lists can store the original
symbol?]]

As the interpreter evaluates code, it turns each variable reference
(represented as a pair whose @sc{car}s is a symbol, and whose @sc{cdr}
points to the remainder of an expression) into a GLOC, a structure which
points directly to the variable's value cell.  Thus, the first time the
interpreter resolves a variable reference, it makes the value cell
immediately accessible, and need not consult the space system in the
future to evaluate that reference.

However, spaces may acquire new bindings, import lists may change, and
export lists may change.  In these situations, the binding a given
variable reference denotes may change, and the GLOC may become out of
date.  The interpreter needs to re-consult the space system to find the
variable's binding.  We need a protocol by which spaces can tell the
interpreter to check variables' bindings again.

@menu
* The Memoization Protocol::    How we handle this hair, generally.
* Memoization Protocol Functions::  The specifics.
@end menu


@node The Memoization Protocol
@subsubsection The Memoization Protocol

[[Actually, why do spaces need to do anything other than let us know
when their visible bindings have changed?  Can we leave most of this
protocol be implemented by common code, which the appropriate space
operations invoke?  That would make space implementations more
independent of the design of the interpreter; we could switch to a
bytecode interpreter and keep any space types people have implemented.]]

Here's how the memoization protocol works: when the interpreter
encounters a reference to a top-level variable, it tells the space to
memoize the reference on its behalf.  The spaces are responsible for
creating GLOCs.  Once a reference has been replaced with a GLOC, the
interpreter can find the variable's value without consulting the space
system.

Each space must keep a list of GLOCs it has created for the
interpreter.  Each space must also register itself as a ``observer'' of
each space whose bindings it cares about.  Then, when a space does
something which changes its visible bindings, it invokes its own
@code{update} function.  Each space's @code{update} function reverts
all the GLOCs in its GLOC list to ordinary variable references, and then
recurses on its observers.  The interpreter will re-memoize the variable
references when it next visits them.

Note that even evaluating `define' will require an update to occur.
However, when Guile loads a new space, the space will not have any
observers yet, so the updates will do nothing.  Furthermore, if a space
knows that it has performed no variable lookups since the last update
(the most common case), it need not do anything for an update, and it
need not disturb its observers.  I believe that most updates will only
have notable effect when there is actually a danger of GLOCs being out
of date.

The above protocol does end up invalidating more variable references
than necessary: we we add a binding for `foo', we don't need to
unmemoize references to `bar'.  We could pass the affected variable's
value cell along as part of the update protocol, and have each space
unmemoize only those GLOCs referring to that value cell, but that
still incurs the overhead of scanning the GLOC list, and doesn't help us
when the update is caused by a modified import or export list, and so
on.

Spaces may actually prefer to store the GLOC list in a vector, to keep
bookkeeping overhead down.

Note that if spaces of a given type have bindings which never change,
that type can implement the protocol trivially.  It will need to create
GLOCs for its bindings, but since they will never change, it need not
store them in a GLOC table.  Its spaces may have observers, but it will
never need to update them.  If its bindings are unchanging, it is probably
not observing any other spaces, so it need not be prepared to handle
update requests.

Note also that, with the memoization protocol in place, export spaces
can safely cache the bindings they provide to the outside world, keeping
lookup overhead to a minimum.


@node Memoization Protocol Functions
@subsubsection Memoization Protocol Functions

Here are functions for manipulating a space object's observer lists:

@deftypefun void scm_mod_add_observer (SCM @var{space}, SCM @var{observer})
Add @var{observer} to @var{space}'s observer list.
@end deftypefun

@deftypefun void scm_mod_del_observer (SCM @var{space}, SCM @var{observer})
Remove @var{observer} from @var{space}'s observer list.
@end deftypefun

@deftypefun SCM SCM_MOD_OBSERVERS (SCM @var{space})
Return a vector of spaces observing @var{space}.  The vector holds its
values weakly, and any element may be @code{#f}.
@end deftypefun

@deftypefun int SCM_MOD_NUM_OBSERVERS (SCM @var{space})
Return an upper bound on the number of spaces observing @var{space}.
All elements of @code{SCM_MOD_OBSERVERS (@var{space})} after the first
@code{SCM_MOD_NUM_OBSERVERS (@var{space})} elements are guaranteed to
be @code{#f}.
@end deftypefun

Here are the descriptions of the functions spaces must implement to
support the update protocol.  The function names here actually refer to
members of @var{space}'s @code{scm_mod_funs} structure.

@deftypefun SCM memoize (SCM @var{space}, SCM @var{symbol}, SCM @var{cell}, int assignment_p);
Memoize a top-level variable reference in the interpreter's code tree.
Return the value cell for @var{symbol}'s binding in @var{space}.
@var{cell} should be the variable reference to memoize, or #f; see
below.  @var{assignment_p} should be non-zero iff the reference is the
target of an assignment form (like a @code{set!}); again, see below.

If @var{cell} is not @code{#f}, it must be the cell to memoize; that is,
it should be the cell in the interpreter's code tree which holds the
reference to the top-level variable corresponding to @var{symbol}.  We
must memoize the reference: turn @var{cell} into a GLOC whose @sc{car}
points to the value cell for @var{symbol} in @var{space}.  Record
@var{cell} in a list local to @var{space}, so we can unmemoize it later
if necessary.

If @var{space} inherits @var{symbol}'s binding from some other space,
it should pass @code{#f} for @var{cell} when recurring on its subspaces
to find @var{symbol}'s value cell.  If @var{cell} is @code{#f}, don't
worry about memoizing anything, or putting anything in a GLOC list.
(This rule assures that each GLOC appears in the GLOC list of only one
space: the one corresponding to its top-level evaluation environment.)

Assignments to variables via this protocol should behave exactly like
assignments made via @code{space-set!}; thus, they should throw
exceptions under the same circumstances.  Given @var{assignment_p},
@var{space} should have enough information to know whether a given
memoization is legal.  Simply put, one should never memoize a binding in
an assignment position unless the binding is actually assignable.

[[The memoize function should return SCM_UNDEFINED if it can't find a
binding, to make tentative probes (i.e. for import and eval spaces)
unnecessary.]]

A binding for @var{sym} in @var{mod} is assignable iff:
@itemize @bullet
@item
@var{mod} is a local space which binds @var{sym}, or
@item
@var{mod} is an eval space, and @var{sym} is assignable in @var{mod}'s
local space.
@end itemize

If @var{assignment_p} is non-zero, but @var{sym}'s binding in
@var{space} is not assignable, then the @code{memoize} function must
throw a @code{read-only-binding} exception.

There's no need for any global code to know the rules for legal
assignments.  The @code{memoize} function of local spaces should simply
permit assignments.  The @code{memoize} function of eval spaces checks
for bindings in its local and import spaces, blocks assignments to
imported bindings, and delegates the decision to the local space for
local bindings.  All other spaces' @code{memoize} functions reject all
assignments.
@end deftypefun


@deftypefun void update (SCM @var{space})
Some binding in @var{space}, or a space @var{space} observes, has
changed.  Unless you can prove that no binding in @var{space} has
changed, unmemoize all the GLOC's in @var{space}'s GLOC list, by
reverting them to symbol references, and invoke the @code{update}
functions of all spaces in @var{space}'s observer list.

Each space must record whether it has processed any @code{memoize}
requests since the last update (including requests whose @var{cell}
parameter was @code{#f}).  If the space's @code{update} function is
invoked without any intervening @code{memoize} calls, it should do
nothing.  It is probably a single update visiting the same space twice,
but in any case, the space need take no action.
@end deftypefun


@node Primitive Spaces
@section Primitive Spaces

[[Pretty obvious stuff.  This should be the standard way for C code to
define Scheme-visible primitives, and so shouldn't be hidden here in the
back of the section on spaces.]]

It is common for C code linked with Guile to want to publish a set of
bindings as a space.  To do this, the code's initialization function
should do the following:
@enumerate 1
@item
Call @code{scm_make_primitive_space} to create a new, empty space.
@item
Call @code{scm_name_primitive} to populate the space with the
appropriate bindings [[we should make a new snarfing macro to help with
this]].
@item
Call @code{scm_name_primitive_space} to make the space visible to the
Scheme world.
@end enumerate

@deftypefun SCM scm_make_primitive_space (void)
Create and return a new, empty primitive space.

Primitive spaces do not support the binding mutation
functions.@footnote{This restriction allows it to implement the space
update protocol very simply; see @ref{The Memoization Protocol}.
Otherwise, I think the implementation can share a lot of code with the
local space type.}

When applied to a space returned by this function,
@code{space-parameters} returns the singleton list @code{(primitive)}.
@end deftypefun

@deftypefun void scm_name_primitive (SCM @var{space}, char *@var{name}, int @var{req}, int @var{opt}, int @var{rest}, SCM (*@var{function})())
Define a new primitive in @var{space} named @var{name}, based on the C
function @var{function}.

The primitive accepts @var{req} required arguments, @var{opt} optional
arguments, and a @var{rest} argument iff @var{rest} is non-zero.  The C
function @var{function} should accept @code{@var{req} + @var{opt}}
arguments, or @code{@var{req} + @var{opt} + 1} arguments if @code{rest}
is non-zero.

When a primitive is invoked, it must be applied to at least @var{req}
arguments, or else Guile signals an error.  Guile passes the primitive's
first @var{req} arguments to @var{function} as its first @var{req}
arguments.  If there are fewer than @var{opt} arguments remaining, then
@var{function} receives the value @code{SCM_UNDEFINED} for any missing
optional arguments.  If @var{rest} is non-zero, then any arguments after
the first @code{@var{req} + @var{opt}} are packaged up as a list, and
passed as @var{function}'s last argument.

@var{space} should be a space object returned by
@code{scm_make_primitive_space}.
@end deftypefun

@deftypefun void scm_name_primitive_space (char *@var{name}, SCM @var{space})
Make @var{space} visible to the Scheme world under the name
@code{(@var{name})}.
@end deftypefun

@bye
Follow-Ups:
- Re: `undefine'
  - From: Mikael Djurfeldt <mdj@nada.kth.se>
- Re: `undefine'
  - From: Jost Boekemeier <jostobfe@calvados.zrz.TU-Berlin.DE>
References:
- `undefine'
  - From: Jost Boekemeier <jostobfe@calvados.zrz.TU-Berlin.DE>
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]