This is the mail archive of the
guile@sourceware.cygnus.com
mailing list for the Guile project.
set comprehensions in Guile?
- To: guile at sourceware dot cygnus dot com
- Subject: set comprehensions in Guile?
- From: David Tanzer <tanzer at cs dot nyu dot edu>
- Date: Thu, 9 Mar 2000 20:39:48 -0500 (EST)
Hi,
Jim Blandy's note about the desirability of connecting query languages
with Guile struck a chord with me. I'm doing research on
query-processing, and was wondering about how to go about using set
theory as a query language, within the guile environment.
There are three sections to this message:
BACKGROUND, QUESTIONS, and HALF-BAKED IDEAS.
\begin{BACKGROUND}
The language of set theory provides a nice formalism for expressing
queries, as is well-known, e.g., the fragment of set theory
called ``relational calculus''. For instance,
{(x,z) | (exists y) parent(x,y) and parent(y,z)} expresses
the grandparent relation as a join followed by a project.
The general form of these set comprehensions is:
{expr(x1,...,xk) | x1 in S1, x2 in S2(x1), ..., xk in Sk(x1,...,xk-1)
| predicate(x1,...,xk)},
where
(1) Si(x1,...,xi-1) is a set-valued function of the variables x1,...,xi-1
(2) the list of membership conditions gives the iterators which
specify the range of values for x1,...,xk
(3) predicate(x1,...,xk) is a filtering condition
(4) expr(x1,...,xk) defines the ``final computation'' to be applied
to the tuples (x1,...,xk) that pass the predicate test
We can easily define the basic map-like functions of scheme with these
set comprehensions.
In fact, the select statement of SQL can be thought of as a restricted
form of set comprehension, since
select xi1,...,xij from table
where predicate(x1,...,xk)
just means {(xi1,....,xij) | (x1,....,xk) in table | predicate(x1,...,xk)}.
I was reading a bit about semi-structured databases,
in the article ``Path Constraints in Semistructured Databases''
by Buneman, Fan and Weinstein (http://db.cis.upenn.edu).
They give a simple example of an object-oriented database:
class student {
Name: string;
Taking: set(course);
}
class course {
CName: string;
Enrolled: set(student);
}
Students: set(student);
Courses: set(course);
and then the query: find the names of all the courses enrolled
by students who are taking the course "Chem3". Expressed in OQL
syntax, the query is:
select distinct c.CName
from Courses c,
c.Enrolled s,
s.Taking c'
where c'.CName = "Chem3"
This extended from of SQL is essentially traversing the graph
represented by the semi-structured data constituted by the objects:
c ranges over Courses, and then for each value of c,
s ranges over the set c.Enrolled, etc.
This extended form of select statement corresponds to a larger sublanguage
of the set-comprehension queries, e.g., the query just mentioned is:
{c.CName | c in Courses, s in c.Enrolled, c' in s.Taking | c.CName = "Chem3"}
Conclusion: set comprehensions are a very general, mathematically
transparent way of expressing queries. Not only queries, but
functional transformations of all sorts...
\end{BACKGROUND}
\begin{QUESTIONS}
The language SETL was designed for just this purpose.
But SETL seems only to have caught on within a small
academic niche, and, furthermore, I don't know of any
freely available implementations. Does anyone know of any?
Guile, on the other hand, has a budding momementum behind it,
and has been designed with many general-purpose considerations in
mind, e.g., it is not committed to a particular syntactic form of
programs.
Main question:
What would be the best way to get these comprehensions working within
a Guile interpreter?
I.e.,
How can we best implement a SETL module within Guile?
(Criteria: clean, easy to read, general, efficient, ...)
\end{QUESTIONS}
\begin{HALF-BAKED IDEAS}
(1) If we had the sources for a SETL-like language, then it could
be meshed with GUILE, through a low-level library.
(2) The C++ standard template library (STL) gives lots of templates for
creating efficient sets, maps, hash objects and what-have-you.
Probably someone has already developed a guile library that puts
wrappers around these functions, so that they can be called
as primitives?
(3) Build a parser that converts the set-comprehension syntax of
SETL to the corresponding scheme expressions, i.e.,
comprehension = a triple consisting of the head expression,
a list of set-valued expressions, and a boolean expression.
(4) have a scheme function eval-comprehen(head,iterators,test)
which returns some kind of set object
(5) Implement the sets and their iterators in C++, using the STL wherever
convenient
(6) The set language allows you to write things like
{x + y | x in {z|z in {1,...,w}|is_prime(z)}, w in {3y,...3y+10},
y in {400,600} | x not-in {t|t in {10,...,20}|is_prime(t)}}
Clearly a contrived example, but it shows the general nesting
possibilities. Thus, it could make sense to represent the
results as set-objects with iterators. The iterators could then
be scheme closures. Then one could use higher-order functional
programming to compose new iterators from other iterators.
This could handle the nesting of set-comprehensions.
\end{HALF-BAKED IDEAS}
Any thoughts?
Many Thanks,
Dave Tanzer
NYU Courant Institute