This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

set comprehensions in Guile?


Hi,
Jim Blandy's note about the desirability of connecting query languages 
with Guile struck a chord with me.  I'm doing research on 
query-processing, and was wondering about how to go about using set
theory as a query language, within the guile environment.

There are three sections to this message: 
BACKGROUND, QUESTIONS, and HALF-BAKED IDEAS.
 
\begin{BACKGROUND}

The language of set theory provides a nice formalism for expressing
queries, as is well-known, e.g., the fragment of set theory
called ``relational calculus''.  For instance,
{(x,z) | (exists y) parent(x,y) and parent(y,z)} expresses
the grandparent relation as a join followed by a project.

The general form of these set comprehensions is:
{expr(x1,...,xk) | x1 in S1, x2 in S2(x1), ..., xk in Sk(x1,...,xk-1)
                 | predicate(x1,...,xk)},
where 
   (1) Si(x1,...,xi-1) is a set-valued function of the variables x1,...,xi-1
   (2) the list of membership conditions gives the iterators which
         specify the range of values for x1,...,xk
   (3) predicate(x1,...,xk) is a filtering condition
   (4) expr(x1,...,xk) defines the ``final computation'' to be applied
         to the tuples (x1,...,xk) that pass the predicate test

We can easily define the basic map-like functions of scheme with these
set comprehensions.

In fact, the select statement of SQL can be thought of as a restricted
form of set comprehension, since

select xi1,...,xij from table
where predicate(x1,...,xk)

just means {(xi1,....,xij) | (x1,....,xk) in table | predicate(x1,...,xk)}.

I was reading a bit about semi-structured databases,
in the article ``Path Constraints in Semistructured Databases''
by Buneman, Fan and Weinstein (http://db.cis.upenn.edu).
They give a simple example of an object-oriented database:

class student {
    Name:   string;
    Taking: set(course);
}
class course {
    CName:  string;
    Enrolled: set(student);
}
Students: set(student);
Courses:  set(course);

and then the query:  find the names of all the courses enrolled
by students who are taking the course "Chem3".  Expressed in OQL
syntax, the query is:

select distinct c.CName
from Courses c,
     c.Enrolled s,
     s.Taking c'
where c'.CName = "Chem3"

This extended from of SQL is essentially traversing the graph
represented by the semi-structured data constituted by the objects:
c ranges over Courses, and then for each value of c,
s ranges over the set c.Enrolled, etc.

This extended form of select statement corresponds to a larger sublanguage
of the set-comprehension queries, e.g., the query just mentioned is:

{c.CName | c in Courses, s in c.Enrolled, c' in s.Taking | c.CName = "Chem3"} 

Conclusion:  set comprehensions are a very general, mathematically
transparent way of expressing queries.  Not only queries, but
functional transformations of all sorts...

\end{BACKGROUND}


\begin{QUESTIONS}
 
The language SETL was designed for just this purpose. 
But SETL seems only to have caught on within a small 
academic niche, and, furthermore, I don't know of any
freely available implementations.  Does anyone know of any? 

Guile, on the other hand, has a budding momementum behind it,
and has been designed with many general-purpose considerations in 
mind, e.g., it is not committed to a particular syntactic form of 
programs.
 
Main question: 

    What would be the best way to get these comprehensions working within 
    a Guile interpreter?  
    I.e., 
    How can we best implement a SETL module within Guile? 
    (Criteria:  clean, easy to read, general, efficient, ...) 

\end{QUESTIONS}


\begin{HALF-BAKED IDEAS}

(1) If we had the sources for a SETL-like language, then it could
    be meshed with GUILE, through a low-level library.

(2) The C++ standard template library (STL) gives lots of templates for
    creating efficient sets, maps, hash objects and what-have-you.
    Probably someone has already developed a guile library that puts
    wrappers around these functions, so that they can be called
    as primitives?

(3) Build a parser that converts the set-comprehension syntax of
    SETL to the corresponding scheme expressions, i.e.,
    comprehension = a triple consisting of the head expression,
    a list of set-valued expressions, and a boolean expression.
    
(4) have a scheme function eval-comprehen(head,iterators,test)
    which returns some kind of set object

(5) Implement the sets and their iterators in C++, using the STL wherever
    convenient 

(6) The set language allows you to write things like
    {x + y | x in {z|z in {1,...,w}|is_prime(z)}, w in {3y,...3y+10},
             y in {400,600} | x not-in {t|t in {10,...,20}|is_prime(t)}}
    Clearly a contrived example, but it shows the general nesting
    possibilities.  Thus, it could make sense to represent the
    results as set-objects with iterators.  The iterators could then
    be scheme closures.  Then one could use higher-order functional
    programming to compose new iterators from other iterators.
    This could handle the nesting of set-comprehensions.

\end{HALF-BAKED IDEAS}

Any thoughts?

Many Thanks,
Dave Tanzer
NYU Courant Institute

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]