1 Datalog Module Language

 #lang datalog package: datalog

In Datalog input, whitespace characters are ignored except when they separate adjacent tokens or when they occur in strings. Comments are also considered to be whitespace. The character % introduces a comment, which extends to the next line break. Comments do not occur inside strings.

A variable is a sequence of Unicode "Uppercase" and "Lowercase" letters, digits, and the underscore character. A variable must begin with a Unicode "Uppercase" letter.

An identifier is a sequence of printing characters that does not contain any of the following characters: (, `, ', ), =, :, ., ~, ?, ", %, and space. An identifier must not begin with a Latin capital letter. Note that the characters that start punctuation are forbidden in identifiers, but the hyphen character is allowed.

A string is a sequence of characters enclosed in double quotes. Characters other than double quote, newline, and backslash may be directly included in a string. The remaining characters may be specified using escape characters, \", \ , and \\ respectively.

A literal is a predicate symbol followed by an optional parenthesized list of comma separated terms, or it is an external query as described below. A predicate symbol is either an identifier or a string. A term is either a variable or a constant. A constant is an identifier, string, integer, or boolean, where booleans are written the same as the identifiers true and false, and integers are written the same as identifiers 0 or those with a nonempty sequence of digits, no leading zero, and optionally prefixed with -. As a special case, two terms separated by = (!=) is a literal for the equality (inequality) predicate. The following are literals:

    parent(john, douglas)

    zero-arity-literal

    "="(3,3)

    ""(-0-0-0,&&&,***,"\00")

    42

A clause is a head literal followed by an optional body. A body is a comma separated list of literals. A clause without a body is called a fact, and a rule when it has one. The punctuation :- separates the head of a rule from its body. A clause is safe if every variable in its head occurs in some literal in its body. The following are safe clauses:

    parent(john, douglas)

    ancestor(A, B) :-

        parent(A, B)

    ancestor(A, B) :-

        parent(A, C),

        ancestor(C, B)

A program is a sequence of zero or more statements. A statement is an assertion, a retraction, a query, or a requirement. An assertion is a clause followed by a ., and it adds the clause to the database if it is safe. A retraction is a clause followed by ~, and it removes the clause from the database. A query is a literal followed by a ?. A requirement is a (, then an identifier, then ), then ., and it imports functions that can be called as external queries.

A external query is a variable, then :-, then an identifier, then a parenthesized list of comma separated terms. Beware that an external query can break Datalog’s termination guarantee.

The following BNF describes the syntax of Datalog.

 

program

 ::= 

statement*

 

statement

 ::= 

assertion

 

  |  

retraction

 

  |  

query

 

  |  

requirement

 

assertion

 ::= 

clause .

 

retraction

 ::= 

clause ~

 

query

 ::= 

literal ?

 

requirement

 ::= 

( IDENTIFIER ) .

 

clause

 ::= 

literal :- body

 

  |  

literal

 

body

 ::= 

literal , body

 

  |  

literal

 

literal

 ::= 

predicate-sym ( )

 

  |  

predicate-sym ( terms )

 

  |  

predicate-sym

 

  |  

term = term

 

  |  

term != term

 

  |  

VARIABLE :- external-sym ( terms )

 

predicate-sym

 ::= 

IDENTIFIER

 

  |  

STRING

 

terms

 ::= 

term

 

  |  

term , terms

 

term

 ::= 

VARIABLE

 

  |  

constant

 

constant

 ::= 

IDENTIFIER

 

  |  

STRING

 

  |  

INTEGER

 

  |  

true  |  false

The effect of running a Datalog program is to modify the database as directed by its statements, and then to return the literals designated by the query. The modified database is provided as theory.

The following is a program:

         #lang datalog

         edge(a, b). edge(b, c). edge(c, d). edge(d, a).

         path(X, Y) :- edge(X, Y).

         path(X, Y) :- edge(X, Z), path(Z, Y).

         path(X, Y)?

Here is a program that uses + and - from racket/base as external queries:

    #lang datalog

    (racket/base).

    

    fib(0, 0).

    fib(1, 1).

    

    fib(N, F) :- N != 1,

                 N != 0,

                 N1 :- -(N, 1),

                 N2 :- -(N, 2),

                 fib(N1, F1),

                 fib(N2, F2),

                 F :- +(F1, F2).

    

    fib(30, F)?

The Datalog REPL accepts new statements that are executed as if they were in the original program text.