1 Datalog Module Language

 #lang datalog package: datalog

In Datalog input, whitespace characters are ignored except when they separate adjacent tokens or when they occur in strings. Comments are also considered to be whitespace. The character % introduces a comment, which extends to the next line break. Comments do not occur inside strings.

A variable is a sequence of Unicode "Uppercase" and "Lowercase" letters, digits, and the underscore character. A variable must begin with a Unicode "Uppercase" letter.

An identifier is a sequence of printing characters that does not contain any of the following characters: (, `, ', ), =, :, ., ~, ?, ", %, and space. An identifier must not begin with a Latin capital letter. Note that the characters that start punctuation are forbidden in identifiers, but the hyphen character is allowed.

A string is a sequence of characters enclosed in double quotes. Characters other than double quote, newline, and backslash may be directly included in a string. The remaining characters may be specified using escape characters, \", \ , and \\ respectively.

A literal, is a predicate symbol followed by an optional parenthesized list of comma separated terms. A predicate symbol is either an identifier or a string. A term is either a variable or a constant. As with predicate symbols, a constant is either an identifier or a string. As a special case, two terms separated by = (!=) is a literal for the equality (inequality) predicate. The following are literals:

    parent(john, douglas)

    zero-arity-literal

    "="(3,3)

    ""(-0-0-0,&&&,***,"\00")

A clause is a head literal followed by an optional body. A body is a comma separated list of literals. A clause without a body is called a fact, and a rule when it has one. The punctuation :- separates the head of a rule from its body. A clause is safe if every variable in its head occurs in some literal in its body. The following are safe clauses:

    parent(john, douglas)

    ancestor(A, B) :-

        parent(A, B)

    ancestor(A, B) :-

        parent(A, C),

        ancestor(C, B)

A program is a sequence of zero or more statements. A statement is an assertion, a retraction, or a query. An assertion is a clause followed by a period, and it adds the clause to the database if it is safe. A retraction is a clause followed by a tilde, and it removes the clause from the database. A query is a literal followed by a question mark.

The following BNF describes the syntax of Datalog.

 

program

 ::= 

statement*

 

statement

 ::= 

assertion

 

  |  

retraction

 

  |  

query

 

assertion

 ::= 

clause .

 

retraction

 ::= 

clause ~

 

query

 ::= 

literal ?

 

clause

 ::= 

literal :- body

 

  |  

literal

 

body

 ::= 

literal , body

 

  |  

literal

 

literal

 ::= 

predicate-sym ( )

 

  |  

predicate-sym ( terms )

 

  |  

predicate-sym

 

  |  

term = term

 

  |  

term != term

 

predicate-sym

 ::= 

IDENTIFIER

 

  |  

STRING

 

terms

 ::= 

term

 

  |  

term , terms

 

term

 ::= 

VARIABLE

 

  |  

constant

 

constant

 ::= 

IDENTIFIER

 

  |  

STRING

The effect of running a Datalog program is to modify the database as directed by its statements, and then to return the literals designated by the query. The modified database is provided as theory.

The following is a program:

         #lang datalog

         edge(a, b). edge(b, c). edge(c, d). edge(d, a).

         path(X, Y) :- edge(X, Y).

         path(X, Y) :- edge(X, Z), path(Z, Y).

         path(X, Y)?

The Datalog REPL accepts new statements that are executed as if they were in the original program text.