16.2.1 Syntax Objects

The input and output of a macro transformer (i.e., source and replacement forms) are represented as syntax objects. A syntax object contains symbols, lists, and constant values (such as numbers) that essentially correspond to the quoted form of the expression. For example, a representation of the expression (+ 1 2) contains the symbol '+ and the numbers 1 and 2, all in a list. In addition to this quoted content, a syntax object associates source-location and lexical-binding information with each part of the form. The source-location information is used, for example, when reporting syntax errors, and the lexical-binding information allows the macro system to maintain lexical scope. To accommodate this extra information, the representation of the expression (+ 1 2) is not merely '(+ 1 2), but a packaging of '(+ 1 2) into a syntax object.

To create a literal syntax object, use the syntax form:

> (syntax (+ 1 2))

#<syntax:1:0 (+ 1 2)>

In the same way that ' abbreviates quote, #' abbreviates syntax:

> #'(+ 1 2)

#<syntax:1:0 (+ 1 2)>
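
The 1:0 in the printed results above is the line and column recorded for the form. As a minimal sketch, the source-location fields can also be read directly with the accessors syntax-source, syntax-line, syntax-column, and syntax-span from racket/base. The name stx is only illustrative, and no results are shown because the values depend on where the code is read from (any accessor may return #f when the information is unavailable):

```racket
#lang racket/base

;; A literal syntax object; `stx` is a hypothetical name for illustration.
(define stx #'(+ 1 2))

;; Source-location accessors. Each may return #f if the information
;; is not available, so no expected values are given here.
(syntax-source stx)   ; where the form was read from
(syntax-line stx)     ; line number
(syntax-column stx)   ; column number
(syntax-span stx)     ; number of characters the form occupies
```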

A syntax object that contains just a symbol is an identifier syntax object. Racket provides some additional operations specific to identifier syntax objects, including the identifier? operation to detect identifiers. Most notably, free-identifier=? determines whether two identifiers refer to the same binding:

> (identifier? #'car)

#t

> (identifier? #'(+ 1 2))

#f

> (free-identifier=? #'car #'cdr)

#f

> (free-identifier=? #'car #'car)

#t

> (require (only-in racket/base [car also-car]))
> (free-identifier=? #'car #'also-car)

#t
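
The two operations are often combined, since free-identifier=? accepts only identifiers. The following sketch assumes a module written in racket/base; the helper name refers-to-car? is hypothetical and chosen only for this illustration:

```racket
#lang racket/base

;; Hypothetical helper: does `stx` name the same binding as `car`?
;; The identifier? check comes first because free-identifier=?
;; requires both of its arguments to be identifiers.
(define (refers-to-car? stx)
  (and (identifier? stx)
       (free-identifier=? stx #'car)))

(refers-to-car? #'car)      ; => #t
(refers-to-car? #'cdr)      ; => #f
(refers-to-car? #'(car x))  ; => #f, not an identifier
```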

To see the lists, symbols, numbers, etc. within a syntax object, use syntax->datum:

> (syntax->datum #'(+ 1 2))

'(+ 1 2)
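
The conversion is recursive: nested forms are stripped as well, so the result can be taken apart with ordinary list operations. A small sketch, with nested as an illustrative name:

```racket
#lang racket/base

;; A syntax object with nested sub-forms; it is only quoted, not evaluated.
(define nested #'(define (f x) (+ x 1)))

(syntax->datum nested)                  ; => '(define (f x) (+ x 1))
(cadr (syntax->datum nested))           ; => '(f x)
(symbol? (car (syntax->datum nested)))  ; => #t
```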

The syntax-e function is similar to syntax->datum, but it unwraps a single layer of source-location and lexical-context information, leaving sub-forms that have their own information wrapped as syntax objects:

> (syntax-e #'(+ 1 2))

'(#<syntax:1:0 +> #<syntax:1:0 1> #<syntax:1:0 2>)

The syntax-e function always leaves syntax-object wrappers around sub-forms that are represented via symbols, numbers, and other literal values. The only time it unwraps extra sub-forms is when unwrapping a pair, in which case the cdr of the pair may be recursively unwrapped, depending on how the syntax object was constructed.
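
Because the exact shape of a pair returned by syntax-e can vary, code that wants a list of sub-forms may prefer syntax->list, a racket/base function not otherwise used in this section, which returns a list of syntax objects (or #f if the syntax object is not a list). The sketch below uses it to show that each sub-form keeps its own wrapper; stx and parts are illustrative names:

```racket
#lang racket/base

(define stx #'(+ (* 3 4) 2))

;; One layer is unwrapped: the result is a list of syntax objects.
(define parts (syntax->list stx))

(length parts)                 ; => 3
(syntax? (cadr parts))         ; => #t, the nested form is still wrapped
(syntax-e (car parts))         ; => '+
(syntax->datum (cadr parts))   ; => '(* 3 4)
(syntax-e (caddr parts))       ; => 2
```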

The opposite of syntax->datum is, of course, datum->syntax. In addition to a datum like '(+ 1 2), datum->syntax needs an existing syntax object to donate its lexical context, and optionally another syntax object to donate its source location:

> (datum->syntax #'lex
                 '(+ 1 2)
                 #'srcloc)

#<syntax:1:0 (+ 1 2)>

In the above example, the lexical context of #'lex is used for the new syntax object, while the source location of #'srcloc is used.
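
One way to check this, sketched here with the illustrative names ctx-stx, loc-stx, and new-stx, is to compare the source location of the result with that of the donor:

```racket
#lang racket/base

(define ctx-stx #'lex)      ; donates lexical context
(define loc-stx #'srcloc)   ; donates source location

(define new-stx (datum->syntax ctx-stx '(+ 1 2) loc-stx))

(syntax->datum new-stx)                                   ; => '(+ 1 2)
;; The new object reports the same location as loc-stx.
(equal? (syntax-line new-stx) (syntax-line loc-stx))      ; => #t
(equal? (syntax-column new-stx) (syntax-column loc-stx))  ; => #t
```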

When the second (i.e., the “datum”) argument to datum->syntax includes syntax objects, those syntax objects are preserved intact in the result. That is, deconstructing the result with syntax-e eventually produces the syntax objects that were given to datum->syntax.
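
The following sketch illustrates the difference, assuming a module in racket/base. It passes #f as the first argument so that converted parts get no lexical context, and it uses syntax->list (a racket/base function not otherwise covered in this section) to take the result apart. The names inner, outer, and parts are illustrative:

```racket
#lang racket/base

;; `inner` already carries the module's lexical context for `car`.
(define inner #'car)

;; With #f as the context, the plain symbol 'cdr is converted with no
;; lexical context, while `inner` is kept as it is.
(define outer (datum->syntax #f (list inner 'cdr)))
(define parts (syntax->list outer))

(free-identifier=? (car parts) #'car)    ; => #t, binding information preserved
(free-identifier=? (cadr parts) #'cdr)   ; => #f, the converted symbol is unbound
```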