16.2.1 Syntax Objects

The input and output of a macro transformer (i.e., source and replacement forms) are represented as syntax objects. A syntax object contains symbols, lists, and constant values (such as numbers) that essentially correspond to the quoted form of the expression. For example, a representation of the expression (+ 1 2) contains the symbol '+ and the numbers 1 and 2, all in a list. In addition to this quoted content, a syntax object associates source-location and lexical-binding information with each part of the form. The source-location information is used, for example, when reporting syntax errors, and the lexical-binding information allows the macro system to maintain lexical scope. To accommodate this extra information, the representation of the expression (+ 1 2) is not merely '(+ 1 2), but a packaging of '(+ 1 2) into a syntax object.

To create a literal syntax object, use the syntax form:

  > (syntax (+ 1 2))

  #<syntax:1:0 (+ 1 2)>

In the same way that ' abbreviates quote, #' abbreviates syntax:

  > #'(+ 1 2)

  #<syntax:1:0 (+ 1 2)>
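
The source-location information packaged with a syntax object can be inspected directly. The following is a minimal sketch, assuming the code lives in a racket/base module; the name stx is hypothetical, and the accessors syntax-source, syntax-line, syntax-column, and syntax-position are standard racket/base functions that this section does not otherwise introduce. The values they return depend on where the form was read from, so no particular output is shown for them.

```racket
#lang racket/base

;; A literal syntax object; `stx` is a hypothetical name for illustration.
(define stx #'(+ 1 2))

(syntax? stx)          ; #t: the value is a syntax object, not a plain list
(syntax-source stx)    ; where the form was read from (e.g., a file path)
(syntax-line stx)      ; line number, or #f when lines are not being counted
(syntax-column stx)    ; column number, or #f likewise
(syntax-position stx)  ; character offset, or #f if unavailable
```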

A syntax object that contains just a symbol is an identifier syntax object. Racket provides some additional operations specific to identifier syntax objects, including the identifier? operation to detect identifiers. Most notably, free-identifier=? determines whether two identifiers refer to the same binding:

  > (identifier? #'car)

  #t

  > (identifier? #'(+ 1 2))

  #f

  > (free-identifier=? #'car #'cdr)

  #f

  > (free-identifier=? #'car #'car)

  #t

  > (require (only-in racket/base [car also-car]))
  > (free-identifier=? #'car #'also-car)

  #t

  > (free-identifier=? #'car (let ([car 8])
                               #'car))

  #f

The last example above, in particular, illustrates how syntax objects preserve lexical-context information.
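
To make the distinction concrete, the sketch below compares the same two identifiers in two ways: by the symbol each one wraps, and by the binding each one refers to. The names outer-car and inner-car are hypothetical, and the code assumes a racket/base module in which car has not been redefined.

```racket
#lang racket/base

;; Refers to the `car` imported from racket/base.
(define outer-car #'car)

;; Refers to the local variable introduced by `let`.
(define inner-car (let ([car 8]) #'car))

;; Both identifiers wrap the same symbol...
(eq? (syntax-e outer-car) (syntax-e inner-car))  ; #t

;; ...but their lexical contexts resolve to different bindings.
(free-identifier=? outer-car inner-car)          ; #f
```

Comparing identifiers with eq? on their symbols discards exactly the information that free-identifier=? consults.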

To see the lists, symbols, numbers, etc. within a syntax object, use syntax->datum:

  > (syntax->datum #'(+ 1 2))

  '(+ 1 2)

The syntax-e function is similar to syntax->datum, but it unwraps a single layer of source-location and lexical-context information, leaving sub-forms that have their own information wrapped as syntax objects:

  > (syntax-e #'(+ 1 2))

  '(#<syntax:1:0 +> #<syntax:1:0 1> #<syntax:1:0 2>)

The syntax-e function always leaves syntax-object wrappers around sub-forms that are represented via symbols, numbers, and other literal values. The only time it unwraps extra sub-forms is when unwrapping a pair, in which case the cdr of the pair may be recursively unwrapped, depending on how the syntax object was constructed.
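
The difference between the two functions is easier to see on a nested form. The sketch below is illustrative only: the names stx, parts, and nested are hypothetical, and it uses syntax->list (a racket/base function not covered in this section) to obtain the immediate sub-forms as a list, rather than assuming that syntax-e returned a fully unwrapped list.

```racket
#lang racket/base

(define stx #'(+ (* 2 3) 4))

;; Immediate sub-forms, each still wrapped as a syntax object.
(define parts (syntax->list stx))

;; The nested form (* 2 3) is itself a syntax object, not a plain list.
(define nested (cadr parts))

(syntax? nested)         ; #t
(pair? (syntax-e nested)) ; #t: one more layer unwrapped by hand
(syntax->datum nested)   ; '(* 2 3)

;; syntax->datum strips every layer at once.
(syntax->datum stx)      ; '(+ (* 2 3) 4)
```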

The opposite of syntax->datum is, of course, datum->syntax. In addition to a datum like '(+ 1 2), datum->syntax needs an existing syntax object to donate its lexical context, and optionally another syntax object to donate its source location:

  > (datum->syntax #'lex
                   '(+ 1 2)
                   #'srcloc)

  #<syntax:1:0 (+ 1 2)>

In the above example, the lexical context of #'lex is used for the new syntax object, while the source location of #'srcloc is used.
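
Both donations can be checked after the fact. The following sketch assumes a racket/base module in which + has its usual binding; the names ctx, loc, and new-stx are hypothetical. It compares the line recorded in the new syntax object with the line of the donor, and checks that the + inside the new object resolves in the donor's lexical context.

```racket
#lang racket/base

(define ctx #'lex)     ; donates lexical context
(define loc #'srcloc)  ; donates source location
(define new-stx (datum->syntax ctx '(+ 1 2) loc))

(syntax->datum new-stx)                           ; '(+ 1 2)

;; The source location is taken from loc.
(equal? (syntax-line new-stx) (syntax-line loc))  ; #t

;; The + inside new-stx has ctx's lexical context, so in this module
;; it refers to the same binding as a literal #'+.
(free-identifier=? (car (syntax-e new-stx)) #'+)  ; #t
```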

When the second (i.e., the “datum”) argument to datum->syntax includes syntax objects, those syntax objects are preserved intact in the result. That is, deconstructing the result with syntax-e eventually produces the syntax objects that were given to datum->syntax.
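
As a rough illustration of this preservation, the sketch below embeds an existing syntax object in a plain list and converts the whole list. The names inner, outer, and parts are hypothetical, and syntax->list is again used only as a convenience for reaching the sub-forms. Because no source-location donor is supplied, the freshly wrapped outer form has no line number, while the embedded sub-form keeps whatever location it already had.

```racket
#lang racket/base

;; An existing syntax object with its own source location.
(define inner #'(* 2 3))

;; A datum that mixes plain values with the syntax object above.
(define outer (datum->syntax #'here (list '+ inner 4)))
(define parts (syntax->list outer))

(syntax->datum outer)  ; '(+ (* 2 3) 4)

;; No source location was donated for the new wrapper.
(syntax-line outer)    ; #f

;; The embedded syntax object still reports its original location.
(equal? (syntax-line (cadr parts)) (syntax-line inner))  ; #t
```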