1.1 Introduction
This section provides an introduction to writing robust macros with syntax-parse and syntax classes.
As a running example we use the following task: write a macro named mylet that has the same syntax and behavior as Racket’s let form. The macro should produce good error messages when used incorrectly.
Here is the specification of mylet’s syntax:
(mylet ([var-id rhs-expr] ...) body ...+)
(mylet loop-id ([var-id rhs-expr] ...) body ...+)
For simplicity, we handle only the first case for now. We return to the second case later in the introduction.
The macro can be implemented very simply using define-syntax-rule:
> (define-syntax-rule (mylet ([var rhs] ...) body ...)
    ((lambda (var ...) body ...) rhs ...))
When used correctly, the macro works, but it behaves very badly in the presence of errors. In some cases, the macro merely fails with an uninformative error message; in others, it blithely accepts illegal syntax and passes it along to lambda, with strange consequences:
> (mylet ([a 1] [b 2]) (+ a b))
3
> (mylet (b 2) (sub1 b))
mylet: use does not match pattern: (mylet ((var rhs) ...) body ...) at: (mylet (b 2) (sub1 b))
> (mylet ([1 a]) (add1 a))
lambda: not an identifier, identifier with default, or keyword at: 1
> (mylet ([#:x 1] [y 2]) (* x y))
eval:2:0: arity mismatch;
 the expected number of arguments does not match the given number
  expected: 0 plus an argument with keyword #:x
  given: 2
  arguments...:
   1
   2
These examples of illegal syntax are not to suggest that a typical programmer would make such mistakes attempting to use mylet. At least, not often, not after an initial learning curve. But macros are also used by inexpert programmers and as targets of other macros (or code generators), and many macros are far more complex than mylet. Macros must validate their syntax and report appropriate errors. Furthermore, the macro writer benefits from the machine-checked specification of syntax in the form of more readable, maintainable code.
We can improve the error behavior of the macro by using syntax-parse. First, we import syntax-parse into the transformer environment, since we will use it to implement a macro transformer.
> (require (for-syntax syntax/parse))
The following is the syntax specification above transliterated into a syntax-parse macro definition. It behaves no better than the version using define-syntax-rule above.
> (define-syntax (mylet stx)
    (syntax-parse stx
      [(_ ([var-id rhs-expr] ...) body ...+)
       #'((lambda (var-id ...) body ...) rhs-expr ...)]))
One minor difference is the use of ...+ in the pattern; ... means match zero or more repetitions of the preceding pattern; ...+ means match one or more. Only ... may be used in the template, however.
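The difference between the two ellipsis forms can be observed outside of a macro by running syntax-parse on quoted syntax at run time. The following is a minimal sketch; the helper names count-bodies/star and count-bodies/plus are hypothetical and exist only for this illustration. Each helper returns the number of body forms, or #f when its pattern does not match.

```racket
#lang racket/base
(require syntax/parse)

;; Hypothetical helper: the body pattern uses `...`, so zero bodies are accepted.
(define (count-bodies/star stx)
  (syntax-parse stx
    [(_ (binding ...) body ...) (length (syntax->list #'(body ...)))]
    [_ #f]))

;; Hypothetical helper: the body pattern uses `...+`, so at least one body is
;; required. The template still uses plain `...`.
(define (count-bodies/plus stx)
  (syntax-parse stx
    [(_ (binding ...) body ...+) (length (syntax->list #'(body ...)))]
    [_ #f]))

(count-bodies/star #'(mylet ()))      ; 0
(count-bodies/plus #'(mylet ()))      ; #f, no body form present
(count-bodies/plus #'(mylet () 1 2))  ; 2
```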
The first step toward validation and high-quality error reporting is annotating each of the macro’s pattern variables with the syntax class that describes its acceptable syntax. In mylet, each variable must be an identifier (id for short) and each right-hand side must be an expr (expression). An annotated pattern variable is written by concatenating the pattern variable name, a colon character, and the syntax class name. For an alternative to the “colon” syntax, see the ~var pattern form.
> (define-syntax (mylet stx)
    (syntax-parse stx
      [(_ ((var:id rhs:expr) ...) body ...+)
       #'((lambda (var ...) body ...) rhs ...)]))
> (mylet ([a #:whoops]) 1)
mylet: expected expression at: #:whoops
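As a rough illustration of the two annotation styles, the sketch below parses a single binding pair at run time, once with the colon notation and once with the ~var pattern form. The function names binding-name/colon and binding-name/var are hypothetical; both are intended to accept the same terms and return the variable name as a symbol.

```racket
#lang racket/base
(require syntax/parse)

;; Colon notation: var must be an identifier, rhs must be an expression.
(define (binding-name/colon stx)
  (syntax-parse stx
    [(var:id rhs:expr) (syntax-e #'var)]))

;; The same pattern written with the ~var pattern form.
(define (binding-name/var stx)
  (syntax-parse stx
    [((~var var id) (~var rhs expr)) (syntax-e #'var)]))

(binding-name/colon #'(x 1))  ; 'x
(binding-name/var #'(x 1))    ; 'x
```

Applying either function to a term such as #'(1 x) raises a syntax error, because 1 is not an identifier.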
To allow syntax-parse to synthesize better errors, we must attach descriptions to the patterns we recognize as discrete syntactic categories. One way of doing that is by defining new syntax classes (another way is the ~describe pattern form):
> (define-syntax (mylet stx)
    (define-syntax-class binding
      #:description "binding pair"
      (pattern (var:id rhs:expr)))
    (syntax-parse stx
      [(_ (b:binding ...) body ...+)
       #'((lambda (b.var ...) body ...) b.rhs ...)]))
Note that we write b.var and b.rhs now. They are the nested attributes formed from the annotated pattern variable b and the attributes var and rhs of the syntax class binding.
> (mylet (["a" 1]) (+ a 2))
mylet: expected identifier
  parsing context:
   while parsing binding pair at: "a"
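Nested attributes can also be inspected at run time, which may help when experimenting with a syntax class before using it in a macro. The sketch below defines the same binding class at phase 0 and returns the variables and right-hand sides as plain data. It also includes a variant that uses the ~describe pattern form instead of a named syntax class; in that variant the attributes are not nested, since no pattern variable b is involved. The function names split-bindings and split-bindings/describe are hypothetical.

```racket
#lang racket/base
(require syntax/parse)

(define-syntax-class binding
  #:description "binding pair"
  (pattern (var:id rhs:expr)))

;; Uses the syntax class; attributes are reached as b.var and b.rhs.
(define (split-bindings stx)
  (syntax-parse stx
    [(b:binding ...)
     (list (syntax->datum #'(b.var ...))
           (syntax->datum #'(b.rhs ...)))]))

;; Uses ~describe inline; var and rhs are ordinary pattern variables.
(define (split-bindings/describe stx)
  (syntax-parse stx
    [((~describe "binding pair" (var:id rhs:expr)) ...)
     (list (syntax->datum #'(var ...))
           (syntax->datum #'(rhs ...)))]))

(split-bindings #'([a 1] [b 2]))           ; '((a b) (1 2))
(split-bindings/describe #'([a 1] [b 2]))  ; '((a b) (1 2))
```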
> (define-syntax (mylet stx)
    (define-syntax-class binding
      #:description "binding pair"
      (pattern (var:id rhs:expr)))
    (syntax-parse stx
      [(_ (b:binding ...) body ...+)
       #:fail-when (check-duplicate-identifier
                    (syntax->list #'(b.var ...)))
                   "duplicate variable name"
       #'((lambda (b.var ...) body ...) b.rhs ...)]))
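The definition above attaches a side condition to the clause: #:fail-when is followed by a condition expression and a message, and the clause fails when the condition produces a true value. A small run-time sketch of the same idea follows. The function name check-names is hypothetical, and the sketch assumes only that check-duplicate-identifier returns #f when all identifiers are distinct and a duplicated identifier otherwise.

```racket
#lang racket/base
(require syntax/parse)

;; check-duplicate-identifier works on a list of identifiers.
(check-duplicate-identifier (list #'a #'b))                ; #f
(syntax-e (check-duplicate-identifier (list #'a #'b #'a))) ; 'a

;; Hypothetical helper: accept a sequence of identifiers only if
;; no name occurs twice.
(define (check-names stx)
  (syntax-parse stx
    [(var:id ...)
     #:fail-when (check-duplicate-identifier
                  (syntax->list #'(var ...)))
                 "duplicate variable name"
     'distinct]))

(check-names #'(a b))  ; 'distinct
```

Calling (check-names #'(a b a)) raises a syntax error rather than returning a value.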
> (define-syntax (mylet stx)
    (define-syntax-class binding
      #:description "binding pair"
      (pattern (var:id rhs:expr)))
    (define-syntax-class distinct-bindings
      #:description "sequence of distinct binding pairs"
      (pattern (b:binding ...)
               #:fail-when (check-duplicate-identifier
                            (syntax->list #'(b.var ...)))
                           "duplicate variable name"
               #:with (var ...) #'(b.var ...)
               #:with (rhs ...) #'(b.rhs ...)))
    (syntax-parse stx
      [(_ bs:distinct-bindings . body)
       #'((lambda (bs.var ...) . body) bs.rhs ...)]))
Alas, so far the macro only implements half of the functionality offered by Racket’s let. We must add the “named-let” form. That turns out to be as simple as adding a new clause:
> (define-syntax (mylet stx)
    (define-syntax-class binding
      #:description "binding pair"
      (pattern (var:id rhs:expr)))
    (define-syntax-class distinct-bindings
      #:description "sequence of distinct binding pairs"
      (pattern (b:binding ...)
               #:fail-when (check-duplicate-identifier
                            (syntax->list #'(b.var ...)))
                           "duplicate variable name"
               #:with (var ...) #'(b.var ...)
               #:with (rhs ...) #'(b.rhs ...)))
    (syntax-parse stx
      [(_ bs:distinct-bindings body ...+)
       #'((lambda (bs.var ...) body ...) bs.rhs ...)]
      [(_ loop:id bs:distinct-bindings body ...+)
       #'(letrec ([loop (lambda (bs.var ...) body ...)])
           (loop bs.rhs ...))]))
> (mylet ([a 1] [b 2]) (+ a b))
3
> (mylet (["a" 1]) (add1 a))
mylet: expected identifier
  parsing context:
   while parsing binding pair
   while parsing sequence of distinct binding pairs at: "a"
> (mylet ([a #:whoops]) 1)
mylet: expected expression
  parsing context:
   while parsing binding pair
   while parsing sequence of distinct binding pairs at: #:whoops
> (mylet ([a 1 2]) (* a a))
mylet: unexpected term
  parsing context:
   while parsing binding pair
   while parsing sequence of distinct binding pairs at: 2
> (mylet (a 1) (+ a 2))
mylet: expected binding pair
  parsing context:
   while parsing sequence of distinct binding pairs at: a
> (mylet ([a 1] [a 2]) (+ a a))
mylet: duplicate variable name
  parsing context:
   while parsing sequence of distinct binding pairs at: a
> (mylet loop ([a 1] [b 2]) (+ a b))
3
> (mylet loop (["a" 1]) (add1 a))
mylet: expected identifier
  parsing context:
   while parsing binding pair
   while parsing sequence of distinct binding pairs at: "a"
> (mylet loop ([a #:whoops]) 1)
mylet: expected expression
  parsing context:
   while parsing binding pair
   while parsing sequence of distinct binding pairs at: #:whoops
> (mylet loop ([a 1 2]) (* a a))
mylet: unexpected term
  parsing context:
   while parsing binding pair
   while parsing sequence of distinct binding pairs at: 2
> (mylet loop (a 1) (+ a 2))
mylet: expected binding pair
  parsing context:
   while parsing sequence of distinct binding pairs at: a
> (mylet loop ([a 1] [a 2]) (+ a a))
mylet: duplicate variable name
  parsing context:
   while parsing sequence of distinct binding pairs at: a
How does syntax-parse decide which clause the programmer was attempting, so it can use it as a basis for error reporting? After all, each of the bad uses of the named-let syntax is also a bad use of the normal syntax, and vice versa. And yet the macro does not produce errors like “mylet: expected sequence of distinct binding pairs at: loop.”
The answer is that syntax-parse records a list of all the potential errors (including ones like loop not matching distinct-bindings) along with the progress made before each error. Only the error with the most progress is reported.
> (mylet loop (["a" 1]) (add1 a))
mylet: expected identifier
  parsing context:
   while parsing binding pair
   while parsing sequence of distinct binding pairs at: "a"
> (mylet (["a" 1]) (add1 a))
mylet: expected identifier
  parsing context:
   while parsing binding pair
   while parsing sequence of distinct binding pairs at: "a"
> (mylet ([a 1] [a 2]) (+ a a))
mylet: duplicate variable name
  parsing context:
   while parsing sequence of distinct binding pairs at: a
> (mylet "not-even-close")
mylet: expected identifier or expected sequence of distinct binding pairs at: "not-even-close"
> (mylet)
mylet: expected more terms at: (mylet)
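To experiment with how an error is selected across clauses, one option is to run a two-clause syntax-parse at run time and capture the message of the resulting exception. The sketch below is a simplified stand-in for mylet: it omits the distinctness check, and the function name classify is hypothetical. It returns a symbol naming the clause that matched, or the error message string when no clause matches.

```racket
#lang racket/base
(require syntax/parse)

(define-syntax-class binding
  #:description "binding pair"
  (pattern (var:id rhs:expr)))

;; Hypothetical helper: report which clause matched, or return the
;; message of the syntax error that syntax-parse chose to raise.
(define (classify stx)
  (with-handlers ([exn:fail:syntax? exn-message])
    (syntax-parse stx
      [(_ (b:binding ...) body ...+) 'plain-let]
      [(_ loop:id (b:binding ...) body ...+) 'named-let])))

(classify #'(mylet ([a 1]) a))        ; 'plain-let
(classify #'(mylet loop ([a 1]) a))   ; 'named-let
;; Both clauses fail on the next term; the returned string is the
;; single message syntax-parse selected.
(classify #'(mylet loop (["a" 1]) (add1 a)))
```

Printing the string from the last call shows which of the candidate failures was reported for that term.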
This section has provided an introduction to syntax classes, side conditions, and progress-ordered error reporting. But syntax-parse has many more features. Continue to the Examples section for samples of other features in working code, or skip to the subsequent sections for the complete reference documentation.