4.1 Patterns

8.15

4.1 Patterns🔗ℹ

This section covers Redex’s pattern language, which is used in many of Redex’s forms. Patterns are matched against terms, which are represented as S-expressions.

Pattern matching uses a cache—including caching the results of side-conditions—so after a pattern has matched a given term, Redex assumes that the pattern will always match the term.

In the following grammar, literal identifiers (such as any) are matched symbolically, as opposed to using the identifier’s lexical binding:

pattern	=	any
	\|	_
	\|	number
	\|	natural
	\|	integer
	\|	real
	\|	string
	\|	boolean
	\|	variable
	\|	(variable-except id ...)
	\|	(variable-prefix id)
	\|	variable-not-otherwise-mentioned
	\|	hole
	\|	symbol
	\|	(name id pattern)
	\|	(in-hole pattern pattern)
	\|	(hide-hole pattern)
	\|	(side-condition pattern guard-expr)
	\|	(compatible-closure-context id)
	\|	(compatible-closure-context id #:wrt id)
	\|	(cross id)
	\|	(pattern-sequence ...)
	\|	other-literal

pattern-sequence	=	pattern
	\|	... ; literal ellipsis
	\|	..._id

The any pattern matches any term. This pattern may also be suffixed with an underscore and another identifier, in which case a match binds the full name (as if it were an implicit name pattern) and match the portion before the underscore.
The _ pattern matches any term, but does not bind _ as a name, nor can it be suffixed to bind a name.
The number pattern matches any number.
The number identifier can be suffixed with an underscore and additional characters, in which case the pattern binds the full name (as if it were an implicit name pattern) when matching the portion before the underscore. For example, the pattern
number_1
matches the same as number, but it also binds the identifier number_1 to the matching portion of a term.
When the same underscore suffix is used for multiple instances if number within a larger pattern, then the overall pattern matches only when all of the instances match the same number.
The natural pattern matches any exact non-negative integer. Like number, this pattern can be suffixed with an underscore and additional characters to create a binding.
The integer pattern matches any exact integer. Like number, this pattern can be suffixed with an underscore and additional characters to create a binding.
The real pattern matches any real number. Like number, this pattern can be suffixed with an underscore and additional characters to create a binding.
The string pattern matches any string. Like number, this pattern can be suffixed with an underscore and additional characters to create a binding.
The boolean pattern matches #true and #false (which are the same as #t and #f, respectively). Like number, this pattern can be suffixed with an underscore and additional characters to create a binding.
The variable pattern matches any symbol. Like number, this pattern can be suffixed with an underscore and additional characters to create a binding.
The variable-except pattern matches any symbol except those listed in its argument. This pattern is useful for ensuring that reserved words in the language are not accidentally captured by variables.
The variable-prefix pattern matches any symbol that begins with the given prefix.
The variable-not-otherwise-mentioned pattern matches any symbol except those that are used as literals elsewhere in the language.
The hole pattern matches anything when inside the first argument to an in-hole pattern. Otherwise, it matches only a hole.
The symbol pattern stands for a literal symbol that must match exactly, unless it is the name of a non-terminal in a relevant language or contains an underscore.
If symbol is a non-terminal, it matches any of the right-hand sides of the non-terminal. If the non-terminal appears twice in a single pattern, then the match is constrained to expressions that are the same, unless the pattern is part of a define-language definition or a contract (e.g., in define-metafunction, define-judgment-form, or define-relation) in which case there is no constraint. Also, the non-terminal will be bound in the expression in any surrounding side-condition patterns unless there the pattern is in a define-language definition.
If symbol is a non-terminal followed by an underscore, for example e_1, it is implicitly the same as a name pattern that matches only the non-terminal, (name e_1 e) for the example. Accordingly, repeated uses of the same name are constrained to match the same expression.
If the symbol is a non-terminal followed by _!_, for example e_!_1, it is also treated as a pattern, but repeated uses of the same pattern are constrained to be different. For example, this pattern:
(e_!_1 e_!_1 e_!_1)
matches lists of three es, but where all three of them are distinct.
If the _!_ is used under the ellipsis then the ellipsis is effectively ignored while checking to see if the es are different. For example, the pattern (e_!_1 ...) matches any sequence of es, as long as they are all distinct. Also, unlike e_1 patterns, the nesting depth of _!_ patterns do not have to be the same. For example, this pattern:
(e_!_1 ... e_!_1)
matches all sequences of es that have at least one element, as long as they are all distinct.
Unlike a _ pattern, the _!_ patterns do not bind names.
If _ names and _!_ are mixed, they are treated as separate. That is, this pattern (e_1 e_!_1) matches just the same things as (e e), but the second doesn’t bind any variables.
If the symbol otherwise has an underscore, it is an error.
The pattern (name id pattern) matches pattern and binds using it to the name id.
The (in-hole pattern pattern) pattern matches the first pattern, looking for a way to decompose the term such that the second pattern matches at some sub-expression where the hole appears while matching the first pattern.
The first pattern must be a pattern that matches with exactly one hole.
The (hide-hole pattern) pattern matches what the embedded pattern matches but if the pattern matcher is looking for a decomposition, it ignores any holes found in that pattern.
The (side-condition pattern guard-expr) pattern matches what the embedded pattern matches, and then guard-expr is evaluated. If guard-expr produces #f, the pattern fails to match, otherwise the pattern matches. Any occurrences of name in the pattern (including those implicitly present via _ patterns) are bound using term-let in guard-expr.
The (compatible-closure-context nt) pattern matches context that correspond to where the compatible closure of a relation would match. More precisely, it is a context whose shape follows the definition of nt, but allowing for a hole at each place where the definition of nt refers to itself.
For example, with this language definition:
(define-language L
  (e ::= (λ (x) e) (e e) x)
  (C ::= (λ (x) C) (C e) (e C) hole)
  (x ::= variable-not-otherwise-mentioned))
the pattern (compatible-closure-context e) is equivalent to the pattern C.
The (compatible-closure-context nt1 #:wrt nt2) pattern similarly is a context, but it decomposes terms matching the non-terminal nt1, placing a hole at each place where an nt2 non-terminal appears.
For example, with this language definition:
(define-language L
  (e ::= v (e e) x)
  (v ::= (λ (x) e))
  (C ::= V (C e) (e C) hole)
  (V ::= (λ (x) C))
  (x ::= variable-not-otherwise-mentioned))
the pattern (compatible-closure-context v #:wrt e) is equivalent to the pattern V and the pattern the pattern (compatible-closure-context e #:wrt e) is equivalent to the pattern C. More generally, leaving off the #:wrt argument is the same as using the same non-terminal twice.
The (cross nt) pattern is an unfortunately-named version of compatible-closure-context that exists for backward compatibility and does not support #:wrt.
The (pattern-sequence ...) pattern matches a term list, where each pattern-sequence element matches an element of the list. In addition, if a list pattern contains an ellipsis, the ellipsis is not treated as a literal, instead it matches any number of duplicates of the pattern that came before the ellipses (including 0). Furthermore, each (name symbol pattern) in the duplicated pattern binds a list of matches to symbol, instead of a single match. (A nested duplicated pattern creates a list of list matches, etc.) Ellipses may be placed anywhere inside the row of patterns, except in the first position or immediately after another ellipses.
Multiple ellipses are allowed. For example, this pattern:
((name x a) ... (name y a) ...)
matches this term:
(term (a a))
three different ways. One where the first a in the pattern matches nothing, and the second matches both of the occurrences of a, one where each named pattern matches a single a and one where the first matches both and the second matches nothing.
If the ellipses is named (i.e., has an underscore and a name following it, like a variable may), the pattern matcher records the length of the list and ensures that any other occurrences of the same named ellipses must have the same length.
As an example, this pattern:
((name x a) ..._1 (name y a) ..._1)
only matches this term:
(term (a a))
one way, with each named pattern matching a single a. Unlike the above, the two patterns with mismatched lengths is ruled out, due to the underscores following the ellipses.
Also, like underscore patterns above, if an underscore pattern begins with ..._!_, then the lengths must be different.
Thus, with the pattern:
((name x a) ..._!_1 (name y a) ..._!_1)
and the expression
(term (a a))
two matches occur, one where x is bound to '() and y is bound to '(a a) and one where x is bound to '(a a) and y is bound to '().
The other-literal pattern stands for a literal value—such as a number, boolean, or string—that must match exactly.

Changed in version 1.8 of package redex-lib: Non-terminals are syntactically classified as either always producing exactly one hole or may produce some other number of holes, and the first argument to in-hole is allowed to accept only patterns that produce exactly one hole.
Changed in version 1.15: Added compatible-closure-context

syntax
(redex-match lang pattern term-expr)
(redex-match lang pattern)

If redex-match is given a term-expr, it matches the pattern (in the language) against the result of term-expr. The result is #f or a list of match structures describing the matches (see match? and match-bindings).

If redex-match has only a lang and pattern, the result is a procedure for efficiently testing whether terms match the pattern with respect to the language lang. The procedure accepts a single term and returns #f or a list of match structures describing the matches.

Examples:

> (define-language nums
    (AE number
        (+ AE AE)))
> (redex-match nums
               (+ AE_1 AE_2)
               (term (+ (+ 1 2) 3)))
(list (match (list (bind 'AE_1 '(+ 1 2)) (bind 'AE_2 3))))
> (redex-match nums
               (+ AE_1 (+ AE_2 AE_3))
               (term (+ (+ 1 2) 3)))
#f
> (redex-match nums
               (+ AE_1 AE_1)
               (term (+ (+ 1 2) 3)))
#f

syntax
(redex-match? lang pattern any)
(redex-match? lang pattern)

Like redex-match, but returns only a boolean indicating whether the match was successful.

Examples:

> (define-language nums
    (AE number
        (+ AE AE)))
> (redex-match? nums
                (+ AE_1 AE_2)
                (term (+ (+ 1 2) 3)))
#t
> (redex-match? nums
                (+ AE_1 AE_1)
                (term (+ (+ 1 2) 3)))
#f

procedure
(match? val) → boolean?
val : any/c

Determines whether a value is a match structure.

procedure
(match-bindings m) → (listof bind?)
m : match?

Returns a list of bind structs that binds the pattern variables in this match.

struct
(struct bind (name exp)
    #:extra-constructor-name make-bind)
  name : symbol?
  exp : any/c

Instances of this struct are returned by redex-match. Each bind associates a name with an s-expression from the language, or a list of such s-expressions if the corresponding name clause is followed by an ellipsis. Nested ellipses produce nested lists.

parameter
(caching-enabled?) → boolean?
(caching-enabled? on?) → void?
on? : boolean?

When this parameter is #t (the default), Redex caches the results of pattern matching, metafunction, and judgment-form evaluation. There is a separate cache for each pattern, metafunction, and judgment-form; when one fills (see set-cache-size!), Redex evicts all of the entries in that cache.

Caching should be disabled when matching a pattern that depends on values other than the in-scope pattern variables or evaluating a metafunction or judgment-form that reads or writes mutable external state.

Changed in version 1.6 of package redex-lib: Extended caching to cover judgment forms.

procedure
(set-cache-size! size) → void?
size : positive-integer?

Changes the size of the per-pattern, per-metafunction and per-judgment-form caches.

The default size is 63.

parameter
(check-redundancy) → boolean?
(check-redundancy check?) → void?
check? : boolean?

Ambiguous patterns can slow down Redex’s pattern matching implementation significantly. To help debug such performance issues, set the check-redundancy parameter to #t. A true value causes Redex to, at runtime, report any redundant matches that it encounters.

Changed in version 1.9 of package redex-lib: Corrected spelling error, from check-redudancy to check-redundancy

1	Amb: A Redex Tutorial
2	Long Tutorial
3	Extended Exercises
4	The Redex Reference
5	Automated Testing Benchmark
	Bibliography
	Index

4.1	Patterns
4.2	Terms
4.3	Languages
4.4	Reduction Relations
4.5	Other Relations
4.6	Testing
4.7	GUI
4.8	Typesetting