9.5 Quantifiers

The quantifiers *, +, and ? match, respectively, zero or more, one or more, and zero or one instances of the preceding subpattern.

> (regexp-match-positions #rx"c[ad]*r" "cadaddadddr")

'((0 . 11))

> (regexp-match-positions #rx"c[ad]*r" "cr")

'((0 . 2))

> (regexp-match-positions #rx"c[ad]+r" "cadaddadddr")

'((0 . 11))

> (regexp-match-positions #rx"c[ad]+r" "cr")

#f

> (regexp-match-positions #rx"c[ad]?r" "cadaddadddr")

#f

> (regexp-match-positions #rx"c[ad]?r" "cr")

'((0 . 2))

> (regexp-match-positions #rx"c[ad]?r" "car")

'((0 . 3))

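In #px syntax, you can use braces to specify much finer-tuned quantification than is possible with *, +, and ?. The quantifier {m} matches exactly m instances of the preceding subpattern, where m is a nonnegative integer. The quantifier {m,n} matches at least m and at most n instances, where m and n are nonnegative integers with m no greater than n; either or both numbers may be omitted, in which case m defaults to 0 and n to infinity.

It follows that + and ? are abbreviations for {1,} and {0,1} respectively, and that * abbreviates {,}, which is the same as {0,}.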

> (regexp-match #px"[aeiou]{3}" "vacuous")

'("uou")

> (regexp-match #px"[aeiou]{3}" "evolve")

#f

> (regexp-match #px"[aeiou]{2,3}" "evolve")

#f

> (regexp-match #px"[aeiou]{2,3}" "zeugma")

'("eu")
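The abbreviations can be checked directly by running each shorthand quantifier and its brace form over the same inputs. The following is a minimal sketch; the names inputs, positions, and the three *-same? flags are illustrative and not part of any library.

```racket
#lang racket

;; Sample strings reused from the examples earlier in this section.
(define inputs '("cr" "car" "cadaddadddr"))

;; Collect the match positions of one pattern across all inputs.
(define (positions pattern)
  (for/list ([s (in-list inputs)])
    (regexp-match-positions pattern s)))

;; Each shorthand should behave exactly like its brace form.
(define star-same? (equal? (positions #px"c[ad]*r") (positions #px"c[ad]{0,}r")))
(define plus-same? (equal? (positions #px"c[ad]+r") (positions #px"c[ad]{1,}r")))
(define opt-same?  (equal? (positions #px"c[ad]?r") (positions #px"c[ad]{0,1}r")))

(list star-same? plus-same? opt-same?)
```

All three comparisons should come out true, since each pair of patterns describes the same set of repetitions.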

The quantifiers described so far are all greedy: they match the maximal number of instances that would still lead to an overall match for the full pattern.

> (regexp-match #rx"<.*>" "<tag1> <tag2> <tag3>")

'("<tag1> <tag2> <tag3>")

To make these quantifiers non-greedy, append a ? to them. Non-greedy quantifiers match the minimal number of instances needed to ensure an overall match.

> (regexp-match #rx"<.*?>" "<tag1> <tag2> <tag3>")

'("<tag1>")
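The difference becomes more visible when collecting every match rather than only the first. The sketch below uses regexp-match* on the same string; the names s, greedy-tags, and lazy-tags are chosen here for illustration.

```racket
#lang racket

(define s "<tag1> <tag2> <tag3>")

;; Greedy: .* runs to the last > in the string, so there is only one match.
(define greedy-tags (regexp-match* #rx"<.*>" s))

;; Non-greedy: .*? stops at the first > after each <, giving one match per tag.
(define lazy-tags (regexp-match* #rx"<.*?>" s))

(list greedy-tags lazy-tags)
```

Here greedy-tags holds the whole string as a single match, while lazy-tags holds the three tags separately.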

The non-greedy quantifiers are *?, +?, ??, {m}?, and {m,n}?, although {m}? is always the same as {m}. Note that the metacharacter ? has two different uses, and both uses are represented in ??.
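As a rough illustration of ?? and {m,n}?, the sketch below compares each with its greedy counterpart. The variable names are hypothetical, and the input strings are reused from earlier examples in this section.

```racket
#lang racket

;; ? versus ??: both allow zero or one instance, but ?? prefers zero.
(define greedy-opt (regexp-match #rx"c[ad]?" "car"))   ; takes the optional "a"
(define lazy-opt   (regexp-match #rx"c[ad]??" "car"))  ; skips the optional "a"

;; A non-greedy quantifier still takes more when the rest of the pattern needs it.
(define lazy-forced (regexp-match #rx"c[ad]??r" "car"))

;; {m,n} versus {m,n}?: the non-greedy form stops at the lower bound if it can.
(define greedy-range (regexp-match #px"[aeiou]{2,3}" "vacuous"))
(define lazy-range   (regexp-match #px"[aeiou]{2,3}?" "vacuous"))

(list greedy-opt lazy-opt lazy-forced greedy-range lazy-range)
```

The greedy forms match "ca" and "uou", while the non-greedy forms match "c" and "uo". With the trailing r in the pattern, the non-greedy ?? must still consume the "a", so the match is "car".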