5 Occurrence Typing
5.1 Basic Occurrence Typing
One of Typed Racket’s distinguishing type system features is occurrence typing, which allows the type system to ascribe more precise types based on whether a predicate check succeeds or fails.
To illustrate, consider the following code:
(: flexible-length (-> (U String (Listof Any)) Integer)) (define (flexible-length str-or-lst) (if (string? str-or-lst) (string-length str-or-lst) (length str-or-lst)))
The flexible-length function above computes the length of either a string or a list. The function body uses the typical Racket idiom of dispatching using a predicate (e.g., string?).
Typed Racket successfully type-checks this function because the type system understands that in the "then" branch of the if expression, the predicate string? must have returned a true value. The type system further knows that if string? returns true, then the str-or-lst variable must have type String and can narrow the type from its original union of String and (Listof Any). This allows the call to string-length in the "then" branch to type-check successfully.
Furthermore, the type system also knows that in the "else" branch of the if expression, the predicate must have returned #f. This implies that the variable str-or-lst must have type (Listof Any) by process of elimination, and thus the call (length str-or-lst) type-checks.
To summarize, if Typed Racket can determine the type a variable must have based on a predicate check in a conditional expression, it can narrow the type of the variable within the appropriate branch of the conditional.
5.2 Propositions and Predicates
In the previous section, we demonstrated that a Typed Racket programmer can take advantage of occurrence typing to type-check functions with union types and conditionals. This may raise the question: how does Typed Racket know how to narrow the type based on the predicate?
The answer is that predicate types in Typed Racket are annotated with logical propositions that tell the typechecker what additional information is gained when a predicate check succeeds or fails.
For example, consider the REPL’s type printout for string?:
- : (-> Any Boolean : String)
The type (-> Any Boolean : String) has three parts. The first two are the same as any other function type and indicate that the predicate takes any value and returns a boolean. The third part, after the :, represents the logical propositions the typechecker learns from the result of applying the function:
If the predicate check succeeds (i.e. produces a non-#f value), the argument variable has type String
If the predicate check fails (i.e. produces #f), the argument variable does not have type String
Predicates for all built-in types are annotated with similar propositions that allow the type system to reason logically about predicate checks.
5.2.1 One-sided Propositions
Sometimes, a predicate may provide information when it succeeds, but not when it fails. For instance, consider this function:
(define (legal-id? s) (and (symbol? s) (not (member s '(cond else if)))))
This function only returns #t when given a symbol, so the type of something that satisfies this predicate can be refined to Symbol.
However, values that fail this predicate can’t be refined to non-symbols; symbols such as 'else also fail to satisfy this predicate.
In cases such as these, it’s possible to provide a proposition that’s applied only to the “positive” assertion. Specifically, this type
... captures the idea that if this predicate returns #t, the argument is known to be a Symbol, without making any claim at all about values for which this predicate returns #f.
There is a negative form as well, which allows types that specify propositions only about values that cause a predicate to return #f.
5.3 Other conditionals and assertions
After all, these control flow constructs macro-expand to if in the end.
So far, we have seen that occurrence typing allows precise reasoning about if expressions. Occurrence typing works for most control flow constructs that are present in Racket such as cond, when, and others.
For example, the flexible-length function from earlier can be re-written to use cond with no additional effort:
(: flexible-length/cond (-> (U String (Listof Any)) Integer)) (define (flexible-length/cond str-or-lst) (cond [(string? str-or-lst) (string-length str-or-lst)] [else (length str-or-lst)]))
In some cases, the type system does not have enough information or is too conservative to typecheck an expression. For example, consider the following interaction:
> (: a Positive-Integer) > (define a 15) > (: b Positive-Integer) > (define b 20) > (: c Positive-Integer) > (define c (- b a))
eval:13:0: Type Checker: type mismatch
In this case, the type system only knows that a and b are positive integers and cannot conclude that their difference will always be positive in defining c. In cases like this, occurrence typing can be used to make the code type-check using an assertion. For example,
(: d Positive-Integer) (define d (assert (- b a) positive?))
Using the logical propositions on positive?, Typed Racket can assign the type Positive-Integer to the whole assert expression. This type-checks, but note that the assertion may raise an exception at run-time if the predicate returns #f.
Note that assert is a derived concept in Typed Racket and is a natural consequence of occurrence typing. The assertion above is essentially equivalent to the following:
(: e Positive-Integer) (define e (let ([diff (- b a)]) (if (positive? diff) diff (error "Assertion failed"))))
5.4 A caveat about set!
If a variable is ever mutated with set! in the scope in which it is defined, Typed Racket cannot use occurrence typing with that variable. This precaution is needed to ensure that concurrent modification of a variable does not invalidate Typed Racket’s knowledge of the type of that variable. Also see Guidelines for Using Assignment.
Furthermore, this means that the types of top-level variables in the REPL cannot be refined by Typed Racket either. This is because the scope of a top-level variable includes future top-level interactions, which may include mutations. It is possible to work around this by moving the variable inside of a module or into a local binding form like let.
5.5 Access to structure fields
Occurrence typing can work with accessors to immutable structure fields.
(struct apple ([a : Any])) (struct (A) fruit ([a : A])) (define (f [obj : Any]) : Number (cond [(and (apple? obj) (number? (apple-a obj))) (apple-a obj)] [(and (fruit? obj) (number? (fruit-a obj))) (fruit-a obj)] [else 42]))
Typed Racket is able to reason about some cases when variables introduced by let-expressions alias other values (e.g. when they alias non-mutated identifiers, car/cdr/struct accesses into immutable values, etc...). This allows programs which explicitly rely on occurrence typing and aliasing to typecheck:
(: f (Any -> Number)) (define (f x) (let ([y x]) (cond [(number? y) x] [(and (pair? y) (number? (car y))) (car x)] [else 42])))
It also allows the typechecker to check programs which use macros that heavily rely on let-bindings internally (such as match):
(: g (Any -> Number)) (define (g x) (match x [(? number?) x] [`(_ _ . ,(? number?)) (cddr x)] [`(_ _ . ,(? pair? p)) (if (number? (caddr x)) (car p) 41)] [_ 42]))