3 Units of Code

3.1 Organization Matters

We often develop units of code in a bottom-up fashion with some top-down planning. There is nothing surprising about this strategy: we build code atop existing libraries, which takes some experimentation, which in turn is done in the REPL. We also want testable code quickly, meaning we tend to write down first those pieces of code for which we can develop and run tests. Readers don’t wish to follow our development, however; they wish to understand what the code computes without necessarily understanding all the details.

So, please take the time to present each unit of code in a top-down manner. This starts with the implementation part of a module. Put the important functions close to the top, right below any code and comments as to what kind of data you use. The rule also applies to classes, where you want to expose public methods before you tackle private methods. And the rule applies to units, too.
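As a minimal sketch of this top-down ordering, consider the following hypothetical module. All names in it (cart-total, sum-of-items, add-tax, TAX-RATE) are made up for illustration. The data description comes first, the important function follows, and the helpers and constants appear below it, which works because module-level definitions may refer to later ones.

```racket
#lang racket/base

;; the module computes the total price of a shopping cart
;; (hypothetical example; none of these names come from a library)

(provide cart-total (struct-out item))

;; a Cart is a [Listof Item]
;; an Item is (item String Real Natural)
(struct item (name price quantity))

;; Cart -> Real
;; the important function comes first, even though it relies on
;; helpers that are defined further down in the module
(define (cart-total cart)
  (add-tax (sum-of-items cart)))

;; Cart -> Real
;; add up price times quantity for all items
(define (sum-of-items cart)
  (for/sum ([i (in-list cart)])
    (* (item-price i) (item-quantity i))))

;; Real -> Real
;; add the sales tax to the given amount
(define (add-tax amount)
  (* amount (+ 1 TAX-RATE)))

(define TAX-RATE 1/10)
```

A reader who only wants to know what the module computes can stop after cart-total.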

3.2 Size Matters

Keep units of code small. Keep modules, classes, functions and methods small.

A module of 10,000 lines of code is too large. A module of 1,000 lines is tolerable. A module of 500 lines of code has the right size.

One module should usually contain a class and its auxiliary functions, which in turn determines the length of a good-sized class.

And a function/method/syntax-case of roughly 66 lines is usually acceptable. The 66 is based on the length of a screen with a small font; it really means "a screen length." Yes, there are exceptions where functions are more than 1,000 lines long and extremely readable. Nesting levels and nested loops may look fine to you when you write code, but readers will not appreciate having to keep implicit and tangled dependencies in their minds. It really helps the reader if you separate functions (with what you may call manual lambda lifting) into a reasonably flat organization of units that fit on a (laptop) screen and have explicit dependencies.
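The following sketch illustrates manual lambda lifting on a deliberately tiny, hypothetical example (normalize, scaling-factor, and scale are invented names). In the nested version the inner function silently depends on a variable of the enclosing function; in the lifted version every helper is a module-level unit that receives what it needs as an argument. A function this small would not need lifting in practice; the point is only the shape of the transformation.

```racket
#lang racket/base

;; nested: `scale` implicitly depends on `factor` from the enclosing function
(define (normalize-nested scores)
  (define factor (/ 100 (apply max scores)))
  (define (scale s) (* s factor))
  (map scale scores))

;; lifted: flat organization with explicit dependencies

;; [Listof Positive-Real] -> [Listof Real]
;; scale the scores so that the largest one becomes 100
(define (normalize scores)
  (define factor (scaling-factor scores))
  (for/list ([s (in-list scores)])
    (scale s factor)))

;; [Listof Positive-Real] -> Real
(define (scaling-factor scores)
  (/ 100 (apply max scores)))

;; Real Real -> Real
(define (scale s factor)
  (* s factor))
```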

For many years we had a limited syntax transformation language that forced people to create huge functions. This is no longer the case, so we should try to stick to the rule whenever possible.

If a unit of code looks incomprehensible, it is probably too large. Break it up. To bring across what the pieces compute, implement or serve, use meaningful names; see Names. If you can’t come up with a good name for such pieces, you are probably looking at the wrong kind of division; consider alternatives.

3.3 Modules and their Interfaces

The purpose of a module is to provide some services:

Equip a module with a short purpose statement.

Often “short” means one line; occasionally you may need several lines.

In order to understand a module’s services, organize the module in three sections below the purpose statement: its exports, its imports, and its implementation:

good
#lang racket/base

; the module implements a tv server

(provide
 ; launch the tv server function
 tv-launch
 ; set up a tv client to receive messages from the tv server
 tv-client)

; ---------------------------------
; import and implementation section

(require 2htdp/universe htdp/image)

(define (tv-launch)
  (universe ...))

(define (tv-client)
  (big-bang ...))

If you choose to use provide with contract-out, you may wish to have two require sections:

the first one, placed with the provide section, imports the values needed to formulate the contracts and
the second one, placed below the provide section, imports the values needed to implement the services.

If your contracts call for additional concepts, define those right below the provide specification:

good
#lang racket/base

; the module implements a game board

(provide
 (contract-out
  ; initialize the board for the given number of players
  [board-init        (-> player#/c plain-board/c)]
  ; initialize a board and place the tiles
  [create-board      (-> player#/c (listof placement/c)
                         (or/c plain-board/c string?))]
  ; create a board from an X-expression representation
  [board-deserialize (-> xexpr? plain-board/c)]))

(require xml)

(define player# 3)
(define plain-board/c
  (instanceof/c (and/c admin-board%/c board%-contracts/c)))

(define placement/c
  (flat-named-contract "placement" ...))

; ---------------------------------
; import and implementation section

(require 2htdp/universe htdp/image)

; implementation:
(define (board-init n)
  (new board% ...))

(define (create-board n lop)
  (define board (board-init n))
  ...)

(define board%
  (class ... some 900 lines ...))

In the preceding code snippet, xml imports the xexpr? predicate. Since the latter is needed to articulate the contract for board-deserialize, the require line for xml is a part of the provide section. In contrast, the require line below the separator comment imports an event-handling mechanism plus a simple image manipulation library, and these tools are needed only for the implementation of the provided services.

Prefer specific export specifications over (provide (all-defined-out)).
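To see why, consider this sketch of a hypothetical counters module, written as a submodule so that everything fits into one file. The names make-counter and counter-tick! are invented for the example. The specific provide exposes exactly two services, each with a purpose statement, while all-defined-out would also export the accessor and mutator that the struct definition generates.

```racket
#lang racket/base

(module counters racket/base
  (provide
   ; create a fresh counter that starts at 0
   make-counter
   ; add 1 to the given counter and return the new count
   counter-tick!)

  ; (provide (all-defined-out)) would also export counter-count and
  ; set-counter-count!, so clients could reset a counter at will

  (struct counter (count) #:mutable)

  (define (make-counter)
    (counter 0))

  (define (counter-tick! c)
    (set-counter-count! c (+ 1 (counter-count c)))
    (counter-count c)))

(require 'counters)

(define c (make-counter))
(counter-tick! c) ; the count is now 1
```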

A test suite section—if located within the module—should come at the very end, including its specific dependencies, i.e., require specifications.

3.3.1 Require

With require specifications at the top of the implementation section, you let every reader know what is needed to understand the module.

3.3.2 Provide

A module’s interface describes the services it provides; its body implements these services. Others have to read the interface if the external documentation doesn’t suffice:

Place the interface at the top of the module.

This helps people find the relevant information quickly.

good
#lang racket

; This module implements
; several strategies.

(provide
 ; Stgy = State -> Action

 ; Stgy
 ; people's strategy
 human-strategy

 ; Stgy
 ; tree traversal
 ai-strategy)

; ———————————
; implementation

(require "basics.rkt")

(define (general p)
  ...)

... some 100 lines ...
(define human-strategy
  (general create-gui))

... some 100 lines ...
(define ai-strategy
  (general traversal))

bad
#lang racket

; This module implements
; several strategies.

; ———————————
; implementation

(require "basics.rkt")

; Stgy = State -> Action

(define (general p)
  ...)
... some 100 lines ...

(provide
; Stgy
; a person's strategy
human-strategy)

(define human-strategy
  (general create-gui))
... some 100 lines ...

(provide
; Stgy
; a tree traversal
ai-strategy)

(define ai-strategy
  (general traversal))
... some 100 lines ...

As you can see from this comparison, an interface shouldn’t just provide a list of names. Each identifier should come with a purpose statement. Type-like explanations of data may also show up in a provide specification so that readers understand what kind of data your public functions work on.

While a one-line purpose statement for a function is usually enough, syntax should come with a description of the grammar clause it introduces and its meaning.

good
#lang racket

(provide
 ; (define-strategy (s:id a:id b:id c:id d:id)
 ;   action:definition-or-expression)
 ;
 ; (define-strategy (s board tiles available score) ...)
 ; defines a function from an instance of player to a
 ; placement. The four identifiers denote the state of
 ; the board, the player's hand, the places where a
 ; tile can be placed, and the player's current score.
 define-strategy)

Use provide with contract-out for module interfaces. Contracts often provide the right level of specification for first-time readers.

At a minimum, you should use type-like contracts, i.e., predicates that check for the constructor of data. They cost almost nothing, especially because exported functions tend to check such constraints internally anyway and contracts tend to render such checks superfluous.

If you discover that contracts create a performance bottleneck, please report the problem to the Racket developer mailing list.
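Here is a minimal sketch of a type-like contract at a module interface. The stats submodule and its mean function are hypothetical; the submodule plays the server and the enclosing module plays the client, so that calls actually cross the contract boundary. Note that the body of mean contains no check for the empty list because the contract already rules it out.

```racket
#lang racket

(module stats racket
  (provide
   (contract-out
    ; compute the arithmetic mean of a non-empty list of real numbers
    [mean (-> (non-empty-listof real?) real?)]))

  (define (mean l)
    (/ (apply + l) (length l))))

(require 'stats)

(mean '(1 2 3))
```

Calling (mean '()) from the client raises a contract violation that blames the client instead of signaling a division-by-zero error from deep inside the implementation.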

3.3.3 Uniformity of Interface

Pick a rule for consistently naming your functions, classes, and methods. Stick to it. For example, you may wish to prefix all exported names with the name of the data type that they deal with, say syntax-local.

Pick a rule for consistently naming and ordering the parameters of your functions and methods. Stick to it. For example, if your module implements an abstract data type (ADT), all functions on the ADT should consume the ADT-argument first or last.

Finally, pick the same name for all function/method arguments in a module that refer to the same kind of data—regardless of whether the module implements a common data structure. For example, in "collects/setup/scribble", all functions use latex-dest to refer to the same kind of data, even those that are not exported.
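The three rules can be sketched with a small, hypothetical queue ADT (the list-based representation is deliberately naive). Every exported name carries the queue- prefix, every function consumes the queue first, and the queue argument is always called q.

```racket
#lang racket/base

(provide queue-empty queue-empty? queue-add queue-first queue-rest)

;; a Queue is a [Listof Any], oldest element first

;; Queue
(define queue-empty '())

;; Queue -> Boolean
(define (queue-empty? q)
  (null? q))

;; Queue Any -> Queue
;; add x at the end of q
(define (queue-add q x)
  (append q (list x)))

;; Queue -> Any
;; retrieve the oldest element of a non-empty q
(define (queue-first q)
  (car q))

;; Queue -> Queue
;; drop the oldest element of a non-empty q
(define (queue-rest q)
  (cdr q))
```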

3.3.4 Sections and Sub-modules

Finally, a module consists of sections. It is good practice to separate the sections with comment lines. You may want to write down purpose statements for sections so that readers can easily understand which part of a module implements which service. Alternatively, consider using the large letter chapter headings in DrRacket to label the sections of a module.

With rackunit, test suites can be defined within the module using define/provide-test-suite. If you do so, locate the test section at the end of the module and require the necessary pieces for testing specifically for the test suites.
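A minimal sketch of this arrangement follows. The function square-area and the suite name square-area-tests are hypothetical; define/provide-test-suite, test-case, and check-equal? come from rackunit. The require for rackunit sits in the test section because only the tests need it.

```racket
#lang racket/base

(provide
 ; Real -> Real
 ; compute the area of a square with the given side length
 square-area)

(define (square-area side)
  (* side side))

; ---------------------------------
; test section, with its own dependencies

(require rackunit)

(define/provide-test-suite square-area-tests
  (test-case "unit square"
    (check-equal? (square-area 1) 1))
  (test-case "larger square"
    (check-equal? (square-area 3) 9)))
```

A client can then run the suite, for example with run-tests from rackunit/text-ui.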

As of version 5.3, Racket supports sub-modules. Use sub-modules to formulate sections, especially test sections. With sub-modules it is now possible to break up sections into distinct parts (labeled with the same name) and leave it to the language to stitch pieces together.

fahrenheit.rkt
#lang racket

(provide
  (contract-out
    ; convert a fahrenheit temperature to a celsius temperature
    [fahrenheit->celsius (-> number? number?)]))

(define (fahrenheit->celsius f)
  (/ (* 5 (- f 32)) 9))

(module+ test
  (require rackunit)
  (check-equal? (fahrenheit->celsius -40) -40)
  (check-equal? (fahrenheit->celsius 32) 0)
  (check-equal? (fahrenheit->celsius 212) 100))

If you develop your code in DrRacket, it will run the test sub-module every time you click “run” unless you explicitly disable this functionality in the language selection menu. If you have a file and you just wish to run the tests, use raco to do so:

$ raco test fahrenheit.rkt

Running this command in a shell will require and evaluate the test sub-module from fahrenheit.rkt.
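The remark about breaking a section into distinct parts with the same name can be sketched as follows. The module is hypothetical and uses invented function names. Each function is followed directly by its own piece of the test sub-module, and Racket combines the two module+ pieces into a single test sub-module, so the second piece can rely on the require from the first.

```racket
#lang racket/base

(provide
 ; convert a celsius temperature to a kelvin temperature
 celsius->kelvin
 ; convert a kelvin temperature to a celsius temperature
 kelvin->celsius)

(define (celsius->kelvin c)
  (+ c 27315/100))

(module+ test
  (require rackunit)
  (check-equal? (celsius->kelvin 0) 27315/100))

(define (kelvin->celsius k)
  (- k 27315/100))

(module+ test
  ; rackunit is already available from the first piece
  (check-equal? (kelvin->celsius 27315/100) 0)
  (check-equal? (kelvin->celsius (celsius->kelvin 100)) 100))
```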

3.4 Classes & Units

(I will write something here sooner or later.)

3.5 Functions & Methods

If your function or method consumes more than two parameters, consider keyword arguments so that call sites can easily be understood. Keyword arguments also “thin” out calls because function calls don’t need to refer to default values of arguments that are considered optional.

Similarly, if your function or method consumes two (or more) optional parameters, keyword arguments are a must.

Write a purpose statement for your function. If you can, add an informal type and/or contract statement.
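As a sketch of these guidelines, here is a hypothetical function pad-text with one mandatory keyword argument and two optional ones, preceded by an informal type and a purpose statement. The name and its parameters are made up for illustration.

```racket
#lang racket/base

;; String #:width Natural [#:pad Char] [#:align (or/c 'left 'right)] -> String
;; pad the given text with the pad character until it is width characters wide
(define (pad-text text
                  #:width width
                  #:pad [pad #\space]
                  #:align [align 'left])
  (define missing (max 0 (- width (string-length text))))
  (define filler (make-string missing pad))
  (if (eq? align 'left)
      (string-append text filler)
      (string-append filler text)))
```

A call site such as (pad-text "ab" #:width 4 #:align 'right) explains itself and does not need to mention the default pad character.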

3.6 Contracts

A contract establishes a boundary between a service provider and a service consumer, also known as server and client. For historical reasons, we tend to refer to this boundary as a module boundary, but the use of "module" in this phrase does not refer only to file-based or physical Racket modules. Clearly, contract boundary is a better term than module boundary because it separates the two concepts.

When you use provide with contract-out at the module level, the boundary of the physical module and the contract boundary coincide.

When a module becomes too large to manage without contracts but you do not wish to distribute the source over several files, you may wish to use one of the following two constructs to erect contract boundaries internal to the physical module:

define/contract
module, as in submodule.

Using the first one, define/contract, is like using define except that it is also possible to add a contract between the header of the definition and its body. The following code display shows a file that erects three internal contract boundaries: two for plain constants and one for a function.

celsius.rkt
#lang racket

(define/contract AbsoluteC real? -273.15)
(define/contract AbsoluteF real? -459.67)

(define/contract (celsius->fahrenheit c)
  ; convert a celsius temperature to a fahrenheit temperature
  (-> (and/c real? (>=/c AbsoluteC))
      (and/c real? (>=/c AbsoluteF)))
  ; -- IN --
  (+ (* 9/5 c) 32))

(module+ test
  (require rackunit)
  (check-equal? (celsius->fahrenheit -40) -40)
  (check-equal? (celsius->fahrenheit 0) 32)
  (check-equal? (celsius->fahrenheit 100) 212))

To find out how these contract boundaries work, you may wish to conduct some experiments:

Add the following line to the bottom of the file:
(celsius->fahrenheit -300)
Save the file, run it, and observe how the contract system blames this line and what the blame report tells you.
Replace the body of the celsius->fahrenheit function with
(sqrt c)
Once again, run the program and study the contract exception; in particular, observe which party gets blamed.
Change the right-hand side of AbsoluteC to 0-273.15i, i.e., a complex number. This time a different contract party gets blamed.
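If you prefer to study a blame report without aborting the program, you can catch the exception. The sketch below repeats the definitions from celsius.rkt and adds a hypothetical helper, try-conversion, that returns the message of the blame exception as a string; exn:fail:contract:blame? is the predicate for exceptions raised by the contract system.

```racket
#lang racket

(define/contract AbsoluteC real? -273.15)
(define/contract AbsoluteF real? -459.67)

(define/contract (celsius->fahrenheit c)
  ; convert a celsius temperature to a fahrenheit temperature
  (-> (and/c real? (>=/c AbsoluteC))
      (and/c real? (>=/c AbsoluteF)))
  (+ (* 9/5 c) 32))

;; Real -> (or/c Real String)
;; hypothetical helper: run the conversion; if the contract system
;; intervenes, return the text of the blame report instead
(define (try-conversion c)
  (with-handlers ([exn:fail:contract:blame? exn-message])
    (celsius->fahrenheit c)))
```

Evaluating (try-conversion -300) in the REPL yields the report as a string, which you can then read at leisure.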

The screen shot below shows that define/contract works for mutually recursive functions with modules. This capability is unique to define/contract.

[Screenshot: Mutually recursive functions with contracts]
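A minimal sketch of mutually recursive functions under define/contract looks as follows. The functions my-even? and my-odd? are hypothetical stand-ins and are not taken from the screenshot. Each definition erects its own contract boundary, so even the recursive calls between the two functions are checked.

```racket
#lang racket

(define/contract (my-even? n)
  ; determine whether n is even
  (-> natural-number/c boolean?)
  (or (zero? n) (my-odd? (- n 1))))

(define/contract (my-odd? n)
  ; determine whether n is odd
  (-> natural-number/c boolean?)
  (and (not (zero? n)) (my-even? (- n 1))))
```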

In contrast, submodules act exactly like plain modules when it comes to contract boundaries. Like define/contract, a submodule establishes a contract boundary between itself and the rest of the module. Any value flow between a client module and the submodule is governed by contracts. Any value flow within the submodule is free of any constraints.

graph-traversal.rkt
#lang racket
...
(module traversal racket
  (provide
   (contract-out
    (find-path (-> graph? node? node? (option/c path?)))))

  (require (submod ".." graph) (submod ".." contract))

  (define (find-path G s d (visited history0))
    (cond
      [(node=? s d) '()]
      [(been-here? s visited) #f]
      [else (define neighbors (node-neighbors G s))
            (define there (record s visited))
            (define path (find-path* G neighbors d there))
            (if path (cons s path) #f)]))

  (define (find-path* G s* d visited)
    (cond
      [(empty? s*) #f]
      [else (or (find-path G (first s*) d visited)
                (find-path* G (rest s*) d visited))]))

  (define (node-neighbors G n)
    (rest (assq n G))))

(module+ test
  (require (submod ".." traversal) (submod ".." graph))
  (find-path G 'a 'd))

Since modules and submodules cannot refer to each other in a mutually recursive fashion, submodule contract boundaries cannot enforce constraints on mutually recursive functions. It would thus be impossible to distribute the find-path and find-path* functions from the preceding code display into two distinct submodules.
