6 Text Generation
The scribble/text language provides everything from racket/base with a
few changes that make it suitable as a text-generation or preprocessor
language:
The language uses read-syntax-inside to read the body
of the module, similar to Document Reader. This means that
by default, all text is read in as Racket strings, and
@-forms can be used to call Racket functions and to escape to
Racket expressions.
Values of expressions are printed with a custom output
function. This function displays most values in a similar way
to display, except that it is more convenient for
textual output.
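The behavior of the output function can also be observed from plain Racket. The following is a minimal sketch, assuming the output function is obtained by requiring the scribble/text library directly; the helper name render is made up for illustration and is not part of the library.

```racket
#lang racket/base
(require scribble/text racket/port)

;; `render` is a hypothetical helper: it collects whatever `output`
;; prints into a string so that the result can be inspected.
(define (render v)
  (with-output-to-string (lambda () (output v))))

;; Strings and numbers are shown as `display` would show them, nested
;; lists are traversed, and void and #f values produce no output.
(define result
  (render (list "You have " 3 (list " error" (and #t "s")) (void) #f ".")))
```

Under these assumptions, result should be the string "You have 3 errors.", with nothing printed for the void and false values.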
6.1 Writing Text Files
Together, these two features mean that text in a scribble/text file
is read as strings, which are printed when the module is required,
for example when the file is given as an argument to racket. (In
these examples, the source input is shown first, and the printed
result follows the → marker.)
| #lang scribble/text | Programming languages should | be designed not by piling | feature on top of feature, but | blah blah blah. |
|
| → | Programming languages should | be designed not by piling | feature on top of feature, but | blah blah blah. |
|
Using @-forms, we can define and use Racket
functions.
| #lang scribble/text | @(require racket/list) | @(define Foo "Preprocessing") | @(define (3x . x) | ;; racket syntax here | (add-between (list x x x) " ")) | @Foo languages should | be designed not by piling | feature on top of feature, but | @3x{blah}. |
|
| → | Preprocessing languages should | be designed not by piling | feature on top of feature, but | blah blah blah. |
|
As demonstrated in this case, the output function simply
scans nested list structures recursively, which makes them convenient
for function results. In addition, output prints most values
similarly to display — notable exceptions are void and
false values which cause no output to appear. This can be used for
convenient conditional output.
| #lang scribble/text | @(define (errors n) | (list n | " error" | (and (not (= n 1)) "s"))) | You have @errors[3] in your code, | I fixed @errors[1]. |
|
| → | You have 3 errors in your code, | I fixed 1 error. |
|
Using the scribble @-forms syntax, you can write
functions more conveniently too.
| #lang scribble/text | @(define (errors n) | ;; note the use of `unless' | @list{@n error@unless[(= n 1)]{s}}) | You have @errors[3] in your code, | I fixed @errors[1]. |
|
| → | You have 3 errors in your code, | I fixed 1 error. |
|
Following the details of the scribble reader, you may notice that in
these examples there are newline strings after each definition, yet
they do not show in the output. To make it easier to write
definitions, newlines after definitions and indentation spaces before
them are ignored.
| #lang scribble/text |
| @(define (plural n) | (unless (= n 1) "s")) |
| @(define (errors n) | @list{@n error@plural[n]}) |
| You have @errors[3] in your code, | @(define fixed 1) | I fixed @errors[fixed]. |
|
| → | You have 3 errors in your code, | I fixed 1 error. |
|
These end-of-line newline strings are not ignored when they follow
other kinds of expressions, which may lead to redundant empty lines in
the output.
| #lang scribble/text | @(define (count n str) | (for/list ([i (in-range 1 (add1 n))]) | @list{@i @str,@"\n"})) | Start... | @count[3]{Mississippi} | ... and I'm done. |
|
| → | Start... | 1 Mississippi, | 2 Mississippi, | 3 Mississippi, |
| ... and I'm done. |
|
There are several ways to avoid having such empty lines in your
output. The simplest way is to arrange for the function call’s form
to end right before the next line begins, but this is often not too
convenient. An alternative is to use a @; comment, which
makes the scribble reader ignore everything that follows it up to and
including the newline. (These methods can be applied to the line that
precedes the function call too, but the results are likely to have
what looks like erroneous indentation. More about this below.)
| #lang scribble/text | @(define (count n str) | (for/list ([i (in-range 1 (+ n 1))]) | @list{@i @str,@"\n"})) | Start... | @count[3]{Mississippi | }... done once. |
| Start again... | @count[3]{Massachusetts}@; | ... and I'm done again. |
|
| → | Start... | 1 Mississippi, | 2 Mississippi, | 3 Mississippi, | ... done once. |
| Start again... | 1 Massachusetts, | 2 Massachusetts, | 3 Massachusetts, | ... and I'm done again. |
|
A better approach is to generate newlines only when needed.
| #lang scribble/text | @(require racket/list) | @(define (counts n str) | (add-between | (for/list ([i (in-range 1 (+ n 1))]) | @list{@i @str,}) | "\n")) | Start... | @counts[3]{Mississippi} | ... and I'm done. |
|
| → | Start... | 1 Mississippi, | 2 Mississippi, | 3 Mississippi, | ... and I'm done. |
|
In fact, this is common enough that the scribble/text
language provides a convenient facility: add-newlines is a
function that is similar to add-between using a newline
string as the default separator, except that false and void values are
filtered out before doing so.
| #lang scribble/text | @(define (count n str) | (add-newlines | (for/list ([i (in-range 1 (+ n 1))]) | @list{@i @str,}))) | Start... | @count[3]{Mississippi} | ... and I'm done. |
|
| → | Start... | 1 Mississippi, | 2 Mississippi, | 3 Mississippi, | ... and I'm done. |
|
| #lang scribble/text | @(define (count n str) | (add-newlines | (for/list ([i (in-range 1 (+ n 1))]) | @(and (even? i) @list{@i @str,})))) | Start... | @count[6]{Mississippi} | ... and I'm done. |
|
| → | Start... | 2 Mississippi, | 4 Mississippi, | 6 Mississippi, | ... and I'm done. |
|
The separator can be set to any value.
| #lang scribble/text | @(define (count n str) | (add-newlines #:sep ",\n" | (for/list ([i (in-range 1 (+ n 1))]) | @list{@i @str}))) | Start... | @count[3]{Mississippi}. | ... and I'm done. |
|
| → | Start... | 1 Mississippi, | 2 Mississippi, | 3 Mississippi. | ... and I'm done. |
|
6.2 Defining Functions and More
(Note: most of the tips in this section are applicable to any code
that uses the Scribble @-form syntax.)
Because the Scribble reader is uniform, you can use it in place of any
expression where it is more convenient. (By convention, we use a
plain S-expression syntax when we want a Racket expression escape, and
an @-form for expressions that render as text, which, in the
scribble/text language, is any value-producing expression.)
For example, you can use an @-form for a function that you define.
| #lang scribble/text | @(define @bold[text] @list{*@|text|*}) | An @bold{important} note. |
|
| → | An *important* note. |
|
This is not commonly done, since most functions that operate with text
will need to accept a variable number of arguments. In fact, this
leads to a common problem: what if we want to write a function that
consumes a number of “text arguments” rather than a single
“rest-like” body? The common solution for this is to provide the
separate text arguments in the S-expression part of an @-form.
| #lang scribble/text | @(define (choose 1st 2nd) | @list{Either @1st, or @|2nd|@"."}) | @(define who "us") | @choose[@list{you're with @who} | @list{against @who}] |
|
| → | Either you're with us, or against us. |
|
You can even use @-forms with a Racket quote or quasiquote as the
“head” part to make it shorter, or use a macro to get grouping of
sub-parts without dealing with quotes.
| #lang scribble/text | @(define (choose 1st 2nd) | @list{Either @1st, or @|2nd|@"."}) | @(define who "us") | @choose[@list{you're with @who} | @list{against @who}] | @(define-syntax-rule (compare (x ...) ...) | (add-newlines | (list (list "* " x ...) ...))) | Shopping list: | @compare[@{apples} | @{oranges} | @{@(* 2 3) bananas}] |
|
| → | Either you're with us, or against us. | Shopping list: | * apples | * oranges | * 6 bananas |
|
Yet another solution is to look at the text values and split the input
arguments based on a specific token. Using match can make it
convenient — you can even specify the patterns with @-forms.
| #lang scribble/text | @(require racket/match) | @(define (features . text) | (match text | [@list{@|1st|@... | --- | @|2nd|@...} | @list{>> Pros << | @1st; | >> Cons << | @|2nd|.}])) | @features{fast, | reliable | --- | expensive, | ugly} |
|
| → | >> Pros << | fast, | reliable; | >> Cons << | expensive, | ugly. |
|
In particular, it is often convenient to split the input by lines,
identified by delimiting "\n" strings. Since this can be
useful, a split-lines function is provided.
| #lang scribble/text | @(require racket/list) | @(define (features . text) | (add-between (split-lines text) | ", ")) | @features{red | fast | reliable}. |
|
| → | red, fast, reliable. |
|
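To see what happens inside features, the same steps can be run in plain Racket. This is only a sketch: the list body mimics how the text body of the @-form is read (strings separated by "\n" strings), and render is a hypothetical helper introduced here to capture the printed text.

```racket
#lang racket/base
(require scribble/text racket/list racket/port)

;; Hypothetical helper: capture what `output` prints as a string.
(define (render v)
  (with-output-to-string (lambda () (output v))))

;; Stand-in for the body of @features{red | fast | reliable}.
(define body (list "red" "\n" "fast" "\n" "reliable"))

;; split-lines removes the "\n" delimiters and groups the remaining
;; values by line; add-between then puts ", " between the lines.
(define joined (add-between (split-lines body) ", "))

(define rendered (render joined))
```

Because the "\n" delimiters are dropped by split-lines, rendered should contain the three items on a single line, separated by ", ".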
Finally, the Scribble reader accepts any expression as the head
part of an @-form — even an @ form. This makes it possible to
get a number of text bodies by defining a curried function, where each
step accepts any number of arguments. This, however, means that the
number of body expressions must be fixed.
| #lang scribble/text | @(define ((choose . 1st) . 2nd) | @list{Either you're @1st, or @|2nd|.}) | @(define who "me") | @@choose{with @who}{against @who} |
|
| → | Either you're with me, or against me. |
|
6.3 Using Printouts
Because the text language simply displays each toplevel value as the
file is run, it is possible to print text directly as part of the
output.
| #lang scribble/text | First | @display{Second} | Third |
|
| → | First | Second | Third |
|
Taking this further, it is possible to write functions that output
some text instead of returning values that represent the text.
| #lang scribble/text | @(define (count n) | (for ([i (in-range 1 (+ n 1))]) | (printf "~a Mississippi,\n" i))) | Start... | @count[3]@; avoid an empty line | ... and I'm done. |
|
| → | Start... | 1 Mississippi, | 2 Mississippi, | 3 Mississippi, | ... and I'm done. |
|
This can be used to produce a lot of output text, even an infinite amount.
| #lang scribble/text | @(define (count n) | (printf "~a Mississippi,\n" n) | (count (add1 n))) | Start... | @count[1] | this line is never printed! |
|
| → | Start... | 1 Mississippi, | 2 Mississippi, | 3 Mississippi, | 4 Mississippi, | 5 Mississippi, | ... |
|
However, you should be careful not to mix returning values with
printouts, as the results are rarely desirable.
| #lang scribble/text | @list{1 @display{two} 3} |
|
| → | two1  3 |
|
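The reason for the surprising result is evaluation order, which can be sketched in plain Racket. The helper name capture is hypothetical, and the list expression is what the @-form in the example is read as.

```racket
#lang racket/base
(require scribble/text racket/port)

;; Hypothetical helper: capture everything printed while `thunk` runs.
(define (capture thunk)
  (with-output-to-string thunk))

;; @list{1 @display{two} 3} is read as the list expression below.  The
;; call to `display` prints "two" while the list is being built and
;; returns void, so `output` later sees only "1 ", a void value, " 3".
(define mixed
  (capture (lambda () (output (list "1 " (display "two") " 3")))))
```

Here mixed should start with "two", followed by the text of the list with nothing between the two spaces, because the side effect happens before output ever receives the list.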
Note that you don’t need side-effects if you want infinite output.
The output function iterates thunks and (composable)
promises, so you can create a loop that is delayed in either form.
| #lang scribble/text | @(define (count n) | (cons @list{@n Mississippi,@"\n"} | (lambda () | (count (add1 n))))) | Start... | @count[1] | this line is never printed! |
|
| → | Start... | 1 Mississippi, | 2 Mississippi, | 3 Mississippi, | 4 Mississippi, | 5 Mississippi, | ... |
|
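The same kind of loop can be delayed with a promise instead of a thunk. The following sketch runs in plain Racket and is made finite so that it terminates; count-to, its limit argument, and render are names invented for this illustration.

```racket
#lang racket/base
(require scribble/text racket/port racket/promise)

;; Hypothetical helper: capture what `output` prints as a string.
(define (render v)
  (with-output-to-string (lambda () (output v))))

;; A finite counterpart of `count`: the rest of the sequence is wrapped
;; in a promise, which `output` forces when it reaches it.  The
;; recursion stops once `n` exceeds `limit`.
(define (count-to n limit)
  (if (> n limit)
      '()
      (cons (list n " Mississippi," "\n")
            (delay (count-to (add1 n) limit)))))

(define counted (render (count-to 1 3)))
```

Nothing past the first line is computed until output forces the promise, which is what makes the infinite variant in the example above possible.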
6.4 Indentation in Preprocessed Output
An issue that can be very important in many text generation applications
is the indentation of the output. This can be crucial in some cases, for
example if you’re generating code for an indentation-sensitive language
(e.g., Haskell, Python, or C preprocessor directives). To get a better
understanding of how the pieces interact, you may want to review the
Scribble reader section, but also remember that
you can use quoted forms to see how a form is read.
| #lang scribble/text | @(format "~s" '@list{ | a | b | c}) |
|
| → | (list "a" "\n" " " "b" "\n" "c") |
|
The Scribble reader ignores indentation spaces in its body. This is an
intentional feature, since you usually do not want an expression to
depend on its position in the source. But the question is whether we
can render some output text with proper indentation. The
output function achieves that by introducing blocks.
Just like a list, a block contains a list of elements, and when
one is rendered, it is done in its own indentation level. When a
newline is part of a block’s contents, it causes the following
text to appear with indentation that corresponds to the column position
at the beginning of the block.
In addition, lists are also rendered as blocks by default, so they can
be used for the same purpose. In most cases, this makes the output
appear “as intended” where lists are used for nested pieces of text
— either from a literal list expression, or an expression
that evaluates to a list, or when a list is passed on as a value; either
as a toplevel expression, or as a nested value; either appearing after
spaces, or after other output.
| #lang scribble/text | foo @block{1 | 2 | 3} | foo @list{4 | 5 | 6} |
|
| → | foo 1 | 2 | 3 | foo 4 | 5 | 6 |
|
| #lang scribble/text | @(define (code . text) | @list{begin | @text | end}) | @code{first | second | @code{ | third | fourth} | last} |
|
| → | begin | first | second | begin | third | fourth | end | last | end |
|
| #lang scribble/text | @(define (enumerate . items) | (add-newlines #:sep ";\n" | (for/list ([i (in-naturals 1)] | [item (in-list items)]) | @list{@|i|. @item}))) | Todo: @enumerate[@list{Install Racket} | @list{Hack, hack, hack} | @list{Profit}]. |
|
| → | Todo: 1. Install Racket; | 2. Hack, hack, hack; | 3. Profit. |
|
There are, however, cases when you need more refined control over the
output. The scribble/text language provides a few functions
for such cases in addition to block. The splice
function groups together a number of values but avoids introducing a new
indentation context. Furthermore, lists are not always rendered as
blocks — instead, they are rendered as splices when
they are used inside one, so you essentially use splice to
avoid the “indentation group” behavior, and block to restore
it.
| #lang scribble/text | @(define (blah . text) | @splice{{ | blah(@block{@text}); | }}) | start | @splice{foo(); | loop:} | @list{if (something) @blah{one, | two}} | end |
|
| → | start | foo(); | loop: | if (something) { | blah(one, | two); | } | end |
|
The disable-prefix function disables all indentation
printouts in its contents, including the indentation before the body
of the disable-prefix value itself. It is useful, for
example, to print out CPP directives.
| #lang scribble/text | @(define (((IFFOO . var) . expr1) . expr2) | (define (array e1 e2) | @list{[@e1, | @e2]}) | @list{var @var; | @disable-prefix{#ifdef FOO} | @var = @array[expr1 expr2]; | @disable-prefix{#else} | @var = @array[expr2 expr1]; | @disable-prefix{#endif}}) |
| function blah(something, something_else) { | @disable-prefix{#include "stuff.inc"} | @@@IFFOO{i}{something}{something_else} | } |
|
| → | function blah(something, something_else) { | #include "stuff.inc" | var i; | #ifdef FOO | i = [something, | something_else]; | #else | i = [something_else, | something]; | #endif | } |
|
If there are values after a disable-prefix value on the same
line, they will get indented to the goal column (unless the
output is already beyond it).
| #lang scribble/text | @(define (thunk name . body) | @list{function @name() { | @body | }}) | @(define (ifdef cond then else) | @list{@disable-prefix{#}ifdef @cond | @then | @disable-prefix{#}else | @else | @disable-prefix{#}endif}) |
| @thunk['do_stuff]{ | init(); | @ifdef["HAS_BLAH" | @list{var x = blah();} | @thunk['blah]{ | @ifdef["BLEHOS" | @list{@disable-prefix{#}@; | include <bleh.h> | bleh();} | @list{error("no bleh");}] | }] | more_stuff(); | } |
|
| → | function do_stuff() { | init(); | # ifdef HAS_BLAH | var x = blah(); | # else | function blah() { | # ifdef BLEHOS | # include <bleh.h> | bleh(); | # else | error("no bleh"); | # endif | } | # endif | more_stuff(); | } |
|
There are cases where each line should be prefixed with some string
other than a plain indentation. The add-prefix function
causes its contents to be printed using some given string prefix for
every line. The prefix gets accumulated to an existing indentation,
and indentation in the contents gets added to the prefix.
| #lang scribble/text | @(define (comment . body) | @add-prefix["// "]{@body}) | @comment{add : int int -> string} | char *foo(int x, int y) { | @comment{ | skeleton: | allocate a string | print the expression into it | @comment{...more work...} | } | char *buf = malloc(@comment{FIXME! | This is bad} | 100); | } |
|
| → | // add : int int -> string | char *foo(int x, int y) { | // skeleton: | // allocate a string | // print the expression into it | // // ...more work... | char *buf = malloc(// FIXME! | // This is bad | 100); | } |
|
When combining add-prefix and disable-prefix there
is an additional value that can be useful: flush. This is a
value that causes output to print the current indentation and
prefix. This makes it possible to get the “ignored as a prefix”
property of disable-prefix but only for a nested prefix.
| #lang scribble/text | @(define (comment . text) | (list flush | @add-prefix[" *"]{ | @disable-prefix{/*} @text */})) | function foo(x) { | @comment{blah | more blah | yet more blah} | if (x < 0) { | @comment{even more | blah here | @comment{even | nested}} | do_stuff(); | } | } |
|
| → | function foo(x) { | /* blah | * more blah | * yet more blah */ | if (x < 0) { | /* even more | * blah here | * /* even | * * nested */ */ | do_stuff(); | } | } |
|
6.5 Using External Files
Using additional files that contain code for your preprocessing is
trivial: the source text is still source code in a module, so you can
require additional files with utility functions.
| #lang scribble/text | @(require "itemize.rkt") | Todo: | @itemize[@list{Hack some} | @list{Sleep some} | @list{Hack some | more}] |
| itemize.rkt: | #lang racket | (provide itemize) | (define (itemize . items) | (add-between (map (lambda (item) | (list "* " item)) | items) | "\n")) |
|
| → | Todo: | * Hack some | * Sleep some | * Hack some | more |
|
Note that the at-exp language can
often be useful here, since such files need to deal with texts. Using
it, it is easy to include a lot of textual content.
| #lang scribble/text | @(require "stuff.rkt") | Todo: | @itemize[@list{Hack some} | @list{Sleep some} | @list{Hack some | more}] | @summary |
| stuff.rkt: | #lang at-exp racket/base | (require racket/list) | (provide (all-defined-out)) | (define (itemize . items) | (add-between (map (lambda (item) | @list{* @item}) | items) | "\n")) | (define summary | @list{If that's not enough, | I don't know what is.}) |
|
| → | Todo: | * Hack some | * Sleep some | * Hack some | more | If that's not enough, | I don't know what is. |
|
Of course, the extreme case is to put all of your content
in a plain Racket module, using @-forms for convenience. However,
there is no need to use the text language in this case; instead, you can
(require scribble/text), which will get all of the bindings
that are available in the scribble/text language. Using
output, switching from a preprocessed file to a Racket file is
very easy; choosing one or the other depends on whether it is more
convenient to write a text file with occasional Racket expressions or
the other way around.
| #lang at-exp racket/base | (require scribble/text racket/list) | (define (itemize . items) | (add-between (map (lambda (item) | @list{* @item}) | items) | "\n")) | (define summary | @list{If that's not enough, | I don't know what is.}) | (output | @list{ | Todo: | @itemize[@list{Hack some} | @list{Sleep some} | @list{Hack some | more}] | @summary | }) |
|
| → | Todo: | * Hack some | * Sleep some | * Hack some | more | If that's not enough, | I don't know what is. |
|
However, you might run into a case where it is desirable to include a
mostly-text file from a scribble/text source file. It might be
because you prefer to split the source text to several files, or because
you need to use a template file that cannot have a #lang
header (for example, an HTML template file that is the result of an
external editor). In these cases, the scribble/text language
provides an include form that includes a file in the
preprocessor syntax (where the default parsing mode is text).
| #lang scribble/text | @(require racket/list) | @(define (itemize . items) | (list | "<ul>" | (add-between | (map (lambda (item) | @list{<li>@|item|</li>}) | items) | "\n") | "</ul>")) | @(define title "Todo") | @(define summary | @list{If that's not enough, | I don't know what is.}) |
| @include["template.html"] |
| template.html: | <html> | <head><title>@|title|</title></head> | <body> | <h1>@|title|</h1> | @itemize[@list{Hack some} | @list{Sleep some} | @list{Hack some | more}] | <p><i>@|summary|</i></p> | </body> | </html> |
|
| → | <html> | <head><title>Todo</title></head> | <body> | <h1>Todo</h1> | <ul><li>Hack some</li> | <li>Sleep some</li> | <li>Hack some | more</li></ul> | <p><i>If that's not enough, | I don't know what is.</i></p> | </body> | </html> |
|
(Using require with a text file in the scribble/text
language will not work as intended: the language displays the text
when the module is invoked, so the required file’s contents will be
printed before any of the requiring module’s text. If you find
yourself in such a situation, it is better to switch to a
Racket-with-@-expressions file as shown above.)