13.11 Sandboxed Evaluation
The racket/sandbox module provides utilities for
creating “sandboxed” evaluators, which are configured in a
particular way and can have restricted resources (memory and time),
filesystem and network access, and much more. Sandboxed evaluators can be
configured through numerous parameters.
input-program : any/c
requires : (listof (or/c module-path? path?))
allow : (listof (or/c module-path? path?))
module-decl : (or/c syntax? pair?)
lang : (or/c #f module-path?)
allow : (listof (or/c module-path? path?))
The returned evaluator operates in an isolated and limited environment. In particular, filesystem access is restricted. The allow argument extends the set of files that are readable by the evaluator to include the specified modules and their imports (transitively). When language is a module path and when requires is provided, the indicated modules are implicitly included in the allow list.
Each input-program or module-decl argument provides a program in one of the following forms:
an input port used to read the program;
a string or a byte string holding the complete input;
a path that names a file holding the input; or
an S-expression or a syntax object, which is evaluated as with eval (see also get-uncovered-expressions).
In the first three cases above, the program is read using sandbox-reader, with line-counting enabled for sensible error messages, and with 'program as the source (used for testing coverage). In the last case, the input is expected to be the complete program, and is converted to a syntax object (using 'program as the source), unless it already is a syntax object.
The returned evaluator function accepts additional expressions (each time it is called) in essentially the same form: a string or byte string holding a sequence of expressions, a path for a file holding expressions, an S-expression, or a syntax object. If the evaluator receives an eof value, it is terminated and raises errors thereafter. See also kill-evaluator, which terminates the evaluator without raising an exception.
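As a minimal sketch of the input forms described above (the names ev, double, from-sexpr, and from-string are illustrative and do not come from the library), an evaluator can be given an initial program and then fed further expressions either as an S-expression or as a string holding several expressions:

```racket
#lang racket/base
(require racket/sandbox)

;; Create an evaluator whose initial program defines `double`.
(define ev
  (make-evaluator 'racket/base
                  '(define (double x) (* 2 x))))

;; Later input as an S-expression.
(define from-sexpr (ev '(double 21)))

;; Later input as a string holding a sequence of expressions;
;; the evaluator returns the value of the last one.
(define from-string (ev "(define y 10) (double y)"))
```

Both calls run inside the sandbox, so the definitions made by one call remain visible to later calls on the same evaluator.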
For make-evaluator, multiple input-programs are effectively concatenated to form a single program. The way that the input-programs are evaluated depends on the language argument:
The language argument can be a module path (i.e., a datum that matches the grammar for module-path of require).
In this case, the input-programs are automatically wrapped in a module, and the resulting evaluator works within the resulting module’s namespace.
The language argument can be a list starting with 'special, which indicates a built-in language with special input configuration. The possible values are '(special r5rs) or a value indicating a teaching language: '(special beginner), '(special beginner-abbr), '(special intermediate), '(special intermediate-lambda), or '(special advanced).
In this case, the input-programs are automatically wrapped in a module, and the resulting evaluator works within the resulting module’s namespace. In addition, certain parameters (such as read-accept-infix-dot) are set to customize reading programs from strings and ports.
This option is provided mainly for older test systems. Using make-module-evaluator with input starting with #lang is generally better.
Finally, language can be a list whose first element is 'begin.
In this case, a new namespace is created using sandbox-namespace-specs, which by default creates a new namespace using make-base-namespace or make-gui-namespace (depending on gui?).
In the new namespace, language is evaluated as an expression to further initialize the namespace.
The requires list adds additional imports to the module or namespace for the input-programs, even in the case that require is not made available through the language.
The following examples illustrate the difference between an evaluator that puts the program in a module and one that merely initializes a top-level namespace:
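> (define base-module-eval
    ; a module cannot have free variables...
    (make-evaluator 'racket/base '(define (f) later)))
program:1:0: compile: unbound identifier in module in: later
> (define base-module-eval
    (make-evaluator 'racket/base '(define (f) later)
                                 '(define later 5)))
> (base-module-eval '(f))
5
> (define base-top-eval
    ; non-module code can have free variables:
    (make-evaluator '(begin) '(define (f) later)))
> (base-top-eval '(+ 1 2))
3
> (base-top-eval '(define later 5))
> (base-top-eval '(f))
5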
The make-module-evaluator function is essentially a
restriction of make-evaluator, where the program must be a
module, and all imports are part of the program. In some cases it is
useful to restrict the program to be a module using a specific module
in its language position; the optional lang argument specifies such a
restriction.
(define base-module-eval2
  ; equivalent to base-module-eval:
  (make-module-evaluator '(module m racket/base
                            (define (f) later)
                            (define later 5))))
make-module-evaluator can be very convenient for testing module files: all you need to do is pass in a path value for the file name, and you get back an evaluator in the module’s context which you can use with your favorite test facility.
The evaluator uses a new custodian and namespace. When gui? is true, it also runs in its own eventspace.
The evaluator works under the sandbox-security-guard, which restricts file system and network access.
The evaluator is contained in a memory-restricted environment, and each evaluation is wrapped in a call-with-limits (when memory accounting is available); see also sandbox-memory-limit, sandbox-eval-limits and set-eval-limits.
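(let ([e (make-evaluator 'racket/base)])
  ; the evaluator function is passed into its own sandbox and called
  ; there; such a nested call raises an error
  (e `(,e 1)))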
Evaluation can also be instrumented to track coverage information when sandbox-coverage-enabled is set. Exceptions (both syntax and run-time) are propagated as usual to the caller of the evaluation function (i.e., catch it with with-handlers). However, note that a sandboxed evaluator is convenient for testing, since all exceptions happen in the same way, so you don’t need special code to catch syntax errors.
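The point about uniform exception handling can be sketched as follows. The helper try is hypothetical (it is not part of racket/sandbox); it sends one input to the evaluator and returns either the result or the message of the exception that was propagated to the caller:

```racket
#lang racket/base
(require racket/sandbox)

(define ev (make-evaluator 'racket/base))

;; Hypothetical helper: evaluate `input` in the sandbox and turn any
;; propagated exn:fail exception into its message string.
(define (try input)
  (with-handlers ([exn:fail? exn-message])
    (ev input)))

(define ok (try '(+ 1 2)))              ; ordinary result
(define runtime-msg (try '(car '())))   ; run-time error
(define syntax-msg (try "(lambda)"))    ; syntax error
```

The same with-handlers clause covers both the run-time error and the syntax error, so a test harness built this way needs no separate path for programs that fail to compile.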
Finally, the fact that a sandboxed evaluator accepts syntax objects makes it usable as the value for current-eval, which means that you can easily start a sandboxed read-eval-print-loop. For example, here is a quick implementation of a networked REPL:
(define e (make-evaluator 'racket/base))
(let-values ([(i o) (tcp-accept (tcp-listen 9999))])
  (parameterize ([current-input-port  i]
                 [current-output-port o]
                 [current-error-port  o]
                 [current-eval e])
    (read-eval-print-loop)
    (fprintf o "\nBye...\n")
    (close-output-port o)))
Note that in this code it is only the REPL interactions that are going
over the network connection; using I/O operations inside the REPL will
still use the usual sandbox parameters (defaulting to no I/O). In
addition, the code works only from an existing toplevel REPL. The
following variation also sends sandbox I/O over the network connection
and installs a new namespace:
(let-values ([(i o) (tcp-accept (tcp-listen 9999))])
  (parameterize ([current-input-port  i]
                 [current-output-port o]
                 [current-error-port  o]
                 [sandbox-input       i]
                 [sandbox-output      o]
                 [sandbox-error-output o]
                 [current-namespace (make-empty-namespace)])
    (parameterize ([current-eval
                    (make-evaluator 'racket/base)])
      (read-eval-print-loop))
    (fprintf o "\nBye...\n")
    (close-output-port o)))
(exn:fail:sandbox-terminated? v) → boolean?
v : any/c
(exn:fail:sandbox-terminated-reason exn) → symbol/c
exn : exn:fail:sandbox-terminated?
Exceptions of the related exn:fail:resource type are raised by call-with-limits. The resource field holds a symbol, either 'time or 'memory.
13.11.1 Customizing Evaluators
The sandboxed evaluators that make-evaluator creates can be customized via many parameters. Most of the configuration parameters affect newly created evaluators; changing them has no effect on already-running evaluators.
The default configuration options are set for a very restricted
sandboxed environment.
(call-with-trusted-sandbox-configuration thunk) → any
thunk : (-> any)
(sandbox-init-hook) → (-> any)
(sandbox-init-hook thunk) → void?
thunk : (-> any)
(sandbox-reader) → (any/c . -> . any)
(sandbox-reader proc) → void?
proc : (any/c . -> . any)
(sandbox-input in) → void?
a string or byte string, which is converted to a port using open-input-string or open-input-bytes;
an input port;
the symbol 'pipe, which triggers the creation of a pipe, where put-input can return the output end of the pipe or write directly to it;
a thunk, which is called to obtain a port (e.g., using current-input-port means that the evaluator input is the same as the calling context’s input).
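A minimal sketch of the 'pipe setting listed above, assuming default values for all other sandbox parameters (the names ev and line are illustrative): text written with put-input becomes available on the evaluator's current input port.

```racket
#lang racket/base
(require racket/sandbox)

;; Create an evaluator whose input is a pipe fed from outside.
(define ev
  (parameterize ([sandbox-input 'pipe])
    (make-evaluator 'racket/base)))

;; Write a line into the pipe from the calling context ...
(put-input ev "hello\n")

;; ... and read it back from inside the sandbox.
(define line (ev '(read-line)))
```

The parameter only needs to be set while the evaluator is created; later calls to put-input operate on the pipe that belongs to that evaluator.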
(sandbox-output in) → void?
an output port, which is used as-is;
the symbol 'bytes, which causes get-output to return the complete output as a byte string;
the symbol 'string, which is similar to 'bytes, but makes get-output produce a string;
the symbol 'pipe, which triggers the creation of a pipe, where get-output returns the input end of the pipe;
a thunk, which is called to obtain a port (e.g., using current-output-port means that the evaluator output is not diverted).
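The 'string setting can be sketched as follows, again assuming defaults for the other parameters (ev, first-chunk, and second-chunk are illustrative names). Each call to get-output returns the text printed since the previous call:

```racket
#lang racket/base
(require racket/sandbox)

;; Collect the evaluator's output as a string.
(define ev
  (parameterize ([sandbox-output 'string])
    (make-evaluator 'racket/base)))

(ev '(display "hi"))
(define first-chunk (get-output ev))

(ev '(display "there"))
(define second-chunk (get-output ev))
```

Because the accumulated output is reset on each retrieval, a test harness can pair each evaluation with exactly the text it printed.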
(sandbox-error-output in) → void?
The default is (lambda () (dup-output-port (current-error-port))), which means that the error output of the generated evaluator goes to the calling context’s error port.
(sandbox-coverage-enabled) → boolean?
(sandbox-coverage-enabled enabled?) → void?
enabled? : any/c
(sandbox-propagate-breaks) → boolean?
(sandbox-propagate-breaks propagate?) → void?
propagate? : any/c
(sandbox-namespace-specs spec) → void?
The default is (list make-base-namespace) if gui? is #f, (list make-gui-namespace) if gui? is #t.
The module paths are needed for sharing module instantiations between the sandbox and the caller. For example, sandbox code that returns posn values (from the lang/posn module) will not be recognized as such by your own code by default, since the sandbox will have its own instance of lang/posn and thus its own struct type for posns. To be able to use such values, include 'lang/posn in the list of module paths.
When testing code that uses a teaching language, the following piece of code can be helpful:
(sandbox-namespace-specs
 (let ([specs (sandbox-namespace-specs)])
   `(,(car specs)
     ,@(cdr specs)
     lang/posn
     ,@(if gui? '(mrlib/cache-image-snip) '()))))
(sandbox-override-collection-paths) → (listof path-string?)
(sandbox-override-collection-paths paths) → void?
paths : (listof path-string?)
(sandbox-security-guard)
→ (or/c security-guard? (-> security-guard?))
(sandbox-security-guard guard) → void?
guard : (or/c security-guard? (-> security-guard?))
(sandbox-path-permissions)
(sandbox-path-permissions perms) → void?
The access mode symbol is one of: 'execute, 'write, 'delete, 'read, or 'exists. These symbols are in decreasing order: each implies access for the following modes too (e.g., 'read allows reading or checking for existence).
The path regexp is used to identify paths that are granted access. It can also be given as a path (or a string or a byte string), which is (made into a complete path, cleansed, simplified, and then) converted to a regexp that allows the path and sub-directories; e.g., "/foo/bar" applies to "/foo/bar/baz".
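The effect of a permission entry can be sketched with a scratch file. Everything named here (tmp, read-in-sandbox, blocked, allowed) is illustrative; the sketch assumes the temporary directory is not readable under the default sandbox permissions and that a permission entry is a list of a mode symbol and a path.

```racket
#lang racket/base
(require racket/sandbox racket/file)

;; A scratch file created outside the sandbox for this illustration.
(define tmp (make-temporary-file))
(with-output-to-file tmp #:exists 'truncate
  (lambda () (display "secret")))

;; Hypothetical helper: create an evaluator with the given permissions
;; and try to read the first line of the scratch file inside it.
(define (read-in-sandbox perms)
  (define ev
    (parameterize ([sandbox-path-permissions perms])
      (make-evaluator 'racket/base)))
  (with-handlers ([exn:fail? (lambda (x) 'denied)])
    (ev `(call-with-input-file ,(path->string tmp) read-line))))

(define blocked (read-in-sandbox '()))
(define allowed (read-in-sandbox (list (list 'read tmp))))

(delete-file tmp)
```

Granting 'read on the single file is the narrowest entry that makes the read succeed; granting it on a directory path would also cover files below that directory.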
An additional mode symbol, 'read-bytecode, is not part of the
linear order of these modes. Specifying this mode is similar to
specifying 'read, but it is not implied by any other mode.
(For example, even if you specify 'write for a certain path,
you need to also specify 'read-bytecode to grant this
permission.) The sandbox usually works in the context of a lower code
inspector (see sandbox-make-code-inspector) which prevents
loading of untrusted bytecode files; bytecode is loaded only from
paths that are granted 'read-bytecode.
The default value is null, but when an evaluator is created, it is augmented by 'read-bytecode permissions that make it possible to use collection libraries (including sandbox-override-collection-paths). See make-evaluator for more information.
(sandbox-network-guard)
(sandbox-network-guard proc) → void?
(sandbox-exit-handler) → (any/c . -> . any)
(sandbox-exit-handler handler) → void?
handler : (any/c . -> . any)
(sandbox-memory-limit) → (or/c nonnegative-number? #f)
(sandbox-memory-limit limit) → void?
limit : (or/c nonnegative-number? #f)
Note that (when memory accounting is enabled) memory is attributed to the highest custodian that refers to it. This means that if you inspect a value that sandboxed evaluation returns outside of the sandbox, your own custodian will be charged for it. To ensure that it is charged back to the sandbox, you should remove references to such values when the code is done inspecting it.
(define e (make-evaluator 'racket/base))
(e '(define a 1))
(e '(for ([i (in-range 20)]) (set! a (cons (make-bytes 500000) a))))
(sandbox-eval-limits)
(sandbox-eval-limits limits) → void?
(parameterize ([sandbox-eval-limits '(0.25 5)])
  (make-evaluator 'racket/base '(sleep 2)))
When limits are set, call-with-limits (see below) is wrapped around each use of the evaluator, so consuming too much time or memory results in an exception. Change the limits of a running evaluator using set-eval-limits.
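A minimal sketch of a per-evaluation time limit, assuming default values for the other sandbox parameters (ev, outcome, and after are illustrative names). The limit is changed on a running evaluator with set-eval-limits, and the resulting exception is inspected with the exn:fail:resource accessor:

```racket
#lang racket/base
(require racket/sandbox)

(define ev (make-evaluator 'racket/base))

;; One second per evaluation, no per-evaluation memory limit.
(set-eval-limits ev 1 #f)

;; An infinite loop exceeds the time limit; the handler extracts
;; which resource was exhausted.
(define outcome
  (with-handlers ([exn:fail:resource? exn:fail:resource-resource])
    (ev '(let loop () (loop)))))

;; The evaluator can still be used after the aborted evaluation.
(define after (ev '(+ 1 2)))
```

Only the offending evaluation is aborted here; the sandbox itself keeps running because no global limit was exceeded.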
A custodian’s limit is checked only after a garbage collection, except that it may also be checked during certain large allocations that are individually larger than the custodian’s limit.
(for ([i (in-range 1000)])
  (set! a (cons (make-bytes 1000000) a))
  (collect-garbage))
if a global limit is set but no per-evaluation limit, the sandbox will eventually be terminated and no further evaluations will be possible;
if there is a per-evaluation limit, but no global limit, the evaluation will abort with an error and the evaluator can be used again; specifically, a will still hold a number of blocks, and you can evaluate the same expression again, which will add more blocks to it;
if both limits are set, with the global one larger than the per-evaluation limit, then the evaluation will abort and you will be able to repeat it, but doing so several times will eventually terminate the sandbox (this will be indicated by the error message, and by the evaluator-alive? predicate).
(sandbox-eval-handlers)
(sandbox-eval-handlers handlers) → void?
(sandbox-make-inspector) → (-> inspector?)
(sandbox-make-inspector make) → void?
make : (-> inspector?)
(sandbox-make-code-inspector) → (-> inspector?)
(sandbox-make-code-inspector make) → void?
make : (-> inspector?)
(sandbox-make-logger) → (-> logger?)
(sandbox-make-logger make) → void?
make : (-> logger?)
13.11.2 Interacting with Evaluators
The following functions are used to interact with a sandboxed evaluator in addition to using it to evaluate code.
(evaluator-alive? evaluator) → boolean?
evaluator : (any/c . -> . any)
(kill-evaluator evaluator) → void?
evaluator : (any/c . -> . any)
Killing an evaluator is similar to sending an eof value to the evaluator, except that an eof value will raise an error immediately.
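The termination behavior can be sketched as follows, assuming default sandbox parameters (ev, alive-after-kill, and reason are illustrative names). The sketch does not rely on any particular reason symbol; it only shows where the reason becomes available:

```racket
#lang racket/base
(require racket/sandbox)

(define ev (make-evaluator 'racket/base))

;; Terminate the evaluator; this call itself does not raise.
(kill-evaluator ev)

(define alive-after-kill (evaluator-alive? ev))

;; Any later use raises an exn:fail:sandbox-terminated exception,
;; whose reason field is a symbol.
(define reason
  (with-handlers ([exn:fail:sandbox-terminated?
                   exn:fail:sandbox-terminated-reason])
    (ev '(+ 1 2))))
```

Checking evaluator-alive? before reusing a long-lived evaluator avoids having to handle the termination exception at every call site.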
(break-evaluator evaluator) → void?
evaluator : (any/c . -> . any)
(get-user-custodian evaluator) → custodian?
evaluator : (any/c . -> . any)
(One use for this custodian is with current-memory-use, where the per-interaction sub-custodians will not be charged with the memory for the whole sandbox.)
(set-eval-limits evaluator secs mb) → void?
evaluator : (any/c . -> . any)
secs : (or/c exact-nonnegative-integer? #f)
mb : (or/c exact-nonnegative-integer? #f)
This procedure should be used to modify an existing evaluator’s limits, because changing the sandbox-eval-limits parameter does not affect existing evaluators. See also call-with-limits.
(set-eval-handler evaluator handler) → void?
evaluator : (any/c . -> . any)
handler : (or/c #f ((-> any) . -> . any))
This procedure should be used to modify an existing evaluator’s handler, because changing the sandbox-eval-handlers parameter does not affect existing evaluators. See also call-with-custodian-shutdown and call-with-killing-threads for two useful handlers that are provided.
(call-with-custodian-shutdown thunk) → any
thunk : (-> any)
(call-with-killing-threads thunk) → any
thunk : (-> any)
(put-input evaluator) → output-port?
evaluator : (any/c . -> . any)
(put-input evaluator i/o) → void?
evaluator : (any/c . -> . any)
i/o : (or/c bytes? string? eof-object?)
(get-output evaluator) → (or/c #f input-port? bytes? string?)
evaluator : (any/c . -> . any)
(get-error-output evaluator)
→ (or/c #f input-port? bytes? string?)
evaluator : (any/c . -> . any)
if it was 'pipe, then get-output returns the input port end of the created pipe;
if it was 'bytes or 'string, then the result is the accumulated output, and the output port is reset so each call returns a different piece of the evaluator’s output (note that any allocations of such output are still subject to the sandbox memory limit);
otherwise, it returns #f.
evaluator : (any/c . -> . any)
prog? : any/c = #t
src : any/c = default-src
The prog? argument specifies whether to obtain expressions that were uncovered after only the original input program was evaluated (#t) or after all later uses of the evaluator (#f). Using #t retrieves a list that is saved after the input program is evaluated, and before the evaluator is used, so the result is always the same.
A #t value of prog? is useful for testing student programs to find out whether a submission has sufficient test coverage built in. A #f value is useful for writing test suites for a program to ensure that your tests cover the whole code.
The second optional argument, src, specifies that the result should be filtered to hold only syntax objects whose source matches src. The default is the source that was used in the program code, if there was one. Note that 'program is used as the source value if the input program was given as S-expressions or as a string (and in these cases it will be the default for filtering). If given #f, the result is the unfiltered list of expressions.
The resulting list of syntax objects has at most one expression for each position and span. Thus, the contents may be unreliable, but the position information is reliable (i.e., it always indicates source code that would be painted red in DrRacket when coverage information is used).
Note that if the input program is a sequence of syntax values, either make sure that they have 'program as the source field, or use the src argument. Using a sequence of S-expressions (not syntax objects) for an input program leads to unreliable coverage results, since each expression may be assigned a single source location.
evaluator : (any/c . -> . any)
thunk : (-> any)
unrestricted? : boolean? = #f
(let ([guard (current-security-guard)])
  (call-in-sandbox-context
   evaluator
   (lambda ()
     (parameterize ([current-security-guard guard])
       ; can access anything you want here
       (void)))))
13.11.3 Miscellaneous
Various aspects of the racket/sandbox library change when the GUI library is available, such as using a new eventspace for each evaluator.
(call-with-limits secs mb thunk) → any
secs : (or/c exact-nonnegative-integer? #f)
mb : (or/c exact-nonnegative-integer? #f)
thunk : (-> any)
Sandboxed evaluators use call-with-limits, according to the sandbox-eval-limits setting and uses of set-eval-limits: each expression evaluation is protected from timeouts and memory problems. Use call-with-limits directly only to limit a whole testing session, instead of each expression.
(with-limits sec-expr mb-expr body ...)
(exn:fail:resource? v) → boolean?
v : any/c
(exn:fail:resource-resource exn) → (or/c 'time 'memory)
exn : exn:fail:resource?
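Outside of an evaluator, the same limiting facilities can be applied to ordinary code. The following is a minimal sketch (quick and slow are illustrative names) that uses with-limits for a computation that finishes in time and call-with-limits for one that does not:

```racket
#lang racket/base
(require racket/sandbox)

;; Finishes well within five seconds; the body's value is returned.
(define quick
  (with-limits 5 #f
    (+ 1 2)))

;; Sleeps longer than the one-second limit; the handler reports
;; which resource ran out.
(define slow
  (with-handlers ([exn:fail:resource? exn:fail:resource-resource])
    (call-with-limits 1 #f
                      (lambda () (sleep 5) 'finished))))
```

Passing #f for the memory argument leaves memory unrestricted, so this sketch does not depend on memory accounting being available.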