The html library provides functions to read HTML documents and structures to represent them.
(read-xhtml port) and (read-html port) each read (X)HTML from a port, producing an html instance.
(read-html-as-xml port) reads HTML from a port, producing a list of content compatible with the xml library (which defines content/c).
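A minimal sketch of reading HTML into xml-library content, assuming the function described here is read-html-as-xml and that xml->xexpr from the xml library is used to view the result. The input string and the names contents and xexprs are made up for illustration.

```racket
#lang racket/base
; Prefix both libraries, since some of their names conflict.
(require (prefix-in h: html)
         (prefix-in x: xml))

; Illustrative input; any HTML text would do.
(define contents
  (h:read-html-as-xml
   (open-input-string "<p>Hello <b>world</b></p>")))

; contents is a list of xml content structures; convert each
; one to an X-expression so it prints readably.
(define xexprs (map x:xml->xexpr contents))

(printf "~s\n" xexprs)
```

Converting to X-expressions is only one way to inspect the result; the content structures can also be processed directly with the accessors from the xml library.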
read-html-comments is a boolean parameter. If its value v is not #f, then comments are read and returned. Defaults to #f.
use-html-spec is a boolean parameter. If its value v is not #f, then the HTML must respect the HTML specification with regard to which elements may be children of other elements. For example, the top-level "<html>" element may contain only a "<head>" and a "<body>" element. Defaults to #f.
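Because both settings are parameters, they can be set for the duration of a single read with parameterize. The following is a sketch only, assuming the parameters are named read-html-comments and use-html-spec; the helper read-lenient and the input string are hypothetical.

```racket
#lang racket/base
(require (prefix-in h: html))

; read-lenient: string -> list of xml content
; Hypothetical helper: asks the reader to keep comments and not
; to enforce the HTML specification's child-element rules, but
; only while this one read is in progress.
(define (read-lenient str)
  (parameterize ([h:read-html-comments #t]
                 [h:use-html-spec #f])
    (h:read-html-as-xml (open-input-string str))))

(define result (read-lenient "<!-- note --><p>Hello</p>"))

(printf "~s\n" (length result))
```

Using parameterize rather than calling the parameter directly keeps the change local, so other reads in the same program still see the defaults.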
(module html-example racket

  ; Some of the symbols in html and xml conflict with
  ; each other and with racket/base language, so we prefix
  ; to avoid namespace conflict.
  (require (prefix-in h: html)
           (prefix-in x: xml))

  (define an-html
    (h:read-xhtml
     (open-input-string
      (string-append
       "<html><head><title>My title</title></head><body>"
       "<p>Hello world</p><p><b>Testing</b>!</p>"
       "</body></html>"))))

  ; extract-pcdata: html-content/c -> (listof string)
  ; Pulls out the pcdata strings from some-content.
  (define (extract-pcdata some-content)
    (cond [(x:pcdata? some-content)
           (list (x:pcdata-string some-content))]
          [(x:entity? some-content)
           (list)]
          [else
           (extract-pcdata-from-element some-content)]))

  ; extract-pcdata-from-element: html-element -> (listof string)
  ; Pulls out the pcdata strings from an-html-element.
  (define (extract-pcdata-from-element an-html-element)
    (match an-html-element
      [(struct h:html-full (attributes content))
       (apply append (map extract-pcdata content))]
      [(struct h:html-element (attributes))
       '()]))

  (printf "~s\n" (extract-pcdata an-html)))
> (require 'html-example)
("My title" "Hello world" "Testing" "!")
An html-content/c is either an html-element, a pcdata, or an entity.
(struct html-element (attributes))
  attributes : (listof attribute)
Every structure below inherits from html-element.
(struct html-full html-element (content))
  content : (listof html-content/c)
Any HTML tag that may include content also inherits from html-full, without adding any fields of its own.
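Since every tag structure is an html-element and every tag with content is also an html-full, a traversal needs only those two cases. A minimal sketch, assuming the usual struct predicates and accessors (html-full?, html-element?, html-full-content) are exported; count-elements and the sample document are hypothetical.

```racket
#lang racket/base
(require (prefix-in h: html))

; count-elements: html-content/c -> natural number
; Counts the tag structures in c, including c itself.
(define (count-elements c)
  (cond
    ; Tags with content: count this tag plus its children.
    [(h:html-full? c)
     (add1 (for/sum ([child (in-list (h:html-full-content c))])
             (count-elements child)))]
    ; Tags without content count as one.
    [(h:html-element? c) 1]
    ; pcdata, entities, and anything else are not tags.
    [else 0]))

(define doc
  (h:read-xhtml
   (open-input-string
    "<html><head><title>T</title></head><body><p>Hi</p></body></html>")))

(printf "~s\n" (count-elements doc))
```

The html-full case is tested first because an html-full instance also satisfies html-element?.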
A mzscheme is a special legacy value for the old documentation system.
A Contents-of-html is either
A Contents-of-head is either
A Contents-of-tr is either
A Contents-of-table is either
A Contents-of-fieldset is either
A Contents-of-select is either
A Contents-of-dl is either
A Contents-of-pre is either
A Contents-of-object-applet is either
A Contents-of-map is either
A Contents-of-a is either
A Contents-of-address is either
A Contents-of-body is either
A G12 is either
A G11 is either
A G10 is either
A G9 is either
A G8 is either
A G7 is either
A G6 is either
A G5 is either
A G4 is either
A G3 is either
A G2 is either