The html library provides functions to read HTML documents and structures to represent them.
port : input-port?
port : input-port?
Reads (X)HTML from a port, producing an html instance.
Reads HTML from a port, producing a list of XML content, each of which could be turned into an X-expression, if necessary, with xml->xexpr.
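As a minimal sketch of the conversion mentioned above: the function name is missing from this text, so the example assumes the reader in question is the html library's read-html-as-xml. The input string and the name `contents` are illustrative only.

```racket
#lang racket/base
;; Prefix both libraries, as in the larger example in this section,
;; to avoid name clashes between html and xml.
(require (prefix-in h: html)
         (prefix-in x: xml))

;; Assumption: read-html-as-xml is the reader described above.
;; It returns a list of XML content values.
(define contents
  (h:read-html-as-xml
   (open-input-string "<p>Hello <b>world</b></p>")))

;; Convert each piece of XML content to an X-expression and print it.
(for ([c (in-list contents)])
  (printf "~s\n" (x:xml->xexpr c)))
```

Working with X-expressions is often more convenient than working with the XML structures directly, since they are plain lists, symbols, and strings.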
If v is not #f, then comments are read and returned. Defaults to #f.
If v is not #f, then the HTML must respect the HTML specification with regard to which elements are allowed to be the children of other elements. For example, the top-level "<html>" element may contain only a "<head>" and a "<body>" element. Defaults to #t.
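The two settings above can be changed for a single read. The sketch below assumes they are the html library's parameters read-html-comments and use-html-spec (the names do not survive in this text) and sets them with parameterize. The helper `read-lenient` and its input are hypothetical.

```racket
#lang racket/base
(require (prefix-in h: html))

;; read-lenient: string -> list of XML content
;; Hypothetical helper: reads str with comments kept and with the
;; HTML nesting rules relaxed, for the duration of this call only.
(define (read-lenient str)
  (parameterize ([h:read-html-comments #t] ; assumed parameter name
                 [h:use-html-spec #f])     ; assumed parameter name
    (h:read-html-as-xml (open-input-string str))))

(define contents (read-lenient "<p>text<!-- note --></p>"))
```

Using parameterize keeps the change local, so other code that reads HTML still sees the defaults.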
(module html-example racket

  ; Some of the symbols in html and xml conflict with
  ; each other and with racket/base language, so we prefix
  ; to avoid namespace conflict.
  (require (prefix-in h: html)
           (prefix-in x: xml))

  (define an-html
    (h:read-xhtml
     (open-input-string
      (string-append
       "<html><head><title>My title</title></head><body>"
       "<p>Hello world</p><p><b>Testing</b>!</p>"
       "</body></html>"))))

  ; extract-pcdata: html-content/c -> (listof string)
  ; Pulls out the pcdata strings from some-content.
  (define (extract-pcdata some-content)
    (cond [(x:pcdata? some-content)
           (list (x:pcdata-string some-content))]
          [(x:entity? some-content)
           (list)]
          [else
           (extract-pcdata-from-element some-content)]))

  ; extract-pcdata-from-element: html-element -> (listof string)
  ; Pulls out the pcdata strings from an-html-element.
  (define (extract-pcdata-from-element an-html-element)
    (match an-html-element
      [(struct h:html-full (attributes content))
       (apply append (map extract-pcdata content))]

      [(struct h:html-element (attributes))
       '()]))

  (printf "~s\n" (extract-pcdata an-html)))
> (require 'html-example)
("My title" "Hello world" "Testing" "!")
An html-content/c is either
(struct html-element (attributes)
    #:extra-constructor-name make-html-element)
  attributes : (listof attribute)
Any of the structures below inherits from html-element.
(struct html-full struct:html-element (content)
    #:extra-constructor-name make-html-full)
  content : (listof html-content/c)
Any HTML tag that may include content also inherits from html-full without adding any additional fields.
A mzscheme is a special legacy value for the old documentation system.
A Contents-of-html is either
A Contents-of-head is either
A Contents-of-tr is either
A Contents-of-table is either
A Contents-of-fieldset is either
A Contents-of-select is either
A Contents-of-dl is either
A Contents-of-pre is either
A Contents-of-object-applet is either
A Contents-of-map is either
A Contents-of-a is either
A Contents-of-address is either
A Contents-of-body is either
A G12 is either
A G11 is either
A G10 is either
A G9 is either
A G8 is either
A G7 is either
A G6 is either
A G5 is either
A G4 is either
A G3 is either
A G2 is either