4.6 Characters
The Characters section of The Racket Guide introduces characters.
Characters range over Unicode scalar values: the integers from #x0 to #x10FFFF, excluding the surrogate range #xD800 to #xDFFF. The scalar values are a subset of the Unicode code points.
Two characters are eqv? if they correspond to the same scalar value. For each scalar value less than 256, character values that are eqv? are also eq?. Characters produced by the default reader are interned in read-syntax mode.
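As a small illustration of the equivalence rule above, the sketch below builds the same character in two ways and compares the results with eqv?. It also counts characters with an eqv?-based hash table, which avoids relying on eq?, since eq? is only guaranteed to agree with eqv? for scalar values below 256. The name char-counts is a hypothetical helper, not part of Racket.

```racket
#lang racket/base

;; Two routes to the same scalar value (U+03BB, Greek small lambda).
(define from-integer (integer->char #x3BB))
(define from-string (string-ref "λ" 0))

(eqv? from-integer from-string)   ; #t: same scalar value
(char=? from-integer from-string) ; #t: char=? agrees with eqv? here

;; Hypothetical helper: count occurrences of each character in str,
;; using an immutable eqv?-based hash table keyed by character.
(define (char-counts str)
  (for/fold ([counts (hasheqv)]) ([c (in-string str)])
    (hash-update counts c add1 0)))

(hash-ref (char-counts "hello") #\l) ; 2
```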
See Reading Characters for information on reading characters and Printing Characters for information on printing characters.
Changed in version 6.1.1.8 of package base: Updated from Unicode 5.0.1 to Unicode 7.0.0.
4.6.1 Characters and Scalar Values
procedure
(char->integer char) → exact-integer?
char : char?
> (char->integer #\A)
65
procedure
(integer->char k) → char?
k : (and/c exact-integer? (or/c (integer-in 0 #xD7FF) (integer-in #xE000 #x10FFFF)))
> (integer->char 65)
#\A
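Because the contract on k excludes the surrogate range, passing an arbitrary integer to integer->char can raise a contract error. The following is a minimal sketch of a guard, assuming that returning #f for out-of-range input is acceptable; scalar-value? and integer->char/safe are hypothetical names introduced only for illustration.

```racket
#lang racket/base

;; Hypothetical predicate mirroring the contract on integer->char's argument.
(define (scalar-value? k)
  (and (exact-integer? k)
       (or (<= 0 k #xD7FF)
           (<= #xE000 k #x10FFFF))))

;; Hypothetical wrapper: returns a character, or #f when k is not a
;; Unicode scalar value.
(define (integer->char/safe k)
  (and (scalar-value? k)
       (integer->char k)))

(integer->char/safe 65)       ; #\A
(integer->char/safe #xD800)   ; #f, a surrogate code point
(integer->char/safe #x110000) ; #f, beyond the last code point

;; char->integer inverts integer->char on every scalar value.
(char->integer (integer->char/safe #x10FFFF)) ; 1114111
```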
procedure
(char-utf-8-length char) → (integer-in 1 6)
char : char?
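As a rough illustration of char-utf-8-length, the sketch below compares it with the length of the UTF-8 encoding obtained by converting a one-character string to bytes. The helper name utf-8-length/bytes is hypothetical, and the sample characters are chosen to need one to four bytes.

```racket
#lang racket/base

;; Hypothetical cross-check: encode the character and measure the bytes.
(define (utf-8-length/bytes c)
  (bytes-length (string->bytes/utf-8 (string c))))

(define samples
  (list #\A                       ; U+0041, 1 byte
        #\λ                       ; U+03BB, 2 bytes
        (integer->char #x20AC)    ; U+20AC, 3 bytes
        (integer->char #x1F600))) ; U+1F600, 4 bytes

(for ([c (in-list samples)])
  (printf "~s: ~a byte(s)\n" c (char-utf-8-length c)))

(map char-utf-8-length samples) ; '(1 2 3 4)
```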
4.6.2 Character Comparisons
Changed in version 7.0.0.13 of package base: Allow one argument, in addition to allowing two or more.
procedure
(char-ci<=? char1 char2 ...) → boolean?
char1 : char?
char2 : char?
> (char-ci<=? #\A #\a)
#t
> (char-ci<=? #\a #\A)
#t
> (char-ci<=? #\a #\b #\b)
#t
Changed in version 7.0.0.13 of package base: Allow one argument, in addition to allowing two or more.
procedure
(char-ci>=? char1 char2 ...) → boolean?
char1 : char?
char2 : char?
> (char-ci>=? #\A #\a)
#t
> (char-ci>=? #\a #\A)
#t
> (char-ci>=? #\c #\b #\b)
#t
Changed in version 7.0.0.13 of package base: Allow one argument, in addition to allowing two or more.
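Since the comparison procedures accept any number of arguments from one upward, they combine naturally with apply. The sketch below checks whether a list of characters is in non-decreasing order, ignoring case. It assumes a version of base that allows the one-argument form, and chars-ordered-ci? is a hypothetical helper name.

```racket
#lang racket/base

;; Hypothetical helper: are the characters in non-decreasing order,
;; ignoring case? An empty list is treated as ordered.
(define (chars-ordered-ci? cs)
  (or (null? cs)
      (apply char-ci<=? cs)))

(chars-ordered-ci? (list #\a #\B #\c)) ; #t
(chars-ordered-ci? (list #\a))         ; #t, relies on the one-argument form
(chars-ordered-ci? (list #\c #\B))     ; #f
```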
4.6.3 Classifications
procedure
(char-alphabetic? char) → boolean?
char : char?
procedure
(char-lower-case? char) → boolean?
char : char?
procedure
(char-upper-case? char) → boolean?
char : char?
procedure
(char-title-case? char) → boolean?
char : char?
procedure
(char-numeric? char) → boolean?
char : char?
procedure
(char-symbolic? char) → boolean?
char : char?
procedure
(char-punctuation? char) → boolean?
char : char?
procedure
(char-graphic? char) → boolean?
char : char?
procedure
(char-whitespace? char) → boolean?
char : char?
procedure
(char-blank? char) → boolean?
char : char?
procedure
(char-iso-control? char) → boolean?
char : char?
procedure
(char-extended-pictographic? char) → boolean?
char : char?
Added in version 8.6.0.1 of package base.
procedure
(char-general-category char) → symbol?
char : char?
procedure
(char-grapheme-break-property char) → ?
char : char?
Added in version 8.6.0.1 of package base.
procedure
→
(listof (list/c exact-nonnegative-integer? exact-nonnegative-integer? boolean?))
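To show how the classification predicates in this section relate to char-general-category, here is a minimal sketch that labels a character with its general category and a coarse class. The helper describe-char and its class symbols are hypothetical; the order of the cond clauses is one possible choice, not a rule from the documentation.

```racket
#lang racket/base

;; Hypothetical helper: pair a character's Unicode general category
;; with a coarse class derived from the predicates in this section.
(define (describe-char c)
  (list (char-general-category c)
        (cond [(char-alphabetic? c)  'alphabetic]
              [(char-numeric? c)     'numeric]
              [(char-whitespace? c)  'whitespace]
              [(char-punctuation? c) 'punctuation]
              [(char-symbolic? c)    'symbolic]
              [else                  'other])))

(describe-char #\A)     ; '(lu alphabetic)
(describe-char #\7)     ; '(nd numeric)
(describe-char #\space) ; '(zs whitespace)
(describe-char #\!)     ; '(po punctuation)
(describe-char #\+)     ; '(sm symbolic)
```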
4.6.4 Character Conversions
procedure
(char-upcase char) → char?
char : char?
String procedures, such as string-upcase, handle the case where Unicode defines a locale-independent mapping from the code point to a code-point sequence (in addition to the 1-1 mapping on scalar values).
> (char-upcase #\a)
#\A
> (char-upcase #\λ)
#\Λ
> (char-upcase #\space)
#\space
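The difference between the 1-1 character mapping and the string-level mapping shows up with the German sharp s, whose uppercase form is a two-character sequence. The sketch below contrasts the two approaches; upcase-by-char is a hypothetical helper that applies char-upcase to each character in turn.

```racket
#lang racket/base

;; Hypothetical helper: upcase a string one character at a time,
;; using only the 1-1 mapping on scalar values.
(define (upcase-by-char s)
  (list->string (map char-upcase (string->list s))))

(char-upcase #\ß)         ; #\ß, there is no single-character uppercase
(upcase-by-char "Straße") ; "STRAßE"
(string-upcase "Straße")  ; "STRASSE", one character maps to two
```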
procedure
(char-downcase char) → char?
char : char?
> (char-downcase #\A)
#\a
> (char-downcase #\Λ)
#\λ
> (char-downcase #\space)
#\space
procedure
(char-titlecase char) → char?
char : char?
> (char-titlecase #\a)
#\A
> (char-titlecase #\λ)
#\Λ
> (char-titlecase #\space)
#\space
procedure
(char-foldcase char) → char?
char : char?
> (char-foldcase #\A)
#\a
> (char-foldcase #\Σ)
#\σ
> (char-foldcase #\ς)
#\σ
> (char-foldcase #\space)
#\space
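The sigma examples above suggest why case folding, rather than downcasing, is the safer basis for a case-insensitive comparison. A minimal sketch, with char-fold=? as a hypothetical helper name:

```racket
#lang racket/base

;; Hypothetical helper: compare two characters after case folding.
(define (char-fold=? a b)
  (char=? (char-foldcase a) (char-foldcase b)))

(char-fold=? #\Σ #\ς) ; #t: both fold to #\σ

;; Downcasing does not unify the two sigmas, because final sigma
;; is already lowercase and maps to itself.
(char=? (char-downcase #\Σ) (char-downcase #\ς)) ; #f: #\σ versus #\ς
```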
4.6.5 Character Grapheme-Cluster Streaming
procedure
(char-grapheme-step char state) → boolean? fixnum?
char : char?
state : fixnum?
A value of 0 for state represents the initial state or a state where no characters are pending toward a new boundary. Thus, if a sequence of characters is exhausted and accumulated state is not 0, then the end of the stream creates one last grapheme-cluster boundary. When char-grapheme-step produces a true value as its first result and a non-0 value as its second result, then the given char must be the only character pending toward the next grapheme cluster (by the rules of Unicode grapheme clustering).
The char-grapheme-step procedure will produce a result for any fixnum state, but the meaning of a non-0 state is specified only in that providing such a state produced by char-grapheme-step in another call to char-grapheme-step continues detecting grapheme-cluster boundaries in the sequence.
See also string-grapheme-span and string-grapheme-count.
> (char-grapheme-step #\a 0)
#f
1
> (let*-values ([(consumed? state) (char-grapheme-step #\a 0)]
                [(consumed? state) (char-grapheme-step #\b state)])
    (values consumed? state))
#t
1
> (let*-values ([(consumed? state) (char-grapheme-step #\return 0)]
                [(consumed? state) (char-grapheme-step #\newline state)])
    (values consumed? state))
#t
0
> (let*-values ([(consumed? state) (char-grapheme-step #\a 0)]
                [(consumed? state) (char-grapheme-step #\u300 state)])
    (values consumed? state))
#f
5
Added in version 8.6.0.2 of package base.
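The state-threading protocol described above can be sketched as a loop that counts grapheme clusters in a string. This is an illustrative sketch only, assuming a version of base that provides char-grapheme-step; count-graphemes is a hypothetical name, and string-grapheme-count is the built-in way to obtain the same number.

```racket
#lang racket/base

;; Hypothetical helper: count grapheme clusters by feeding each
;; character to char-grapheme-step and threading the state through.
(define (count-graphemes str)
  (define-values (count state)
    (for/fold ([count 0] [state 0]) ([c (in-string str)])
      (define-values (boundary? new-state) (char-grapheme-step c state))
      ;; A true first result means one cluster has just terminated.
      (values (if boundary? (add1 count) count) new-state)))
  ;; A non-0 state at the end of input means characters are still
  ;; pending, so the end of the stream closes one last cluster.
  (if (zero? state) count (add1 count)))

(count-graphemes "ab")       ; 2
(count-graphemes "\r\n")     ; 1, CR followed by LF forms one cluster
(count-graphemes "a\u0300")  ; 1, base letter plus combining accent
```

Keeping the state as a loop accumulator means the same approach should carry over to input that arrives one character at a time, such as characters read from a port.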