3.5 Characters
The Characters section of The Racket Guide introduces characters.
Characters range over Unicode scalar values, which are the values
from #x0 to #x10FFFF, excluding #xD800 to #xDFFF. The scalar values
are a subset of the Unicode code points.
Two characters are eqv? if they correspond to the same scalar
value. For each scalar value less than 256, character values that are
eqv? are also eq?.
See Reading Characters for information on reading characters and
Printing Characters for information on printing characters.
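As a small illustration of the scalar-value range described above, the following sketch checks whether an integer is a Unicode scalar value. The helper name scalar-value? is hypothetical and not part of Racket; it only restates the range given in the text.

```racket
#lang racket/base

;; Hypothetical helper (not a Racket built-in): #t when n lies in
;; #x0..#x10FFFF and outside the surrogate range #xD800..#xDFFF.
(define (scalar-value? n)
  (and (exact-nonnegative-integer? n)
       (<= n #x10FFFF)
       (not (<= #xD800 n #xDFFF))))

(scalar-value? #x41)      ; #t
(scalar-value? #xD800)    ; #f, a surrogate code point
(scalar-value? #x110000)  ; #f, beyond the last code point

;; Characters with the same scalar value are eqv?.
(eqv? (integer->char #x3BB) #\u3BB)  ; #t
```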
3.5.1 Characters and Scalar Values
char? returns #t if v is a character, #f otherwise.
char->integer returns a character’s code-point number.
integer->char returns the character whose code-point number is k.
For k less than 256, the result is the same object for the same k.
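The conversions between characters and code-point numbers can be sketched as follows. The helper char-shift is a hypothetical name introduced only for this example.

```racket
#lang racket/base

(char? #\a)            ; #t
(char? "a")            ; #f, a one-character string is not a character
(char->integer #\A)    ; 65
(integer->char 97)     ; #\a

;; Hypothetical helper: move a character n positions along the
;; code-point line. integer->char raises an exception if the
;; resulting number is not a scalar value.
(define (char-shift c n)
  (integer->char (+ (char->integer c) n)))

(char-shift #\a 1)     ; #\b
```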
3.5.2 Character Comparisons
char=? returns #t if all of the arguments are eqv?, #f otherwise.
char<? returns #t if the arguments are sorted in increasing order,
where two characters are ordered by their scalar values, #f
otherwise.
char<=? is like char<?, but checks whether the arguments are
nondecreasing.
char>? is like char<?, but checks whether the arguments are
decreasing.
char>=? is like char<?, but checks whether the arguments are
nonincreasing.
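A brief sketch of the scalar-value comparisons described above. Because ordering follows scalar values, every uppercase ASCII letter sorts before every lowercase one.

```racket
#lang racket/base

(char=? #\a #\a #\a)      ; #t
(char<? #\a #\b #\c)      ; #t
(char<? #\a #\B)          ; #f, since #\a is 97 and #\B is 66
(char<=? #\a #\a #\b)     ; #t
(char>? #\c #\b #\a)      ; #t
(char>=? #\b #\b #\a)     ; #t

;; Sorting by scalar value puts uppercase ASCII before lowercase.
(define sorted (sort (list #\b #\A #\a #\B) char<?))
sorted                    ; '(#\A #\B #\a #\b)
```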
char-ci=? returns #t if all of the arguments are eqv? after
locale-insensitive case-folding via char-foldcase, #f otherwise.
char-ci<? is like char<?, but checks whether the arguments would be
in increasing order if each was first case-folded using char-foldcase
(which is locale-insensitive).
char-ci<=? is like char-ci<?, but checks whether the arguments would
be nondecreasing after case-folding.
char-ci>? is like char-ci<?, but checks whether the arguments would
be decreasing after case-folding.
char-ci>=? is like char-ci<?, but checks whether the arguments would
be nonincreasing after case-folding.
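The difference between the case-sensitive and case-insensitive comparisons can be sketched as follows. The helper ci-equal? is a hypothetical name that approximates the two-argument case of char-ci=? in terms of char-foldcase; it is not how Racket necessarily implements it.

```racket
#lang racket/base

(char-ci=? #\a #\A)        ; #t
(char<? #\a #\B)           ; #f, compares 97 with 66
(char-ci<? #\a #\B)        ; #t, compares the folded characters a and b
(char-ci<=? #\A #\a #\b)   ; #t
(char-ci>? #\B #\a)        ; #t
(char-ci>=? #\b #\B #\a)   ; #t

;; Hypothetical helper: fold both characters, then compare with eqv?.
(define (ci-equal? c1 c2)
  (eqv? (char-foldcase c1) (char-foldcase c2)))

(ci-equal? #\a #\A)        ; #t
```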
3.5.3 Classifications
char-alphabetic? returns #t if char has the Unicode “Alphabetic”
property, #f otherwise.
char-lower-case? returns #t if char has the Unicode “Lowercase”
property, #f otherwise.
char-upper-case? returns #t if char has the Unicode “Uppercase”
property, #f otherwise.
char-title-case? returns #t if char’s Unicode general category is
Lt, #f otherwise.
char-numeric? returns #t if char has the Unicode “Numeric”
property, #f otherwise.
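A few sample calls may make the letter and number predicates above more concrete. U+01C5 is used as an example of a character whose general category is Lt.

```racket
#lang racket/base

(char-alphabetic? #\a)       ; #t
(char-alphabetic? #\1)       ; #f
(char-lower-case? #\a)       ; #t
(char-upper-case? #\A)       ; #t
(char-upper-case? #\a)       ; #f
(char-title-case? #\u01C5)   ; #t, U+01C5 has general category Lt
(char-title-case? #\A)       ; #f, #\A has general category Lu
(char-numeric? #\5)          ; #t
(char-numeric? #\a)          ; #f
```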
char-symbolic? returns #t if char’s Unicode general category is
Sm, Sc, Sk, or So, #f otherwise.
char-punctuation? returns #t if char’s Unicode general category is
Pc, Pd, Ps, Pe, Pi, Pf, or Po, #f otherwise.
char-graphic? returns #t if char’s Unicode general category is
Ll, Lm, Lo, Lt, Lu, Nd, Nl, No, Mn, Mc, or Me, or if one of the
following produces #t when applied to char: char-alphabetic?,
char-numeric?, char-symbolic?, or char-punctuation?. It returns #f
otherwise.
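The symbol, punctuation, and graphic predicates can be illustrated with a few ASCII characters. The category named in each comment is that character’s Unicode general category.

```racket
#lang racket/base

(char-symbolic? #\+)      ; #t, general category Sm
(char-symbolic? #\$)      ; #t, general category Sc
(char-punctuation? #\!)   ; #t, general category Po
(char-punctuation? #\-)   ; #t, general category Pd
(char-punctuation? #\+)   ; #f
(char-graphic? #\a)       ; #t
(char-graphic? #\!)       ; #t, because char-punctuation? holds
(char-graphic? #\space)   ; #f, Zs is not a graphic category
```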
char-whitespace? returns #t if char has the Unicode “White_Space”
property, #f otherwise.
char-blank? returns #t if char’s Unicode general category is Zs or
if char is #\tab, #f otherwise. (These correspond to horizontal
whitespace.)
char-iso-control? returns #t if char is between #\nul and #\u001F
inclusive or between #\rubout and #\u009F inclusive, #f otherwise.
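The whitespace and control predicates differ mainly in how they treat line breaks, as this short sketch suggests.

```racket
#lang racket/base

(char-whitespace? #\space)     ; #t
(char-whitespace? #\newline)   ; #t
(char-whitespace? #\a)         ; #f
(char-blank? #\space)          ; #t
(char-blank? #\tab)            ; #t
(char-blank? #\newline)        ; #f, a newline is not horizontal whitespace
(char-iso-control? #\nul)      ; #t
(char-iso-control? #\rubout)   ; #t
(char-iso-control? #\a)        ; #f
```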
char-general-category returns a symbol representing the character’s
Unicode general category, which is one of 'lu, 'll, 'lt, 'lm, 'lo,
'mn, 'mc, 'me, 'nd, 'nl, 'no, 'ps, 'pe, 'pi, 'pf, 'pd, 'pc, 'po,
'sc, 'sm, 'sk, 'so, 'zs, 'zp, 'zl, 'cc, 'cf, 'cs, 'co, or 'cn.
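The category symbols are the lowercased two-letter Unicode category names. The sketch below shows a few results and a hypothetical helper, letter?, that groups the five letter categories; that helper is an illustration and not part of Racket.

```racket
#lang racket/base

(char-general-category #\A)        ; 'lu
(char-general-category #\a)        ; 'll
(char-general-category #\5)        ; 'nd
(char-general-category #\space)    ; 'zs
(char-general-category #\+)        ; 'sm
(char-general-category #\()        ; 'ps
(char-general-category #\newline)  ; 'cc

;; Hypothetical helper: #t when the character is in a letter category.
(define (letter? c)
  (and (memq (char-general-category c) '(lu ll lt lm lo)) #t))

(letter? #\a)   ; #t
(letter? #\5)   ; #f
```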
make-known-char-range-list produces a list of three-element lists,
where each three-element list represents a set of consecutive code
points for which the Unicode standard specifies character properties.
Each three-element list contains two integers and a boolean: the
first integer is a starting code-point value (inclusive), the second
integer is an ending code-point value (inclusive), and the boolean is
#t when all characters in the code-point range have identical results
for all of the character predicates above. The three-element lists
are ordered in the overall result list such that later lists
represent larger code-point values, and each three-element list is
separated from every other by at least one code-point value that is
not specified by Unicode.
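One possible use of the range list is to test whether a code point falls in a range that Unicode specifies. The sketch below assumes only the list shape described above; code-point-known? is a hypothetical helper, and the exact ranges depend on the Unicode version that the running Racket supports, so no range values are shown.

```racket
#lang racket/base

(define ranges (make-known-char-range-list))

;; Hypothetical helper: #t if code point k falls inside one of the
;; (start end uniform?) ranges, with both ends inclusive.
(define (code-point-known? k)
  (for/or ([r (in-list ranges)])
    (<= (car r) k (cadr r))))

;; Each entry is a list of two integers and a boolean.
(car ranges)
(code-point-known? (char->integer #\A))
```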
3.5.4 Character Conversions
char-upcase produces a character consistent with the 1-to-1 code
point mapping defined by Unicode. If char has no upcase mapping,
char-upcase produces char.
String procedures, such as string-upcase, handle the case where
Unicode defines a locale-independent mapping from the code point to a
code-point sequence (in addition to the 1-to-1 mapping on scalar
values).
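The contrast between the 1-to-1 character mapping and the string mapping can be seen with U+00DF (ß), whose uppercase form under the string mapping is the two-character sequence SS. Escapes are used in place of literal non-ASCII characters so the example does not depend on source-file encoding.

```racket
#lang racket/base

(char-upcase #\a)              ; #\A
(char-upcase #\u03BB)          ; U+039B, capital lambda
(char-upcase #\space)          ; #\space, no upcase mapping
(char-upcase #\u00DF)          ; U+00DF unchanged, no 1-to-1 upcase mapping
(string-upcase "Stra\u00DFe")  ; "STRASSE", the string mapping expands U+00DF
```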
char-downcase is like char-upcase, but for the Unicode downcase
mapping.
char-titlecase is like char-upcase, but for the Unicode titlecase
mapping.
char-foldcase is like char-upcase, but for the Unicode case-folding
mapping.
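A short sketch of the remaining conversions. The sigma examples show why case-folding differs from downcasing: both capital sigma (U+03A3) and final sigma (U+03C2) fold to small sigma (U+03C3).

```racket
#lang racket/base

(char-downcase #\A)         ; #\a
(char-downcase #\1)         ; #\1, no downcase mapping
(char-titlecase #\a)        ; #\A
(char-titlecase #\u01C6)    ; U+01C5, the titlecase form of U+01C6
(char-foldcase #\A)         ; #\a
(char-foldcase #\u03A3)     ; U+03C3, from capital sigma
(char-foldcase #\u03C2)     ; U+03C3, from final sigma
```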