Exercise 1.2.4
Characters
Strings are specialized vectors of element-type character, which is a fancy way to say that the type of strings in Lisp descends from sequences, and can only contain character objects in a flat, one-dimensional array. In Lisp, these character elements are atomic, self-evaluating objects in their own right, and have a distinct syntax that is not always the same as the glyph that may be used to represent them in a string.
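The relationship described above can be checked with a few standard predicates. This is a minimal sketch rather than part of the exercise; the results in the comments are what a conforming implementation such as SBCL should print.

```lisp
;; A string is a one-dimensional array (a vector) whose elements are characters.
(stringp "hello")                 ; => T
(vectorp "hello")                 ; => T
(typep "hello" 'sequence)         ; => T
(array-rank "hello")              ; => 1
(every #'characterp "hello")      ; => T

;; Each element is an atom, and evaluating a character returns the same object.
(let ((c (char "hello" 0)))
  (list (atom c) (eql c (eval c)))) ; => (T T)
```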
Characters are referred to as "unitary tokens" in the Standard, and there are actually two kinds of character objects in Common Lisp: graphic characters and non-graphic characters. Only the graphic character objects are associated with a glyph; all other characters are required by the ANSI Standard to have a name. The space character has two equal representations, one named and one that uses the space character itself. So even though it could be lumped with the non-graphic characters, it is specifically defined as a graphic character object because it does have a glyph associated with it: the empty glyph.
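As a rough illustration of the graphic/non-graphic split, the standard functions graphic-char-p and char-name can be asked about a few characters. The sketch uses the character syntax introduced in the next paragraph, and the results are shown as SBCL prints them; graphic-char-p is only guaranteed to return a true value, not specifically T.

```lisp
;; Graphic characters have a glyph; space counts as graphic.
(graphic-char-p #\a)        ; => T
(graphic-char-p #\Space)    ; => T
(graphic-char-p #\Newline)  ; => NIL

;; Space and Newline have names required by the Standard.
(char-name #\Space)         ; => "Space"
(char-name #\Newline)       ; => "Newline"
```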
At the REPL, you can use the Sharpsign-Backslash syntax to refer to a literal character object. Most of the characters that you can type with your keymap set to US-English can be entered using the Sharpsign-Backslash followed by the character itself. Some characters also have names so that they are easier to type:
#\a
#\A
#\Space
#\Newline
#\Tab
In the code above, you can see both kinds of character objects being used, graphic and non-graphic. Notice that a single character written after the Sharpsign-Backslash is case-sensitive. Typing #\a gives you the lower-case letter a character object, while typing #\A gives you the upper-case letter A character object, just as you would expect. This behaviour does differ from symbols, which the reader folds to upper case by default, and even from implementation-specific Unicode character representations, which we'll see in the next exercise.
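A short sketch of the case rules, assuming the default readtable: a single character after the Sharpsign-Backslash keeps its case, a character name is matched without regard to case, and symbols are upcased by the reader.

```lisp
;; Single characters are case-sensitive.
(char= #\a #\A)              ; => NIL
(char-equal #\a #\A)         ; => T   (char-equal ignores case)

;; Character names are not case-sensitive.
(char= #\Space #\space)      ; => T
(char= #\Newline #\NEWLINE)  ; => T

;; Symbols are folded to upper case by the default reader.
(eq 'hello 'HELLO)           ; => T
```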
As it turns out, you can easily get a list of characters from a string. Try this out:
(coerce "hello, multiverse!" 'list)
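If you want to experiment further, the conversion also works in the other direction. The forms below are an optional sketch and are not part of the expected output listed for this exercise.

```lisp
;; A list of characters can be coerced back into a string.
(coerce '(#\h #\i) 'string)                    ; => "hi"

;; CHAR fetches one character by zero-based index, without building a list.
(char "hello" 1)                               ; => #\e

;; The list has one element per character, including the comma and the space.
(length (coerce "hello, multiverse!" 'list))   ; => 18
```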
What You Should See
* #\a
#\a
* #\A
#\A
* #\Space
#\
* #\Newline
#\Newline
* #\Tab
#\Tab
* (coerce "hello, multiverse!" 'list)
(#\h #\e #\l #\l #\o #\, #\  #\m #\u #\l #\t #\i #\v #\e #\r #\s #\e #\!)
Did you notice that when you enter the #\Space character at the REPL, it returns the graphic representation and not the named one? That is the expected behaviour, but the named representation is generally considered easier to identify in source code.
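The printer's choice can be seen more clearly by capturing the printed form as a string. This is a small sketch using the standard prin1-to-string and princ-to-string functions; in the results, the doubled backslash is just how a backslash appears inside a printed string.

```lisp
;; Readable printing uses the glyph for space, but the name for newline.
(prin1-to-string #\Space)     ; => "#\\ "
(prin1-to-string #\Newline)   ; => "#\\Newline"

;; Without escaping, a character prints as itself.
(princ-to-string #\Space)     ; => " "

;; The named form and the space inside a string are the same character.
(char= #\Space (char " " 0))  ; => T
```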