2.3.3.2 General Escape Syntax
.............................
In addition to the specific escape sequences for important control
characters, Emacs provides several types of escape syntax that you can
use to specify non-ASCII text characters.

Firstly, you can specify characters by their Unicode code points.
‘?\uNNNN’ represents a character with Unicode code point ‘U+NNNN’, where
NNNN is (by convention) a hexadecimal number with exactly four digits.
The backslash indicates that the subsequent characters form an escape
sequence, and the ‘u’ specifies a Unicode escape sequence.
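
As a small illustration of the four-digit form, the following sketch relies only on the fact that a character literal in Emacs Lisp reads as an integer, so it can be compared with '=' against a number or another literal. The particular code points are chosen for illustration.

```elisp
;; Each form evaluates to t.
(= ?\u0041 ?A)        ; U+0041 is LATIN CAPITAL LETTER A
(= ?\u00e0 #xe0)      ; U+00E0 is LATIN SMALL LETTER A WITH GRAVE
(= ?\u03bb 955)       ; U+03BB, given here as a decimal integer
```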

There is a slightly different syntax for specifying Unicode
characters with code points higher than ‘U+FFFF’: ‘?\U00NNNNNN’
represents the character with code point ‘U+NNNNNN’, where NNNNNN is a
six-digit hexadecimal number. The Unicode Standard only defines code
points up to ‘U+10FFFF’, so if you specify a code point higher than
that, Emacs signals an error.
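
The eight-digit form and its upper limit can be sketched as follows. The helper 'my-char-syntax-readable-p' is a hypothetical name introduced only for this example; it passes the text of a character literal to 'read' and reports whether the reader accepted it.

```elisp
;; Character literals are integers, so `=' compares them directly.
(= ?\U0001F600 #x1F600)   ; t: U+1F600 needs the eight-digit \U form
(= ?\U000000E0 ?\u00e0)   ; t: \U also accepts code points below U+10000

(defun my-char-syntax-readable-p (syntax)
  "Return t if the Lisp reader accepts the string SYNTAX as a character.
Return nil if reading SYNTAX signals an error.  Hypothetical helper."
  (condition-case nil
      (integerp (read syntax))
    (error nil)))

(my-char-syntax-readable-p "?\\U0010FFFF")  ; t: highest Unicode code point
(my-char-syntax-readable-p "?\\U00110000")  ; nil: above U+10FFFF
```

The backslash is doubled inside the strings so that 'read' sees the escape itself rather than an already-converted character.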

Secondly, you can specify characters by their hexadecimal character
codes. A hexadecimal escape sequence consists of a backslash, ‘x’, and
the hexadecimal character code. Thus, ‘?\x41’ is the character ‘A’,
‘?\x1’ is the character ‘C-a’, and ‘?\xe0’ is the character ‘à’ (‘a’
with grave accent). You can use any number of hex digits, so you can
represent any character code in this way.
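
The hexadecimal examples above can be checked directly. This is a minimal sketch; the last form is an extra illustration, not taken from the text, of a code beyond four hex digits.

```elisp
;; Each form evaluates to t.
(= ?\x41 ?A)                 ; hex 41 is decimal 65
(= ?\x1 ?\C-a)               ; hex 1 is the control character C-a
(= ?\xe0 ?\u00e0)            ; same character as the Unicode escape
(= ?\x1F600 ?\U0001F600)     ; no fixed digit count is required
```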

Thirdly, you can specify characters by their character code in octal.
An octal escape sequence consists of a backslash followed by up to three
octal digits; thus, ‘?\101’ for the character ‘A’, ‘?\001’ for the
character ‘C-a’, and ‘?\002’ for the character ‘C-b’. Only characters
up to octal code 777 can be specified this way.
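
The octal forms can be compared in the same way. In this sketch the decimal values on the right are written out only to make the arithmetic visible.

```elisp
;; Each form evaluates to t.
(= ?\101 65)        ; octal 101 = 64 + 1, the character A
(= ?\001 ?\C-a)     ; octal 001 is C-a
(= ?\002 ?\C-b)     ; octal 002 is C-b
(= ?\777 511)       ; the largest three-digit octal code
```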

These escape sequences may also be used in strings. See Non-ASCII
in Strings.
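
A brief sketch of the same escapes inside string constants follows. It assumes nothing beyond the standard functions 'string', 'string=' and 'length'. A hexadecimal escape in a string absorbs every hex digit that follows it, so the example places that escape at the end of the string.

```elisp
(string= "\u00e0" (string ?\u00e0))   ; t: the same one-character string
(string= "\101\x42" "AB")             ; t: octal 101 is A, hex 42 is B
(length "\U0001F600")                 ; 1: a single character
```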