elisp: Syntax Class Table
34.2.1 Table of Syntax Classes
------------------------------
Here is a table of syntax classes, the characters that designate them,
their meanings, and examples of their use.
Whitespace characters: ‘ ’ or ‘-’
Characters that separate symbols and words from each other.
Typically, whitespace characters have no other syntactic
significance, and multiple whitespace characters are syntactically
equivalent to a single one. Space, tab, and formfeed are
classified as whitespace in almost all major modes.
This syntax class can be designated by either ‘ ’ or ‘-’. Both
designators are equivalent.
Word constituents: ‘w’
Parts of words in human languages. These are typically used in
variable and command names in programs. All upper- and lower-case
letters, and the digits, are typically word constituents.
Symbol constituents: ‘_’
Extra characters used in variable and command names along with word
constituents. Examples include the characters ‘$&*+-_<>’ in Lisp
mode, which may be part of a symbol name even though they are not
part of English words. In standard C, the only
non-word-constituent character that is valid in symbols is
underscore (‘_’).
Punctuation characters: ‘.’
Characters used as punctuation in a human language, or used in a
programming language to separate symbols from one another. Some
programming language modes, such as Emacs Lisp mode, have no
characters in this class since the few characters that are not
symbol or word constituents all have other uses. Other programming
language modes, such as C mode, use punctuation syntax for
operators.
Open parenthesis characters: ‘(’
Close parenthesis characters: ‘)’
Characters used in dissimilar pairs to surround sentences or
expressions. Such a grouping is begun with an open parenthesis
character and terminated with a close. Each open parenthesis
character matches a particular close parenthesis character, and
vice versa. Normally, Emacs indicates momentarily the matching
open parenthesis when you insert a close parenthesis.
Blinking.
In human languages, and in C code, the parenthesis pairs are ‘()’,
‘[]’, and ‘{}’. In Emacs Lisp, the delimiters for lists and
vectors (‘()’ and ‘[]’) are classified as parenthesis characters.
String quotes: ‘"’
Characters used to delimit string constants. The same string quote
character appears at the beginning and the end of a string. Such
quoted strings do not nest.
The parsing facilities of Emacs consider a string as a single
token. The usual syntactic meanings of the characters in the
string are suppressed.
The Lisp modes have two string quote characters: double-quote (‘"’)
and vertical bar (‘|’). ‘|’ is not used in Emacs Lisp, but it is
used in Common Lisp. C also has two string quote characters:
double-quote for strings, and apostrophe (‘'’) for character
constants.
Human text has no string quote characters. We do not want
quotation marks to turn off the usual syntactic properties of other
characters in the quotation.
Escape-syntax characters: ‘\’
Characters that start an escape sequence, such as is used in string
and character constants. The character ‘\’ belongs to this class
in both C and Lisp. (In C, it is used thus only inside strings,
but it turns out to cause no trouble to treat it this way
throughout C code.)
Characters in this class count as part of words if
‘words-include-escapes’ is non-‘nil’. Word Motion.
Character quotes: ‘/’
Characters used to quote the following character so that it loses
its normal syntactic meaning. This differs from an escape
character in that only the character immediately following is ever
affected.
Characters in this class count as part of words if
‘words-include-escapes’ is non-‘nil’. Word Motion.
This class is used for backslash in TeX mode.
Paired delimiters: ‘$’
Similar to string quote characters, except that the syntactic
properties of the characters between the delimiters are not
suppressed. Only TeX mode uses a paired delimiter presently—the
‘$’ that both enters and leaves math mode.
Expression prefixes: ‘'’
Characters used for syntactic operators that are considered as part
of an expression if they appear next to one. In Lisp modes, these
characters include the apostrophe, ‘'’ (used for quoting), the
comma, ‘,’ (used in macros), and ‘#’ (used in the read syntax for
certain data types).
Comment starters: ‘<’
Comment enders: ‘>’
Characters used in various languages to delimit comments. Human
text has no comment characters. In Lisp, the semicolon (‘;’)
starts a comment and a newline or formfeed ends one.
Inherit standard syntax: ‘@’
This syntax class does not specify a particular syntax. It says to
look in the standard syntax table to find the syntax of this
character.
Generic comment delimiters: ‘!’
Characters that start or end a special kind of comment. _Any_
generic comment delimiter matches _any_ generic comment delimiter,
but they cannot match a comment starter or comment ender; generic
comment delimiters can only match each other.
This syntax class is primarily meant for use with the
‘syntax-table’ text property (Syntax Properties). You can
mark any range of characters as forming a comment, by giving the
first and last characters of the range ‘syntax-table’ properties
identifying them as generic comment delimiters.
Generic string delimiters: ‘|’
Characters that start or end a string. This class differs from the
string quote class in that _any_ generic string delimiter can match
any other generic string delimiter; but they do not match ordinary
string quote characters.
This syntax class is primarily meant for use with the
‘syntax-table’ text property (Syntax Properties). You can
mark any range of characters as forming a string constant, by
giving the first and last characters of the range ‘syntax-table’
properties identifying them as generic string delimiters.