34.7 Syntax Table Internals
===========================
Syntax tables are implemented as char-tables (see Char-Tables), but
most Lisp programs don’t work directly with their elements. Syntax
tables do not store syntax data as syntax descriptors (see Syntax
Descriptors); they use an internal format, which is documented in this
section. This internal format can also be assigned as syntax properties
(see Syntax Properties).
Each entry in a syntax table is a “raw syntax descriptor”: a cons
cell of the form ‘(SYNTAX-CODE . MATCHING-CHAR)’. SYNTAX-CODE is an
integer which encodes the syntax class and syntax flags, according to
the table below. MATCHING-CHAR, if non-‘nil’, specifies a matching
character (similar to the second character in a syntax descriptor).
Here are the syntax codes corresponding to the various syntax
classes:
     Code   Class                 Code   Class
     0      whitespace            8      paired delimiter
     1      punctuation           9      escape
     2      word                  10     character quote
     3      symbol                11     comment-start
     4      open parenthesis      12     comment-end
     5      close parenthesis     13     inherit
     6      expression prefix     14     generic comment
     7      string quote          15     generic string
For example, in the standard syntax table, the entry for ‘(’ is ‘(4 .
41)’. 41 is the character code for ‘)’.
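As a minimal sketch of what such an entry looks like, the following reads raw syntax descriptors straight out of the standard syntax table with ‘aref’. The helper name ‘my-raw-syntax’ is hypothetical and exists only for this illustration.

```elisp
;; Hypothetical helper: fetch the raw syntax descriptor for CHAR
;; from the standard syntax table (a char-table).
(defun my-raw-syntax (char)
  "Return the raw syntax descriptor for CHAR in the standard syntax table."
  (aref (standard-syntax-table) char))

(my-raw-syntax ?\()        ; ⇒ (4 . 41)  open parenthesis, matches `)'
(car (my-raw-syntax ?\())  ; ⇒ 4         the SYNTAX-CODE
(cdr (my-raw-syntax ?\())  ; ⇒ 41        the MATCHING-CHAR, i.e. ?\)
```

Most programs should prefer the higher-level functions described in this section over indexing the char-table directly.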
Syntax flags are encoded in higher order bits, starting 16 bits from
the least significant bit. This table gives the power of two which
corresponds to each syntax flag.
     Prefix   Flag             Prefix   Flag
     ‘1’      ‘(lsh 1 16)’     ‘p’      ‘(lsh 1 20)’
     ‘2’      ‘(lsh 1 17)’     ‘b’      ‘(lsh 1 21)’
     ‘3’      ‘(lsh 1 18)’     ‘n’      ‘(lsh 1 22)’
     ‘4’      ‘(lsh 1 19)’
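The arithmetic behind this encoding can be sketched as follows. The names ‘my-syntax-flag-p’ and ‘my-demo-code’ are hypothetical, and the sketch uses ‘ash’ in place of ‘lsh’; for the small non-negative values involved the two give the same result.

```elisp
;; Hypothetical helper: test one flag bit of a syntax code.
(defun my-syntax-flag-p (code bit)
  "Return non-nil if syntax CODE has the flag stored at bit position BIT."
  (/= 0 (logand code (ash 1 bit))))

;; Punctuation (class 1) combined with the flags `1' and `2'.
(defconst my-demo-code (logior 1 (ash 1 16) (ash 1 17))
  "Example syntax code built by hand for illustration.")

my-demo-code                        ; ⇒ 196609
(logand my-demo-code #xffff)        ; ⇒ 1    the syntax class
(my-syntax-flag-p my-demo-code 16)  ; ⇒ t    flag `1' is set
(my-syntax-flag-p my-demo-code 20)  ; ⇒ nil  flag `p' is not set
```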
-- Function: string-to-syntax desc
Given a syntax descriptor DESC (a string), this function returns
the corresponding raw syntax descriptor.
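A few illustrative calls, shown as a sketch with descriptors chosen only for this example:

```elisp
;; Open parenthesis whose matching character is `)'.
(string-to-syntax "()")    ; ⇒ (4 . 41)

;; Word constituent, no matching character, no flags.
(string-to-syntax "w")     ; ⇒ (2)

;; String quote.
(string-to-syntax "\"")    ; ⇒ (7)

;; Punctuation with flags `2' and `3':
;; 1 + (ash 1 17) + (ash 1 18) = 393217.
(string-to-syntax ". 23")  ; ⇒ (393217)
```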
-- Function: syntax-after pos
This function returns the raw syntax descriptor for the character
in the buffer after position POS, taking account of syntax
properties as well as the syntax table. If POS is outside the
buffer’s accessible portion (see Narrowing), the return value is
‘nil’.
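The following sketch exercises these cases in a scratch buffer. The function name ‘my-syntax-after-demo’ is hypothetical. The sketch binds ‘parse-sexp-lookup-properties’ to ‘t’ for the last lookup, on the assumption that syntax properties are consulted only when that variable is non-‘nil’.

```elisp
;; Hypothetical demo: collect raw descriptors from a temporary buffer.
(defun my-syntax-after-demo ()
  "Return a list of raw syntax descriptors taken from a scratch buffer."
  (with-temp-buffer
    (set-syntax-table (standard-syntax-table))
    (insert "(a)")
    (list
     (syntax-after 1)                   ; the character `('
     (syntax-after 2)                   ; the character `a'
     (syntax-after 100)                 ; outside the accessible portion
     ;; Override `a' with a punctuation syntax property.
     (let ((parse-sexp-lookup-properties t))
       (put-text-property 2 3 'syntax-table (string-to-syntax "."))
       (syntax-after 2)))))

(my-syntax-after-demo)  ; ⇒ ((4 . 41) (2) nil (1))
```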
-- Function: syntax-class syntax
This function returns the syntax code for the raw syntax descriptor
SYNTAX. More precisely, it takes the raw syntax descriptor’s
SYNTAX-CODE component, masks off the high-order bits (bit 16 and up) that record the
syntax flags, and returns the resulting integer.
If SYNTAX is ‘nil’, the return value is ‘nil’. This is so
that the expression
(syntax-class (syntax-after pos))
evaluates to ‘nil’ if ‘pos’ is outside the buffer’s accessible
portion, without signaling an error or returning an incorrect code.
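As a sketch of how this ‘nil’ handling can be used, the hypothetical predicate ‘my-word-char-at-p’ below tests for a word constituent (syntax code 2) and needs no separate bounds check.

```elisp
;; The flags are stripped; only the class code remains.
(syntax-class '(4 . 41))                  ; ⇒ 4
(syntax-class (string-to-syntax ". 23"))  ; ⇒ 1
(syntax-class nil)                        ; ⇒ nil

;; Hypothetical predicate: is the character after POS a word constituent?
(defun my-word-char-at-p (pos)
  "Return non-nil if the character after POS has word syntax."
  (eq (syntax-class (syntax-after pos)) 2))

(with-temp-buffer
  (set-syntax-table (standard-syntax-table))
  (insert "(a)")
  (list (my-word-char-at-p 1)      ; `(' is not a word constituent
        (my-word-char-at-p 2)      ; `a' is a word constituent
        (my-word-char-at-p 100)))  ; out of range, still no error
;; ⇒ (nil t nil)
```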