elisp: Case Tables
4.9 The Case Table
==================
You can customize case conversion by installing a special “case table”.
A case table specifies the mapping between upper case and lower case
letters. It affects both the case conversion functions for Lisp objects
(see the previous section) and those that apply to text in the buffer
(see Case Changes). Each buffer has a case table; there is also a
standard case table which is used to initialize the case table of new
buffers.
A case table is a char-table (see Char-Tables) whose subtype is
‘case-table’. This char-table maps each character into the
corresponding lower case character. It has three extra slots, which
hold related tables:
UPCASE
The upcase table maps each character into the corresponding upper
case character.
CANONICALIZE
The canonicalize table maps all of a set of case-related characters
into a particular member of that set.
EQUIVALENCES
The equivalences table maps each one of a set of case-related
characters into the next character in that set.
In simple cases, all you need to specify is the mapping to
lower-case; the three related tables will be calculated automatically
from that one.
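As a rough illustration of this layout, the following sketch inspects
the standard case table as an ordinary char-table. The function name
‘my-case-table-summary’ is hypothetical. The sketch assumes that the
UPCASE table is stored in extra slot 0, which is not stated above, and
that no language environment has changed the mappings of ‘a’ and ‘A’.

```elisp
;; Hypothetical helper: summarize the layout of the standard case table.
(defun my-case-table-summary ()
  "Return the subtype and two sample mappings of the standard case table."
  (let* ((table (standard-case-table))
         ;; Assumption: extra slot 0 holds the UPCASE table.
         (up (char-table-extra-slot table 0)))
    (list (char-table-subtype table)  ; subtype symbol of the char-table
          (aref table ?A)             ; lower case mapping of ?A
          (aref up ?a))))             ; upper case mapping of ?a
```

Evaluating ‘(my-case-table-summary)’ returns a list whose first element
is the subtype, followed by the character codes that ‘A’ downcases to
and ‘a’ upcases to.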
For some languages, upper and lower case letters are not in
one-to-one correspondence. There may be two different lower case
letters with the same upper case equivalent. In these cases, you need
to specify the maps for both lower case and upper case.
The extra table CANONICALIZE maps each character to a canonical
equivalent; any two characters that are related by case-conversion have
the same canonical equivalent character. For example, since ‘a’ and ‘A’
are related by case-conversion, they should have the same canonical
equivalent character (which should be either ‘a’ for both of them, or
‘A’ for both of them).
The extra table EQUIVALENCES is a map that cyclically permutes each
equivalence class (of characters with the same canonical equivalent).
(For ordinary ASCII, this would map ‘a’ into ‘A’ and ‘A’ into ‘a’, and
likewise for each set of equivalent characters.)
When constructing a case table, you can provide ‘nil’ for
CANONICALIZE; then Emacs fills in this slot from the lower case and
upper case mappings. You can also provide ‘nil’ for EQUIVALENCES; then
Emacs fills in this slot from CANONICALIZE. In a case table that is
actually in use, those components are non-‘nil’. Do not try to specify
EQUIVALENCES without also specifying CANONICALIZE.
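The automatic filling of these slots can be observed with a sketch like
the one below. It relies on ‘copy-case-table’, which is not described in
this section, and assumes that the copy starts with an empty
CANONICALIZE slot and that this slot is extra slot 1. The function name
‘my-case-table-fill-demo’ is hypothetical.

```elisp
;; Hypothetical helper: show CANONICALIZE being filled in on installation.
(defun my-case-table-fill-demo ()
  "Return (BEFORE AFTER) for the CANONICALIZE slot of a copied case table."
  (let ((table (copy-case-table (standard-case-table))))
    (list
     ;; Assumption: extra slot 1 is CANONICALIZE, and the copy leaves
     ;; it nil until the table is installed.
     (char-table-extra-slot table 1)
     (with-temp-buffer
       ;; Installing the table makes Emacs compute the missing slots.
       (set-case-table table)
       (let ((canon (char-table-extra-slot table 1)))
         ;; ?a and ?A are case-related, so they should share one
         ;; canonical equivalent.
         (and (char-table-p canon)
              (eq (aref canon ?a) (aref canon ?A))))))))
```

The temporary buffer keeps the experiment from changing the case table
of any buffer that is in use.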
Here are the functions for working with case tables:
-- Function: case-table-p object
This predicate returns non-‘nil’ if OBJECT is a valid case table.
-- Function: set-standard-case-table table
This function makes TABLE the standard case table, so that it will
be used in any buffers created subsequently.
-- Function: standard-case-table
This returns the standard case table.
-- Function: current-case-table
This function returns the current buffer’s case table.
-- Function: set-case-table table
This sets the current buffer’s case table to TABLE.
-- Macro: with-case-table table body...
The ‘with-case-table’ macro saves the current case table, makes
TABLE the current case table, evaluates the BODY forms, and finally
restores the case table. The return value is the value of the last
form in BODY. The case table is restored even in case of an
abnormal exit via ‘throw’ or error (see Nonlocal Exits).
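The functions above might be combined as in the following sketch, which
installs a private case table in a temporary buffer and then switches
tables only for the duration of a ‘with-case-table’ body. The function
name ‘my-case-table-switch-demo’ is hypothetical, and ‘copy-case-table’
is used on the assumption that it returns an independent copy.

```elisp
;; Hypothetical helper: exercise set-case-table and with-case-table.
(defun my-case-table-switch-demo ()
  "Return a list of four checks; each is non-nil if the step behaved as expected."
  (with-temp-buffer
    (let ((table (copy-case-table (standard-case-table))))
      (set-case-table table)
      (list
       (case-table-p table)
       ;; The buffer now uses the table we installed.
       (eq (current-case-table) table)
       ;; Inside the body, the standard table is current.
       (with-case-table (standard-case-table)
         (eq (current-case-table) (standard-case-table)))
       ;; Afterwards, our table is current again.
       (eq (current-case-table) table)))))
```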
Some language environments modify the case conversions of ASCII
characters; for example, in the Turkish language environment, the ASCII
capital I is downcased into a Turkish dotless i (‘ı’). This can
interfere with code that requires ordinary ASCII case conversion, such
as implementations of ASCII-based network protocols. In that case, use
the ‘with-case-table’ macro with the variable ‘ascii-case-table’, which
stores the unmodified case table for the ASCII character set.
-- Variable: ascii-case-table
The case table for the ASCII character set. This should not be
modified by any language environment settings.
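For instance, code that compares protocol header names could downcase
them as sketched below. The function name ‘my-ascii-downcase’ is
hypothetical; only ‘with-case-table’, ‘ascii-case-table’ and ‘downcase’
come from Emacs itself.

```elisp
;; Hypothetical helper: downcase with plain ASCII rules, regardless
;; of the current language environment.
(defun my-ascii-downcase (string)
  "Return STRING downcased according to `ascii-case-table'."
  (with-case-table ascii-case-table
    (downcase string)))

(my-ascii-downcase "CONTENT-TYPE")  ; => "content-type"
```

Because the macro restores the previous case table afterwards, the rest
of the buffer's case conversion is unaffected.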
The following three functions are convenient subroutines for packages
that define non-ASCII character sets. They modify the specified case
table CASE-TABLE; they also modify the standard syntax table (see
Syntax Tables). Normally you would use these functions to change the
standard case table.
-- Function: set-case-syntax-pair uc lc case-table
This function specifies a pair of corresponding letters, one upper
case and one lower case.
-- Function: set-case-syntax-delims l r case-table
This function makes characters L and R a matching pair of
case-invariant delimiters.
-- Function: set-case-syntax char syntax case-table
This function makes CHAR case-invariant, with syntax SYNTAX.
-- Command: describe-buffer-case-table
This command displays a description of the contents of the current
buffer’s case table.
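As an illustration of ‘set-case-syntax-pair’, the sketch below pairs
two characters in a private copy of the standard case table, so that
the standard case table itself is left alone. The choice of characters
and the function name ‘my-case-pair-demo’ are hypothetical, and the
sketch assumes that the UPCASE table is stored in extra slot 0.

```elisp
;; Hypothetical helper: register an upper/lower case pair in a copy.
(defun my-case-pair-demo ()
  "Pair U+1E9E with U+00DF in a copied case table and check both maps."
  (let ((table (copy-case-table (standard-case-table))))
    ;; Illustrative choice: capital sharp s (U+1E9E) and ß (U+00DF).
    ;; Both are assumed to have word syntax already, so the change
    ;; this call makes to the standard syntax table should be a no-op.
    (set-case-syntax-pair ?ẞ ?ß table)
    (list
     ;; Lower case mapping of the upper case member.
     (eq (aref table ?ẞ) ?ß)
     ;; Upper case mapping of the lower case member
     ;; (assumption: extra slot 0 is the UPCASE table).
     (eq (aref (char-table-extra-slot table 0) ?ß) ?ẞ))))
```

A package that really defines new letters would normally pass the
standard case table instead of a copy, as described above.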