emacs-mime: Charset Translation
2.5 Charset Translation
=======================
During translation from MML to MIME, for each MIME part which has been
composed inside Emacs, an appropriate charset has to be chosen.
If you are running a non-MULE Emacs, this process is simple: If the
part contains any non-ASCII (8-bit) characters, the MIME charset given
by ‘mail-parse-charset’ (a symbol) is used. (Never set this variable
directly, though. If you want to change the default charset, consult
the documentation of the package you use to process MIME messages; see
“Various Message Variables” in the Message manual, for example.) If
there are only ASCII characters, the MIME charset US-ASCII is used.
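
The non-MULE rule above can be modelled in a few lines. The following is
only a sketch in Python, not the Emacs Lisp implementation: the function
name ‘choose_charset_simple’ and its ‘default_charset’ argument are
hypothetical, with ‘default_charset’ standing in for the value of
‘mail-parse-charset’.

```python
def choose_charset_simple(text: str, default_charset: str) -> str:
    """Model of the non-MULE rule: US-ASCII if possible, else the default."""
    # A part made only of ASCII characters is always labelled US-ASCII.
    if text.isascii():
        return "us-ascii"
    # Any non-ASCII character selects the configured default charset
    # (the role played by mail-parse-charset in the text above).
    return default_charset


if __name__ == "__main__":
    print(choose_charset_simple("hello", "iso-8859-1"))
    print(choose_charset_simple("na\u00efve", "iso-8859-1"))
```

Note that the default charset is returned without checking whether it can
actually represent the text; the sketch mirrors the rule as stated and adds
nothing to it.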
Things are slightly more complicated when running Emacs with MULE
support. In this case, a list of the MULE charsets used in the part is
obtained, and the MULE charsets are translated to MIME charsets by
consulting the table provided by Emacs itself (or, in XEmacs, the
variable ‘mm-mime-mule-charset-alist’). If this results in a single
MIME charset, this is used to encode the part. But if the resulting
list of MIME charsets contains more than one element, two things can
happen: If it is possible to encode the part via UTF-8, this charset is
used. (For this, Emacs must support the ‘utf-8’ coding system, and the
part must consist entirely of characters which have Unicode
counterparts.) If UTF-8 is not available for some reason, the part is
split into several parts, so that each one can be encoded with a single
MIME charset. The part can only be split at line boundaries, though: if
more than one MIME charset is required to encode a single line, it is
not possible to encode the part.
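
The decision procedure for the MULE case can be sketched as follows. This
is a simplified Python model, not the actual implementation: it tests
whether the text can be encoded in each charset of a hypothetical
‘CANDIDATES’ list instead of translating MULE charsets through the Emacs
table, and the names ‘plan_part’, ‘first_charset’ and ‘utf8_available’
are assumptions made for the illustration.

```python
# Hypothetical candidate charsets, tried in order. Emacs derives the real
# set from the MULE charsets found in the part.
CANDIDATES = ["us-ascii", "iso-8859-1", "iso-8859-7"]


def first_charset(text, candidates):
    """Return the first candidate that can encode all of TEXT, or None."""
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue
        return charset
    return None


def plan_part(text, candidates=CANDIDATES, utf8_available=True):
    """Return a list of (charset, text) pieces for one composed part."""
    # Case 1: a single MIME charset covers the whole part.
    single = first_charset(text, candidates)
    if single is not None:
        return [(single, text)]
    # Case 2: several charsets are needed, so prefer UTF-8 if available.
    if utf8_available:
        return [("utf-8", text)]
    # Case 3: split at line boundaries, one charset per piece.
    pieces = []
    for line in text.splitlines(keepends=True):
        charset = first_charset(line, candidates)
        if charset is None:
            raise ValueError("a single line needs more than one charset")
        if pieces and pieces[-1][0] == charset:
            # Merge adjacent lines that share a charset.
            pieces[-1] = (charset, pieces[-1][1] + line)
        else:
            pieces.append((charset, line))
    return pieces
```

For example, a part with one Latin-1 line and one Greek line yields a
single UTF-8 piece by default, and two pieces (one per line) when
‘utf8_available’ is false. A line that mixes both scripts raises an error
in the second case, matching the limitation described above.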
When running Emacs with MULE support, the preferences for which
coding system to use are inherited from Emacs itself. This means that if
Emacs is set up to prefer UTF-8, it will be used when encoding messages.
You can change this by altering the ‘mm-coding-system-priorities’
variable (see Encoding Customization).
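
The effect of a priority list can be illustrated with a small Python
model. It is only a sketch of the idea (try the preferred charsets in
order and take the first one that can represent the text); the function
‘pick_by_priority’ and its ‘fallback’ argument are hypothetical and do
not correspond to any Emacs function.

```python
def pick_by_priority(text, priorities, fallback="utf-8"):
    """Return the first charset in PRIORITIES able to encode TEXT."""
    for charset in priorities:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            # This charset cannot represent the text; try the next one.
            continue
        return charset
    # Nothing in the priority list fits, so use the fallback.
    return fallback


if __name__ == "__main__":
    prefs = ["iso-8859-1", "iso-8859-15", "utf-8"]
    print(pick_by_priority("caf\u00e9", prefs))
    print(pick_by_priority("caf\u00e9 \u20ac", prefs))
```

With this list, text that fits in ISO-8859-1 is not promoted to UTF-8 even
though UTF-8 could encode it, which is the kind of preference a priority
list is meant to express.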
The charset to be used can be overridden by setting the ‘charset’ MML
tag (see MML Definition) when composing the message.
The content transfer encoding (quoted-printable, 8bit, etc.) is
orthogonal to the charset choice discussed here, and is controlled by
the variables ‘mm-body-charset-encoding-alist’ and
‘mm-content-transfer-encoding-defaults’ (see Encoding
Customization).
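
To see why the two choices are independent, it may help to apply them as
separate steps. The Python sketch below uses only the standard ‘quopri’
and ‘base64’ modules; the function name ‘transfer_encodings’ is
hypothetical. The charset turns characters into bytes, and the transfer
encoding then makes those same bytes safe for transport.

```python
import base64
import quopri


def transfer_encodings(text, charset):
    """Encode TEXT in CHARSET, then apply two transfer encodings."""
    # Step 1 (charset): characters -> bytes.
    raw = text.encode(charset)
    # Step 2 (transfer encoding): bytes -> transport-safe bytes.
    return {
        "raw": raw,
        "quoted-printable": quopri.encodestring(raw),
        "base64": base64.b64encode(raw),
    }


if __name__ == "__main__":
    for name, value in transfer_encodings("caf\u00e9", "iso-8859-1").items():
        print(name, value)
```

Both transfer encodings decode back to the same bytes, so changing one
setting does not require changing the other.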