gnus: Charsets

 
 3.20 Charsets
 =============
 
 People use different charsets, and we have MIME to let us know what
 charsets they use.  Or rather, we wish we had.  Many people use
 newsreaders and mailers that do not understand or use MIME, and just
 send out messages without saying what character sets they use.  To help
 a bit with this, some local news hierarchies have policies that say what
 character set is the default.  For instance, the ‘fj’ hierarchy uses
 ‘iso-2022-jp’.
 
    This knowledge is encoded in the ‘gnus-group-charset-alist’ variable,
 which is an alist of regexps (use the first item to match full group
 names) and default charsets to be used when reading these groups.
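 
    As a rough sketch, an entry for the ‘fj’ hierarchy mentioned above
 could be added like this.  The regexp and the use of
 ‘with-eval-after-load’ on ‘gnus-sum’ (which assumes the variable is
 defined in ‘gnus-sum.el’) are illustrative assumptions, not the shipped
 default:
 
      ;; Assumption: the alist is defined once `gnus-sum' is loaded.
      (with-eval-after-load 'gnus-sum
        ;; Each element pairs a group-name regexp with a default charset.
        ;; "\\(^\\|:\\)" also matches groups prefixed by a server name.
        (add-to-list 'gnus-group-charset-alist
                     '("\\(^\\|:\\)fj\\." iso-2022-jp)))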
 
    In addition, some people do use soi-disant MIME-aware agents that
 aren’t.  These blithely mark messages as being in ‘iso-8859-1’ even if
 they really are in ‘koi-8’.  To help here, the
 ‘gnus-newsgroup-ignored-charsets’ variable can be used.  The charsets
 that are listed here will be ignored.  The variable can be set on a
 group-by-group basis using the group parameters (see Group
 Parameters).  The default value is ‘(unknown-8bit x-unknown)’, which
 includes values some agents insist on having in there.
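 
    For illustration only, the following adds ‘iso-8859-1’ to the
 ignored charsets, so that messages mislabeled in this way are decoded
 with the group’s default charset instead.  Whether this is appropriate
 depends on the groups you read; the group parameter shown in the comment
 is the per-group form as we understand it from the Group Parameters
 node:
 
      ;; Ignore bogus charset labels globally.
      (setq gnus-newsgroup-ignored-charsets
            '(unknown-8bit x-unknown iso-8859-1))
 
      ;; Per-group alternative, set as a group parameter:
      ;;   (ignored-charsets x-unknown iso-8859-1)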
 
    When posting, ‘gnus-group-posting-charset-alist’ is used to determine
 which charsets should not be encoded using the MIME encodings.  For
 instance, some hierarchies discourage using quoted-printable header
 encoding.
 
    This variable is an alist of regexps and permitted unencoded charsets
 for posting.  Each element of the alist has the form ‘(TEST HEADER
 BODY-LIST)’, where:
 
 TEST
      is either a regular expression matching the newsgroup header or a
      variable to query,
 HEADER
      is the charset which may be left unencoded in the header (‘nil’
      means encode all charsets),
 BODY-LIST
      is a list of charsets which may be encoded using 8bit
      content-transfer encoding in the body, or one of the special values
      ‘nil’ (always encode using quoted-printable) or ‘t’ (always use
      8bit).
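 
    A minimal sketch of a value for this variable follows.  The
 ‘example’ hierarchy is hypothetical; the last two entries use the
 ‘message-this-is-mail’ and ‘message-this-is-news’ variables as TEST:
 
      (setq gnus-group-posting-charset-alist
            '(;; Hypothetical hierarchy: leave iso-8859-1 unencoded in
              ;; headers and allow it as 8bit in the body.
              ("^example\\." iso-8859-1 (iso-8859-1))
              ;; Mail: encode all headers, quoted-printable body.
              (message-this-is-mail nil nil)
              ;; Other news: encode all headers, 8bit body.
              (message-this-is-news nil t)))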
 
    See Encoding Customization in the emacs-mime manual, for
 additional variables that control which MIME charsets are used when
 sending messages.
 
    Other charset tricks that may be useful, although not Gnus-specific:
 
    If there are several MIME charsets that encode the same Emacs
 charset, you can choose what charset to use by saying the following:
 
      (put-charset-property 'cyrillic-iso8859-5
                            'preferred-coding-system 'koi8-r)
 
    This means that Russian will be encoded using ‘koi8-r’ instead of the
 default ‘iso-8859-5’ MIME charset.
 
    If you want to read messages in ‘koi8-u’, you can cheat and say
 
      (define-coding-system-alias 'koi8-u 'koi8-r)
 
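    This will almost do the right thing: ‘koi8-u’ differs from ‘koi8-r’
 only in a few code points used for Ukrainian letters, and those
 characters will be decoded incorrectly.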
 
    And finally, to read charsets like ‘windows-1251’, you can say
 something like
 
      (codepage-setup 1251)
      (define-coding-system-alias 'windows-1251 'cp1251)
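 
    Some Emacs versions may already define these coding systems, and
 ‘codepage-setup’ may not be available everywhere.  A defensive variant,
 offered as a sketch rather than a recommendation, only sets things up
 when needed:
 
      (unless (coding-system-p 'windows-1251)
        ;; Only call `codepage-setup' if this Emacs provides it.
        (when (fboundp 'codepage-setup)
          (codepage-setup 1251))
        ;; Alias only if the target coding system now exists.
        (when (coding-system-p 'cp1251)
          (define-coding-system-alias 'windows-1251 'cp1251)))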