Info: (elisp) Lisp and Coding Systems

Info Catalog
elisp: Encoding and I/O
elisp: Coding Systems
elisp: User-Chosen Coding Systems
elisp: Lisp and Coding Systems

 
 32.10.3 Coding Systems in Lisp
 ------------------------------
 
 Here are the Lisp facilities for working with coding systems:
 
  -- Function: coding-system-list &optional base-only
      This function returns a list of all coding system names (symbols).
      If BASE-ONLY is non-‘nil’, the value includes only the base coding
      systems.  Otherwise, it includes alias and variant coding systems
      as well.
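
      As a rough illustration of the two forms, the sketch below
      compares both lists.  Exact counts depend on the Emacs version
      and build, so none are shown.

```elisp
(let ((all  (coding-system-list))     ; everything the function reports
      (base (coding-system-list t)))  ; base coding systems only
  (list (length all)
        (length base)
        ;; `utf-8' is a base coding system, so it appears in BASE.
        (and (memq 'utf-8 base) t)))
```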
 
  -- Function: coding-system-p object
      This function returns ‘t’ if OBJECT is ‘nil’ or the name of a
      coding system.
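
      A few sample calls may make the predicate concrete.  The symbol
      ‘no-such-coding-system’ is made up for this sketch and is assumed
      not to be defined.

```elisp
(coding-system-p 'utf-8)                  ; ⇒ t
(coding-system-p nil)                     ; ⇒ t
(coding-system-p 'no-such-coding-system)  ; ⇒ nil
(coding-system-p "utf-8")                 ; ⇒ nil, a string is not a name
```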
 
  -- Function: check-coding-system coding-system
      This function checks the validity of CODING-SYSTEM.  If that is
      valid, it returns CODING-SYSTEM.  If CODING-SYSTEM is ‘nil’, the
      function returns ‘nil’.  For any other value, it signals an error
      whose ‘error-symbol’ is ‘coding-system-error’ (see ‘signal’ in
      Signaling Errors).
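
      One way to use the signaled error is to fall back to a default.
      This is a minimal sketch; ‘my-coding-system-or-default’ and
      ‘no-such-coding-system’ are hypothetical names introduced here.

```elisp
(defun my-coding-system-or-default (name default)
  "Return NAME if it names a valid coding system, else DEFAULT.
A nil NAME is passed through as nil.
Hypothetical helper, not part of Emacs."
  (condition-case nil
      (check-coding-system name)
    ;; Signaled for anything that is neither nil nor a coding system.
    (coding-system-error default)))

(my-coding-system-or-default 'utf-8 'undecided)                  ; ⇒ utf-8
(my-coding-system-or-default 'no-such-coding-system 'undecided)  ; ⇒ undecided
```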
 
  -- Function: coding-system-eol-type coding-system
      This function returns the type of end-of-line (a.k.a. “eol”)
      conversion used by CODING-SYSTEM.  If CODING-SYSTEM specifies a
      certain eol conversion, the return value is an integer 0, 1, or 2,
      standing for ‘unix’, ‘dos’, and ‘mac’, respectively.  If
      CODING-SYSTEM doesn’t specify eol conversion explicitly, the return
      value is a vector of coding systems, each one with one of the
      possible eol conversion types, like this:
 
           (coding-system-eol-type 'latin-1)
                ⇒ [latin-1-unix latin-1-dos latin-1-mac]
 
      If this function returns a vector, Emacs will decide, as part of
      the text encoding or decoding process, what eol conversion to use.
      For decoding, the end-of-line format of the text is auto-detected,
      and the eol conversion is set to match it (e.g., DOS-style CRLF
      format will imply ‘dos’ eol conversion).  For encoding, the eol
      conversion is taken from the appropriate default coding system
      (e.g., the default value of ‘buffer-file-coding-system’ when the
      buffer’s own ‘buffer-file-coding-system’ leaves the eol conversion
      unspecified), or from the default eol conversion appropriate for
      the underlying platform.
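
      When the eol conversion is fixed, the integer return value can be
      mapped back to a symbol.  The sketch below assumes the standard
      ‘utf-8’ variants are available; ‘my-eol-type-name’ is a
      hypothetical helper.

```elisp
(coding-system-eol-type 'utf-8-unix)   ; ⇒ 0
(coding-system-eol-type 'utf-8-dos)    ; ⇒ 1
(coding-system-eol-type 'utf-8-mac)    ; ⇒ 2

(defun my-eol-type-name (coding-system)
  "Return `unix', `dos' or `mac' for CODING-SYSTEM.
Return nil if the eol conversion is left open (a vector result).
Hypothetical helper, not part of Emacs."
  (let ((eol (coding-system-eol-type coding-system)))
    (and (integerp eol)
         (aref [unix dos mac] eol))))

(my-eol-type-name 'utf-8-dos)   ; ⇒ dos
(my-eol-type-name 'utf-8)       ; ⇒ nil
```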
 
  -- Function: coding-system-change-eol-conversion coding-system eol-type
      This function returns a coding system which is like CODING-SYSTEM
      except for its eol conversion, which is specified by EOL-TYPE.
      EOL-TYPE should be ‘unix’, ‘dos’, ‘mac’, or ‘nil’.  If it is ‘nil’,
      the returned coding system determines the end-of-line conversion
      from the data.
 
      EOL-TYPE may also be 0, 1 or 2, standing for ‘unix’, ‘dos’ and
      ‘mac’, respectively.
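
      For instance, using ‘utf-8’ and its variants (a sketch, assuming
      the standard coding systems are defined):

```elisp
(coding-system-change-eol-conversion 'utf-8 'dos)        ; ⇒ utf-8-dos
(coding-system-change-eol-conversion 'utf-8 1)           ; ⇒ utf-8-dos
(coding-system-change-eol-conversion 'utf-8-dos 'unix)   ; ⇒ utf-8-unix
(coding-system-change-eol-conversion 'utf-8-unix nil)    ; ⇒ utf-8
```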
 
  -- Function: coding-system-change-text-conversion eol-coding
           text-coding
      This function returns a coding system which uses the end-of-line
      conversion of EOL-CODING, and the text conversion of TEXT-CODING.
      If TEXT-CODING is ‘nil’, it returns ‘undecided’, or one of its
      variants according to EOL-CODING.
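
      A short sketch of both cases, assuming the standard ‘iso-latin-1’
      and ‘utf-8’ coding systems:

```elisp
;; Keep the DOS line endings, switch the text conversion to UTF-8.
(coding-system-change-text-conversion 'iso-latin-1-dos 'utf-8)
     ;; ⇒ utf-8-dos

;; With a nil TEXT-CODING, only the eol conversion survives.
(coding-system-change-text-conversion 'utf-8-mac nil)
     ;; ⇒ undecided-mac
```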
 
  -- Function: find-coding-systems-region from to
      This function returns a list of coding systems that could be used
      to encode a text between FROM and TO.  All coding systems in the
      list can safely encode any multibyte characters in that portion of
      the text.
 
      If the text contains no multibyte characters, the function returns
      the list ‘(undecided)’.
 
  -- Function: find-coding-systems-string string
      This function returns a list of coding systems that could be used
      to encode the text of STRING.  All coding systems in the list can
      safely encode any multibyte characters in STRING.  If the text
      contains no multibyte characters, this returns the list
      ‘(undecided)’.
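
      The returned list can serve as a membership test.  This sketch
      assumes the list holds base coding system names, which is why the
      argument is passed through ‘coding-system-base’ first;
      ‘my-string-encodable-p’ is a hypothetical helper.

```elisp
;; ASCII-only text can be encoded by anything.
(find-coding-systems-string "hello")   ; ⇒ (undecided)

(defun my-string-encodable-p (string coding-system)
  "Return non-nil if CODING-SYSTEM can safely encode all of STRING.
Hypothetical helper, not part of Emacs."
  (let ((candidates (find-coding-systems-string string)))
    (or (equal candidates '(undecided))
        (and (memq (coding-system-base coding-system) candidates) t))))

(my-string-encodable-p "caf\u00e9" 'utf-8)              ; ⇒ t
(my-string-encodable-p "caf\u00e9" 'latin-1)            ; ⇒ t
(my-string-encodable-p "\u65e5\u672c\u8a9e" 'latin-1)   ; ⇒ nil
```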
 
  -- Function: find-coding-systems-for-charsets charsets
      This function returns a list of coding systems that could be used
      to encode all the character sets in the list CHARSETS.
 
  -- Function: check-coding-systems-region start end coding-system-list
      This function checks whether the coding systems in the list
      CODING-SYSTEM-LIST can encode all the characters in the region
      between START and END.  If all of the coding systems in the list
      can encode the specified text, the function returns ‘nil’.  If some
      coding systems cannot encode some of the characters, the value is
      an alist, each element of which has the form ‘(CODING-SYSTEM1 POS1
      POS2 ...)’, meaning that CODING-SYSTEM1 cannot encode characters at
      buffer positions POS1, POS2, ....
 
      START may be a string, in which case END is ignored and the
      returned value references string indices instead of buffer
      positions.
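
      The string form is convenient for a quick check.  In this sketch
      END is given as ‘nil’ because it is ignored, and the reported
      position is a zero-based string index.

```elisp
;; `us-ascii' cannot encode the e-acute at string index 3.
(check-coding-systems-region "caf\u00e9" nil '(utf-8 us-ascii))
     ;; ⇒ ((us-ascii 3))

;; Every listed coding system can encode the text.
(check-coding-systems-region "caf\u00e9" nil '(utf-8 iso-latin-1))
     ;; ⇒ nil
```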
 
  -- Function: detect-coding-region start end &optional highest
      This function chooses a plausible coding system for decoding the
      text from START to END.  This text should be a byte sequence, i.e.,
      unibyte text or multibyte text with only ASCII and eight-bit
      characters (see Explicit Encoding).
 
      Normally this function returns a list of coding systems that could
      handle decoding the text that was scanned.  They are listed in
      order of decreasing priority.  But if HIGHEST is non-‘nil’, then
      the return value is just one coding system, the one that is highest
      in priority.
 
      If the region contains only ASCII characters except for such
      ISO-2022 control characters as ‘ESC’, the value is ‘undecided’ or
      ‘(undecided)’, or a variant specifying end-of-line conversion, if
      that can be deduced from the text.
 
      If the region contains null bytes, the value is ‘no-conversion’,
      even if the region contains text encoded in some coding system.
 
  -- Function: detect-coding-string string &optional highest
      This function is like ‘detect-coding-region’ except that it
      operates on the contents of STRING instead of bytes in the buffer.
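
      The following sketch feeds raw bytes to the detector.  No results
      are shown, because the candidates and their order depend on the
      current coding system priorities.

```elisp
;; ASCII-only input: the result is `undecided', or a variant of it
;; when the eol format can be deduced from the line endings.
(detect-coding-string "hello\r\n" t)

;; Bytes of UTF-8 encoded text, written with octal escapes so that
;; the string is unibyte.  Without HIGHEST, the value is a list of
;; candidates in order of decreasing priority.
(detect-coding-string "caf\303\251")

;; With HIGHEST non-nil, only the top candidate is returned.
(detect-coding-string "caf\303\251" t)
```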
 
  -- Variable: inhibit-null-byte-detection
      If this variable has a non-‘nil’ value, null bytes are ignored when
      detecting the encoding of a region or a string.  This allows the
      encoding of text that contains null bytes to be correctly detected,
      such as Info files with Index nodes.
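
      Since this is an ordinary variable, it can be bound temporarily
      around a detection call.  A minimal sketch, in which
      ‘my-detect-ignoring-null-bytes’ is a hypothetical helper:

```elisp
(defun my-detect-ignoring-null-bytes (bytes)
  "Return the best-guess coding system for BYTES, ignoring null bytes.
Hypothetical helper, not part of Emacs."
  ;; The binding is dynamic, so it only affects detection performed
  ;; while this `let' is active.
  (let ((inhibit-null-byte-detection t))
    (detect-coding-string bytes t)))

;; The null byte no longer forces `no-conversion'.
(my-detect-ignoring-null-bytes "a\0b")
```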
 
  -- Variable: inhibit-iso-escape-detection
      If this variable has a non-‘nil’ value, ISO-2022 escape sequences
      are ignored when detecting the encoding of a region or a string.
      The result is that no text is ever detected as encoded in some
      ISO-2022 encoding, and all escape sequences become visible in a
      buffer.  *Warning:* _Use this variable with extreme caution,
      because many files in the Emacs distribution use ISO-2022
      encoding._
 
  -- Function: coding-system-charset-list coding-system
      This function returns the list of character sets (see Character
      Sets) supported by CODING-SYSTEM.  Some coding systems that
      support too many character sets to list them all yield special
      values:
         • If CODING-SYSTEM supports all Emacs characters, the value is
           ‘(emacs)’.
         • If CODING-SYSTEM supports all Unicode characters, the value is
           ‘(unicode)’.
         • If CODING-SYSTEM supports all ISO-2022 charsets, the value is
           ‘iso-2022’.
         • If CODING-SYSTEM supports all the characters in the internal
           coding system used by Emacs version 21 (prior to the
           implementation of internal Unicode support), the value is
           ‘emacs-mule’.
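
      A few sample calls, assuming the standard coding systems
      ‘iso-latin-1’, ‘utf-8’ and ‘utf-8-emacs’ are defined as in a
      stock Emacs:

```elisp
(coding-system-charset-list 'iso-latin-1)   ; ⇒ (iso-8859-1)
(coding-system-charset-list 'utf-8)         ; ⇒ (unicode)
(coding-system-charset-list 'utf-8-emacs)   ; ⇒ (emacs)
```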
 
    See Coding systems for a subprocess in Process Information, in
 particular the descriptions of the functions ‘process-coding-system’
 and ‘set-process-coding-system’, for how to examine or set the coding
 systems used for I/O to a subprocess.