Info: (elisp) Explicit Encoding

Info Catalog
elisp: Specifying Coding Systems
elisp: Coding Systems
elisp: Terminal I/O Encoding
elisp: Explicit Encoding

 
 32.10.7 Explicit Encoding and Decoding
 --------------------------------------
 
 All the operations that transfer text in and out of Emacs have the
 ability to use a coding system to encode or decode the text.  You can
 also explicitly encode and decode text using the functions in this
 section.
 
    The result of encoding, and the input to decoding, are not ordinary
 text.  They logically consist of a series of byte values; that is, a
 series of ASCII and eight-bit characters.  In unibyte buffers and
 strings, these characters have codes in the range 0 through #xFF (255).
 In a multibyte buffer or string, eight-bit characters have character
 codes higher than #xFF (Text Representations), but Emacs
 transparently converts them to their single-byte values when you encode
 or decode such text.
 
    The usual way to read a file into a buffer as a sequence of bytes, so
 you can decode the contents explicitly, is with
 ‘insert-file-contents-literally’ (Reading from Files);
 alternatively, specify a non-‘nil’ RAWFILE argument when visiting a file
 with ‘find-file-noselect’.  These methods result in a unibyte buffer.
 
    The usual way to use the byte sequence that results from explicitly
 encoding text is to copy it to a file or process—for example, to write
 it with ‘write-region’ (Writing to Files), and suppress encoding
 by binding ‘coding-system-for-write’ to ‘no-conversion’.
 
    Here are the functions to perform explicit encoding or decoding.  The
 encoding functions produce sequences of bytes; the decoding functions
 are meant to operate on sequences of bytes.  All of these functions
 discard text properties.  They also set ‘last-coding-system-used’ to the
 precise coding system they used.
 
  -- Command: encode-coding-region start end coding-system &optional
           destination
      This command encodes the text from START to END according to coding
      system CODING-SYSTEM.  Normally, the encoded text replaces the
      original text in the buffer, but the optional argument DESTINATION
      can change that.  If DESTINATION is a buffer, the encoded text is
      inserted in that buffer after point (point does not move); if it is
      ‘t’, the command returns the encoded text as a unibyte string
      without inserting it.
 
      If encoded text is inserted in some buffer, this command returns
      the length of the encoded text.
 
      The result of encoding is logically a sequence of bytes, but the
      buffer remains multibyte if it was multibyte before, and any 8-bit
      bytes are converted to their multibyte representation (Text
      Representations).
 
      Do _not_ use ‘undecided’ for CODING-SYSTEM when encoding text,
      since that may lead to unexpected results.  Instead, use
      ‘select-safe-coding-system’ (select-safe-coding-system
      User-Chosen Coding Systems.) to suggest a suitable encoding, if
      there’s no obvious pertinent value for CODING-SYSTEM.
 
  -- Function: encode-coding-string string coding-system &optional nocopy
           buffer
      This function encodes the text in STRING according to coding system
      CODING-SYSTEM.  It returns a new string containing the encoded
      text, except when NOCOPY is non-‘nil’, in which case the function
      may return STRING itself if the encoding operation is trivial.  The
      result of encoding is a unibyte string.
 
  -- Command: decode-coding-region start end coding-system &optional
           destination
      This command decodes the text from START to END according to coding
      system CODING-SYSTEM.  To make explicit decoding useful, the text
      before decoding ought to be a sequence of byte values, but both
      multibyte and unibyte buffers are acceptable (in the multibyte
      case, the raw byte values should be represented as eight-bit
      characters).  Normally, the decoded text replaces the original text
      in the buffer, but the optional argument DESTINATION can change
      that.  If DESTINATION is a buffer, the decoded text is inserted in
      that buffer after point (point does not move); if it is ‘t’, the
      command returns the decoded text as a multibyte string without
      inserting it.
 
      If decoded text is inserted in some buffer, this command returns
      the length of the decoded text.
 
      This command puts a ‘charset’ text property on the decoded text.
      The value of the property states the character set used to decode
      the original text.
 
  -- Function: decode-coding-string string coding-system &optional nocopy
           buffer
      This function decodes the text in STRING according to
      CODING-SYSTEM.  It returns a new string containing the decoded
      text, except when NOCOPY is non-‘nil’, in which case the function
      may return STRING itself if the decoding operation is trivial.  To
      make explicit decoding useful, the contents of STRING ought to be a
      unibyte string with a sequence of byte values, but a multibyte
      string is also acceptable (assuming it contains 8-bit bytes in
      their multibyte form).
 
      If optional argument BUFFER specifies a buffer, the decoded text is
      inserted in that buffer after point (point does not move).  In this
      case, the return value is the length of the decoded text.
 
      This function puts a ‘charset’ text property on the decoded text.
      The value of the property states the character set used to decode
      the original text:
 
           (decode-coding-string "Gr\374ss Gott" 'latin-1)
                ⇒ #("Grüss Gott" 0 9 (charset iso-8859-1))
 
  -- Function: decode-coding-inserted-region from to filename &optional
           visit beg end replace
      This function decodes the text from FROM to TO as if it were being
      read from file FILENAME using ‘insert-file-contents’ using the rest
      of the arguments provided.
 
      The normal way to use this function is after reading text from a
      file without decoding, if you decide you would rather have decoded
      it.  Instead of deleting the text and reading it again, this time
      with decoding, you can call this function.
Info Catalog
elisp: Specifying Coding Systems
elisp: Coding Systems
elisp: Terminal I/O Encoding