32.10.2 Encoding and I/O
------------------------
The principal purpose of coding systems is for use in reading and
writing files. The function ‘insert-file-contents’ uses a coding system
to decode the file data, and ‘write-region’ uses one to encode the
buffer contents.
You can specify the coding system to use either explicitly (see
Specifying Coding Systems), or implicitly using a default mechanism
(see Default Coding Systems). But these methods may not completely
specify what to do. For example, they may choose a coding system such
as ‘undecided’ which leaves the character code conversion to be
determined from the data. In these cases, the I/O operation finishes
the job of choosing a coding system. Very often you will want to find
out afterwards which coding system was chosen.
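As a minimal sketch of finding out which coding system was chosen, the following hypothetical helper (the name ‘my-demo-detected-coding-system’ and the sample text are assumptions made for illustration) writes a small UTF-8 file, reads it back with ‘insert-file-contents’, and returns the value that the read stored in ‘last-coding-system-used’. The result depends on the language environment and on what detection concludes from the data.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-demo-detected-coding-system ()
  "Return the coding system used to decode a temporary UTF-8 file."
  (let ((file (make-temp-file "coding-demo")))
    (unwind-protect
        (progn
          ;; Write the sample text with an explicitly specified encoding.
          (let ((coding-system-for-write 'utf-8-unix))
            (write-region "na\u00efve caf\u00e9\n" nil file nil 'silent))
          (with-temp-buffer
            ;; No coding system is specified here, so the read itself
            ;; finishes the job of choosing one.
            (insert-file-contents file)
            ;; Copy the value right away; later I/O may overwrite it.
            last-coding-system-used))
      (delete-file file))))
```

Evaluating ‘(my-demo-detected-coding-system)’ returns a coding system symbol.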
-- Variable: buffer-file-coding-system
This buffer-local variable records the coding system used for
saving the buffer and for writing part of the buffer with
‘write-region’. If the text to be written cannot be safely encoded
using the coding system specified by this variable, these
operations select an alternative encoding by calling the function
‘select-safe-coding-system’ (see User-Chosen Coding Systems).
If selecting a different encoding requires asking the user to
specify a coding system, ‘buffer-file-coding-system’ is updated to
the newly selected coding system.
‘buffer-file-coding-system’ does _not_ affect sending text to a
subprocess.
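The effect on ‘write-region’ can be sketched as follows. The helper name ‘my-demo-encoded-size’ is hypothetical; it sets ‘buffer-file-coding-system’ in a temporary buffer, writes the buffer to a temporary file, and returns the resulting file size in bytes. It assumes that ‘coding-system-for-write’ is ‘nil’ and that the text can be safely encoded, so no alternative encoding has to be selected.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-demo-encoded-size (text coding-system)
  "Write TEXT to a temporary file using CODING-SYSTEM.
Return the size of the written file in bytes."
  (let ((file (make-temp-file "bfcs-demo")))
    (unwind-protect
        (with-temp-buffer
          (insert text)
          ;; The variable is buffer-local, so this affects only the
          ;; temporary buffer.
          (setq buffer-file-coding-system coding-system)
          (write-region (point-min) (point-max) file nil 'silent)
          (file-attribute-size (file-attributes file)))
      (delete-file file))))
```

For the one-character string "\u00e9", this sketch should report one byte with ‘iso-latin-1-unix’ and two bytes with ‘utf-8-unix’.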
-- Variable: save-buffer-coding-system
This variable specifies the coding system for saving the buffer (by
overriding ‘buffer-file-coding-system’). Note that it is not used
for ‘write-region’.
When a command to save the buffer starts out using
‘buffer-file-coding-system’ (or ‘save-buffer-coding-system’), and
that coding system cannot handle the actual text in the buffer, the
command asks the user to choose another coding system (by calling
‘select-safe-coding-system’). After that happens, the command also
updates ‘buffer-file-coding-system’ to represent the coding system
that the user specified.
-- Variable: last-coding-system-used
I/O operations for files and subprocesses set this variable to the
coding system name that was used. The explicit encoding and
decoding functions (see Explicit Encoding) set it too.
*Warning:* Since receiving subprocess output sets this variable, it
can change whenever Emacs waits; therefore, you should copy the
value shortly after the function call that stores the value you are
interested in.
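One way to follow this advice is to read the variable in the form immediately after the call, before anything can wait for subprocess output. The sketch below uses a hypothetical helper, ‘my-demo-encode-and-record’, built on the explicit encoding function ‘encode-coding-string’.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-demo-encode-and-record (string coding-system)
  "Encode STRING with CODING-SYSTEM.
Return (ENCODED . RECORDED), where RECORDED is the value of
`last-coding-system-used' captured right after the call."
  (let* ((encoded (encode-coding-string string coding-system))
         ;; Nothing runs between the call and this copy.
         (recorded last-coding-system-used))
    (cons encoded recorded)))
```

For example, ‘(my-demo-encode-and-record "\u00e9" 'utf-8)’ returns a cons whose car is the two-byte unibyte encoding of the character and whose cdr is the recorded coding system.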
The variable ‘selection-coding-system’ specifies how to encode
selections for the window system. See Window System Selections.
-- Variable: file-name-coding-system
The variable ‘file-name-coding-system’ specifies the coding system
to use for encoding file names. Emacs encodes file names using
that coding system for all file operations. If
‘file-name-coding-system’ is ‘nil’, Emacs uses a default coding
system determined by the selected language environment. In the
default language environment, any non-ASCII characters in file
names are not encoded specially; they appear in the file system
using the internal Emacs representation.
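The fallback described above can be approximated in Lisp. This sketch assumes that the default determined by the language environment is the one held in the variable ‘default-file-name-coding-system’; the helper name ‘my-demo-file-name-bytes’ is hypothetical.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-demo-file-name-bytes (name)
  "Return NAME encoded with the coding system in effect for file names.
If no coding system is in effect, return NAME unchanged."
  (let ((coding-system (or file-name-coding-system
                           ;; Assumed to hold the language
                           ;; environment's default.
                           default-file-name-coding-system)))
    (if coding-system
        (encode-coding-string name coding-system)
      name)))
```

A name that contains only ASCII characters normally comes back unchanged, since the coding systems commonly used for file names are ASCII-compatible.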
*Warning:* if you change ‘file-name-coding-system’ (or the language
environment) in the middle of an Emacs session, problems can result if
you have already visited files whose names were encoded using the
earlier coding system and are handled differently under the new coding
system. If you try to save one of these buffers under the visited file
name, saving may use the wrong file name, or it may get an error. If
such a problem happens, use ‘C-x C-w’ to specify a new file name for
that buffer.
On Windows 2000 and later, Emacs by default uses Unicode APIs to pass
file names to the OS, so the value of ‘file-name-coding-system’ is
largely ignored. Lisp applications that need to encode or decode file
names on the Lisp level should use the ‘utf-8’ coding system when
‘system-type’ is ‘windows-nt’; the conversion of UTF-8 encoded file
names to the encoding appropriate for communicating with the OS is
performed internally by Emacs.
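A Lisp program that decodes file names itself might pick the coding system along these lines. This is only a sketch: the helper name ‘my-demo-decode-file-name’ is hypothetical, and the fallback to ‘default-file-name-coding-system’ on other systems is an assumption about how the default is obtained.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-demo-decode-file-name (bytes)
  "Decode the unibyte string BYTES as a file name.
Use `utf-8' on MS-Windows and the file-name coding system elsewhere."
  (let ((coding-system
         (if (eq system-type 'windows-nt)
             'utf-8
           (or file-name-coding-system
               ;; Assumed fallback when the variable above is nil.
               default-file-name-coding-system))))
    (if coding-system
        (decode-coding-string bytes coding-system)
      bytes)))
```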