8.4 Digression into C
=====================
The ‘copy-region-as-kill’ function (see the section
‘copy-region-as-kill’) uses the ‘filter-buffer-substring’ function, which
in turn uses the ‘delete-and-extract-region’ function. That function
removes the contents of a region from the buffer and returns them; the
buffer itself no longer holds the text.
Unlike the other code discussed here, the ‘delete-and-extract-region’
function is not written in Emacs Lisp; it is written in C and is one of
the primitives of the GNU Emacs system. Since it is very simple, I will
digress briefly from Lisp and describe it here.
Like many of the other Emacs primitives, ‘delete-and-extract-region’
is written as an instance of a C macro, a macro being a template for
code. The complete macro looks like this:
     DEFUN ("delete-and-extract-region", Fdelete_and_extract_region,
            Sdelete_and_extract_region, 2, 2, 0,
            doc: /* Delete the text between START and END and return it.  */)
       (Lisp_Object start, Lisp_Object end)
     {
       validate_region (&start, &end);
       if (XINT (start) == XINT (end))
         return empty_unibyte_string;
       return del_range_1 (XINT (start), XINT (end), 1, 1);
     }
Without going into the details of the macro writing process, let me
point out that this macro starts with the word ‘DEFUN’. The word
‘DEFUN’ was chosen since the code serves the same purpose as ‘defun’
does in Lisp. (The ‘DEFUN’ C macro is defined in ‘emacs/src/lisp.h’.)
The word ‘DEFUN’ is followed by seven parts inside of parentheses:
• The first part is the name given to the function in Lisp,
‘delete-and-extract-region’.
• The second part is the name of the function in C,
‘Fdelete_and_extract_region’. By convention, it starts with ‘F’.
Since C does not use hyphens in names, underscores are used
instead.
• The third part is the name for the C constant structure that
records information on this function for internal use. It is the
name of the function in C but begins with an ‘S’ instead of an ‘F’.
• The fourth and fifth parts specify the minimum and maximum number
of arguments the function can have. This function demands exactly
2 arguments.
• The sixth part is nearly like the argument that follows the
‘interactive’ declaration in a function written in Lisp: a letter
followed, perhaps, by a prompt. The only difference from Lisp is
when the function is not meant to be called interactively. Then
you write a ‘0’ (which is a null pointer), as in this macro.
If you were to give an interactive specification, you would place it
between quotation marks. The C macro for ‘goto-char’ includes
‘"NGoto char: "’ in this position to indicate that the function
expects a numeric prefix argument, in this case, a numerical location
in a buffer, and provides a prompt to use when no prefix argument is
given.
• The seventh part is a documentation string, just like the one for a
function written in Emacs Lisp. This is written as a C comment.
(When you build Emacs, the program ‘lib-src/make-docfile’ extracts
these comments and uses them to make the documentation.)
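To make the seven parts more concrete, here is a minimal sketch of how a macro of this general shape could be written. It is an illustration only: ‘TOY_DEFUN’, ‘toy_object’, ‘struct toy_subr’, and ‘toy_call2’ are hypothetical names invented for this example, and the real ‘DEFUN’ in ‘emacs/src/lisp.h’ is more elaborate.

```c
#include <assert.h>

/* Hypothetical stand-in for Lisp_Object; here it is just an integer.  */
typedef long toy_object;

/* Hypothetical record playing the role of the "S" structure.  */
struct toy_subr
{
  toy_object (*function) (toy_object, toy_object);
  int min_args;
  int max_args;
  const char *lisp_name;
  const char *interactive_spec;  /* 0 means "not interactive" */
};

/* A simplified imitation of DEFUN, not the definition in lisp.h.
   It declares the C function, defines the record that describes it,
   and then begins the function definition, so that the parameter
   list and the body can follow the macro call.  The DOC part is
   accepted and ignored.  */
#define TOY_DEFUN(lname, fnname, sname, minargs, maxargs, intspec, doc) \
  toy_object fnname (toy_object, toy_object);                           \
  static const struct toy_subr sname =                                  \
    { fnname, minargs, maxargs, lname, intspec };                       \
  toy_object fnname

TOY_DEFUN ("toy-add", Ftoy_add, Stoy_add, 2, 2, 0,
           doc: /* Return the sum of A and B.  */)
  (toy_object a, toy_object b)
{
  return a + b;
}

/* Call a two-argument function through its record, checking the
   argument count first, roughly as an interpreter might.  */
static toy_object
toy_call2 (const struct toy_subr *subr, toy_object a, toy_object b)
{
  assert (subr->min_args <= 2 && 2 <= subr->max_args);
  return subr->function (a, b);
}
```

In this sketch, ‘Ftoy_add (2, 3)’ calls the function directly, while ‘toy_call2 (&Stoy_add, 2, 3)’ goes through the record, which is where the Lisp name and the argument limits are kept.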
In a C macro, the formal parameters come next, with a statement of
what kind of object they are, followed by the body of the macro. For
‘delete-and-extract-region’ the body consists of the following four
lines:
     validate_region (&start, &end);
     if (XINT (start) == XINT (end))
       return empty_unibyte_string;
     return del_range_1 (XINT (start), XINT (end), 1, 1);
The ‘validate_region’ function checks whether the values passed as
the beginning and end of the region are the proper type and are within
range. If the beginning and end positions are the same, the function
returns an empty string.
The ‘del_range_1’ function actually deletes the text. It is a
complex function we will not look into. It updates the buffer and does
other things. However, it is worth looking at the first two arguments
passed to ‘del_range_1’. These are ‘XINT (start)’ and ‘XINT (end)’.
As far as the C language is concerned, ‘start’ and ‘end’ are two
integers that mark the beginning and end of the region to be deleted(1).
Integer widths depend on the machine, and are typically 32 or 64
bits. A few of the bits are used to specify the type of information;
the remaining bits are used as content.
‘XINT’ is a C macro that extracts the relevant number from the longer
collection of bits; the type bits are discarded.
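The idea of type bits and content bits can be sketched in a few lines of C. The layout below is an assumption made for illustration (two low-order tag bits, with the tag value 1 standing for an integer); the names ‘toy_word’, ‘toy_make_int’, ‘toy_tag’, and ‘toy_xint’ are hypothetical, and the layout Emacs actually uses is defined in ‘lisp.h’.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the low 2 bits hold a type tag and the
   remaining bits hold the content.  */
enum { TOY_TAG_BITS = 2, TOY_TAG_MASK = 3, TOY_TAG_INT = 1 };

typedef intptr_t toy_word;

/* Pack the number N together with the integer tag.  */
static toy_word
toy_make_int (intptr_t n)
{
  return (toy_word) (((uintptr_t) n << TOY_TAG_BITS) | TOY_TAG_INT);
}

/* Return the type tag stored in the low bits of W.  */
static int
toy_tag (toy_word w)
{
  return (int) ((uintptr_t) w & TOY_TAG_MASK);
}

/* Discard the tag bits and return the number, in the spirit of XINT.
   This assumes that right-shifting a negative value is an arithmetic
   shift, as it is with the common compilers.  */
static intptr_t
toy_xint (toy_word w)
{
  assert (toy_tag (w) == TOY_TAG_INT);
  return w >> TOY_TAG_BITS;
}
```

With this layout, the number 42 is stored as the word 169 (42 shifted left by two bits, plus the tag 1), and ‘toy_xint’ recovers 42 from it.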
The command in ‘delete-and-extract-region’ looks like this:
     del_range_1 (XINT (start), XINT (end), 1, 1);
It deletes the region between the beginning position, ‘start’, and the
ending position, ‘end’.
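The overall shape of the function (validate the region, return an empty string when there is nothing to delete, otherwise remove the text and return it) can be imitated on a toy buffer. This is only a sketch under stated assumptions: ‘struct toy_buffer’ and the ‘toy_’ functions are hypothetical, the buffer is a small fixed-size string with positions counted from 1, and swapping the two positions when they arrive out of order is a choice made for this sketch. It says nothing about how ‘del_range_1’ is implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy buffer: a NUL-terminated string.  Positions are
   counted from 1, so position 1 is before the first character.  */
struct toy_buffer
{
  char text[64];
};

/* Put *START and *END in order and check that both lie within the
   buffer.  Return 1 if the region is valid and 0 otherwise.  */
static int
toy_validate_region (const struct toy_buffer *buf, long *start, long *end)
{
  long last = (long) strlen (buf->text) + 1;
  if (*start > *end)
    {
      long tmp = *start;
      *start = *end;
      *end = tmp;
    }
  return 1 <= *start && *end <= last;
}

/* Remove the text between START and END from BUF and return it as a
   newly allocated string that the caller must free.  Return NULL if
   the region is invalid or memory runs out.  */
static char *
toy_delete_and_extract_region (struct toy_buffer *buf, long start, long end)
{
  if (!toy_validate_region (buf, &start, &end))
    return NULL;
  if (start == end)
    return calloc (1, 1);        /* an empty string */

  size_t len = (size_t) (end - start);
  char *killed = malloc (len + 1);
  if (!killed)
    return NULL;
  memcpy (killed, buf->text + start - 1, len);
  killed[len] = '\0';

  /* Close the gap; the terminating NUL is moved along with the tail.  */
  char *tail = buf->text + end - 1;
  memmove (buf->text + start - 1, tail, strlen (tail) + 1);
  return killed;
}
```

For example, with a buffer holding ‘Hello, world’, extracting the region from 1 to 8 returns ‘Hello, ’ and leaves ‘world’ in the buffer.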
From the point of view of the person writing Lisp, Emacs is all very
simple; but hidden underneath is a great deal of complexity to make it
all work.
---------- Footnotes ----------
(1) More precisely, and requiring more expert knowledge to
understand, the two integers are of type ‘Lisp_Object’, which can also
be a C union instead of an integer type.