 8.4 Digression into C
 =====================
 
 The ‘copy-region-as-kill’ function (see ‘copy-region-as-kill’) uses the
 ‘filter-buffer-substring’ function, which in turn uses the
 ‘delete-and-extract-region’ function.  That last function removes the
 contents of a region and returns them; unless the caller saves the
 returned text, you cannot get them back.
 
    Unlike the other code discussed here, the ‘delete-and-extract-region’
 function is not written in Emacs Lisp; it is written in C and is one of
 the primitives of the GNU Emacs system.  Since it is very simple, I will
 digress briefly from Lisp and describe it here.
 
    Like many of the other Emacs primitives, ‘delete-and-extract-region’
 is written as an instance of a C macro, a macro being a template for
 code.  The complete macro looks like this:
 
      DEFUN ("delete-and-extract-region", Fdelete_and_extract_region,
             Sdelete_and_extract_region, 2, 2, 0,
             doc: /* Delete the text between START and END and return it.  */)
             (Lisp_Object start, Lisp_Object end)
      {
        validate_region (&start, &end);
        if (XINT (start) == XINT (end))
          return empty_unibyte_string;
        return del_range_1 (XINT (start), XINT (end), 1, 1);
      }
 
    Without going into the details of the macro writing process, let me
 point out that this macro starts with the word ‘DEFUN’.  The word
 ‘DEFUN’ was chosen since the code serves the same purpose as ‘defun’
 does in Lisp.  (The ‘DEFUN’ C macro is defined in ‘emacs/src/lisp.h’.)
 
    The word ‘DEFUN’ is followed by seven parts inside of parentheses:
 
    • The first part is the name given to the function in Lisp,
      ‘delete-and-extract-region’.
 
    • The second part is the name of the function in C,
      ‘Fdelete_and_extract_region’.  By convention, it starts with ‘F’.
      Since C does not use hyphens in names, underscores are used
      instead.
 
    • The third part is the name for the C constant structure that
      records information on this function for internal use.  It is the
      name of the function in C but begins with an ‘S’ instead of an ‘F’.
 
    • The fourth and fifth parts specify the minimum and maximum number
      of arguments the function can have.  This function demands exactly
      2 arguments.
 
    • The sixth part is nearly like the argument that follows the
      ‘interactive’ declaration in a function written in Lisp: a letter
      followed, perhaps, by a prompt.  The only difference from Lisp is
      when the function is not meant to be called interactively.  Then
      you write a ‘0’ (which is a null pointer), as in this macro.

      If you were to specify arguments, you would place them between
      quotation marks.  The C macro for ‘goto-char’ includes ‘"NGoto
      char: "’ in this position to indicate that the function expects a
      number, in this case, a numerical location in a buffer, taken from
      the numeric prefix argument or else read with the prompt.
 
    • The seventh part is a documentation string, just like the one for a
      function written in Emacs Lisp.  This is written as a C comment.
      (When you build Emacs, the program ‘lib-src/make-docfile’ extracts
      these comments and uses them to make the documentation.)
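
    To make the seven parts concrete, here is a minimal sketch of how a
 macro of this shape can turn one invocation into both a descriptive
 record and a function.  It is a toy: ‘TOY_DEFUN’, ‘struct toy_subr’, and
 ‘toy-region-length’ are hypothetical names invented for illustration,
 plain ‘int’ stands in for ‘Lisp_Object’, and the real ‘DEFUN’ in
 ‘emacs/src/lisp.h’ does considerably more.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: 'TOY_DEFUN' and 'struct toy_subr' are hypothetical
   and much simpler than the real 'DEFUN' in emacs/src/lisp.h.  */
struct toy_subr
{
  const char *lisp_name;        /* part 1: the name seen from Lisp */
  int (*function) (int, int);   /* part 2: the C function */
  int min_args;                 /* part 4: minimum number of arguments */
  int max_args;                 /* part 5: maximum number of arguments */
  const char *intspec;          /* part 6: 0 means not interactive */
};

/* Declare the C function, define the record that describes it (named
   by part 3), then open the function definition proper.  The seventh
   part, the documentation, is accepted but ignored by this toy.  */
#define TOY_DEFUN(lname, fnname, sname, minargs, maxargs, intspec, doc) \
  static int fnname (int, int);                                         \
  static const struct toy_subr sname =                                  \
    { lname, fnname, minargs, maxargs, intspec };                       \
  static int fnname

TOY_DEFUN ("toy-region-length", Ftoy_region_length,
           Stoy_region_length, 2, 2, 0,
           doc: /* Return the number of characters between START and END.  */)
  (int start, int end)
{
  return end - start;
}
```

    After the preprocessor has run, the compiler sees an ordinary
 function named ‘Ftoy_region_length’ and a constant structure named
 ‘Stoy_region_length’ that records the Lisp name and the argument counts.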
 
    In a C macro, the formal parameters come next, with a statement of
 what kind of object they are, followed by the body of the macro.  For
 ‘delete-and-extract-region’ the body consists of the following four
 lines:
 
      validate_region (&start, &end);
      if (XINT (start) == XINT (end))
        return empty_unibyte_string;
      return del_range_1 (XINT (start), XINT (end), 1, 1);
 
    The ‘validate_region’ function checks whether the values passed as
 the beginning and end of the region are the proper type and are within
 range.  If the beginning and end positions are the same, the function
 returns an empty string.
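
    The kind of checking described above can be sketched in a few
 lines.  This is a toy model, not the code in Emacs: ‘toy_validate_region’
 is a hypothetical name, plain ‘int’ values stand in for Lisp objects (so
 no type check is needed), and the function returns false where real code
 would signal an error.  Putting the two positions in order is an
 assumption of this sketch; it illustrates one reason for passing the
 addresses ‘&start’ and ‘&end’ rather than the values.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: 'toy_validate_region' is a hypothetical name.
   BUF_MIN and BUF_MAX are the smallest and largest valid positions.  */
static bool
toy_validate_region (int *start, int *end, int buf_min, int buf_max)
{
  /* Put the two positions in increasing order.  */
  if (*start > *end)
    {
      int tmp = *start;
      *start = *end;
      *end = tmp;
    }

  /* Both positions must lie inside the buffer.  */
  return buf_min <= *start && *end <= buf_max;
}
```

    Because the function receives pointers, any reordering it does is
 visible to the caller, which can then rely on ‘start’ not exceeding
 ‘end’.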
 
    The ‘del_range_1’ function actually deletes the text.  It is a
 complex function we will not look into.  It updates the buffer and does
 other things.  However, it is worth looking at the first two of the
 four arguments passed to ‘del_range_1’.  These are ‘XINT (start)’ and
 ‘XINT (end)’.
 
    As far as the C language is concerned, ‘start’ and ‘end’ are two
 integers that mark the beginning and end of the region to be deleted(1).
 
    Integer widths depend on the machine, and are typically 32 or 64
 bits.  A few of the bits are used to specify the type of information;
 the remaining bits are used as content.
 
    ‘XINT’ is a C macro that extracts the relevant number from the longer
 collection of bits; the type bits are discarded.
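
    The idea of keeping type bits and content in one machine word can be
 sketched as follows.  This is a toy model: the names, the choice of
 three type bits, and the tag value are assumptions made for
 illustration, and they are not the definitions Emacs uses.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: the names and the bit layout are illustrative
   assumptions, not the definitions in emacs/src/lisp.h.  */
typedef intptr_t toy_object;

enum { TOY_TAG_BITS = 3 };   /* the low bits hold the type */
enum { TOY_TAG_INT = 2 };    /* hypothetical type tag for integers */
#define TOY_TAG_MASK (((intptr_t) 1 << TOY_TAG_BITS) - 1)

/* Pack a number and its type tag into one machine word.  */
static toy_object
toy_make_int (intptr_t n)
{
  return (toy_object) (n * ((intptr_t) 1 << TOY_TAG_BITS) + TOY_TAG_INT);
}

/* Return the type bits of OBJ.  */
static int
toy_tag (toy_object obj)
{
  return (int) (obj & TOY_TAG_MASK);
}

/* Discard the type bits and return the number, in the spirit of 'XINT'.  */
static intptr_t
toy_xint (toy_object obj)
{
  assert (toy_tag (obj) == TOY_TAG_INT);
  return (obj - TOY_TAG_INT) / ((intptr_t) 1 << TOY_TAG_BITS);
}
```

    In this toy layout the number 42 is stored as 42 * 8 + 2 = 338, and
 ‘toy_xint’ recovers 42 from it.  Since three bits are spent on the type,
 the range of numbers that fit is smaller than the range of the machine
 word.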
 
    The command in ‘delete-and-extract-region’ looks like this:
 
      del_range_1 (XINT (start), XINT (end), 1, 1);
 
 It deletes the region between the beginning position, ‘start’, and the
 ending position, ‘end’.
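
    To give a feel for what "delete and return" means, here is a minimal
 sketch that works on an ordinary C string instead of an Emacs buffer.
 The helper ‘toy_delete_and_extract’ is hypothetical and has nothing of
 the bookkeeping that ‘del_range_1’ performs; it assumes positions that
 count from 1, with the character at ‘end’ itself left in place.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: 'toy_delete_and_extract' is a hypothetical helper
   that works on a NUL-terminated C string, not on an Emacs buffer.
   Positions count from 1 and END is exclusive.  The caller must pass
   1 <= START <= END <= strlen (TEXT) + 1 and must free the result.
   Returns NULL if memory cannot be allocated.  */
static char *
toy_delete_and_extract (char *text, size_t start, size_t end)
{
  size_t len = strlen (text);
  assert (1 <= start && start <= end && end <= len + 1);

  size_t n = end - start;
  char *removed = malloc (n + 1);
  if (removed == NULL)
    return NULL;

  /* Copy the region out so that it can be returned.  */
  memcpy (removed, text + start - 1, n);
  removed[n] = '\0';

  /* Close the gap by moving the tail, including its terminating NUL.  */
  memmove (text + start - 1, text + end - 1, len - (end - 1) + 1);
  return removed;
}
```

    Called on the text ‘hello world’ with the positions 1 and 7, the
 helper returns ‘hello ’ (with its trailing space) and leaves ‘world’
 behind.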
 
    From the point of view of the person writing Lisp, Emacs is all very
 simple; but hidden underneath is a great deal of complexity to make it
 all work.
 
    ---------- Footnotes ----------
 
    (1) More precisely, and requiring more expert knowledge to
 understand, the two integers are of type ‘Lisp_Object’, which can also
 be a C union instead of an integer type.