 4.7 Formatting Strings
 ======================
 
 “Formatting” means constructing a string by substituting computed values
 at various places in a constant string.  This constant string controls
 how the other values are printed, as well as where they appear; it is
 called a “format string”.
 
    Formatting is often useful for computing messages to be displayed.
 In fact, the functions ‘message’ and ‘error’ provide the same formatting
 feature described here; they differ from ‘format-message’ only in how
 they use the result of formatting.
 
  -- Function: format string &rest objects
      This function returns a new string that is made by copying STRING
      and then replacing any format specification in the copy with
      encodings of the corresponding OBJECTS.  The arguments OBJECTS are
      the computed values to be formatted.
 
      The characters in STRING, other than the format specifications, are
      copied directly into the output, including their text properties,
      if any.
 
  -- Function: format-message string &rest objects
      This function acts like ‘format’, except it also converts any
      curved single quotes in STRING as per the value of
      ‘text-quoting-style’, and treats grave accent (`) and apostrophe
      (') as if they were curved single quotes.
 
      A format that quotes with grave accents and apostrophes `like this'
      typically generates curved quotes ‘like this’.  In contrast, a
      format that quotes with only apostrophes 'like this' typically
      generates two closing curved quotes ’like this’, an unusual style
      in English.  See Keys in Documentation, for how the
      ‘text-quoting-style’ variable affects generated quotes.
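
    As a rough illustration of the quote conversion, the sketch below
 binds ‘text-quoting-style’ locally and formats the same string under
 each style.  The helper name ‘my-quote-demo’ is hypothetical and not
 part of Emacs; the results assume the three documented style symbols
 ‘curve’, ‘straight’ and ‘grave’.

```elisp
;; Sketch only: `my-quote-demo' is a made-up helper for illustration.
(defun my-quote-demo (style)
  "Return a sample message formatted under quoting STYLE."
  ;; `text-quoting-style' is a special variable, so this `let'
  ;; rebinds it only for the duration of the call.
  (let ((text-quoting-style style))
    (format-message "Visit `%s' first." "README")))

(my-quote-demo 'curve)     ; ⇒ "Visit ‘README’ first."
(my-quote-demo 'straight)  ; ⇒ "Visit 'README' first."
(my-quote-demo 'grave)     ; ⇒ "Visit `README' first."
```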
 
    A format specification is a sequence of characters beginning with a
 ‘%’.  Thus, if there is a ‘%d’ in STRING, the ‘format’ function replaces
 it with the printed representation of one of the values to be formatted
 (one of the arguments OBJECTS).  For example:
 
      (format "The value of fill-column is %d." fill-column)
           ⇒ "The value of fill-column is 72."
 
    Since ‘format’ interprets ‘%’ characters as format specifications,
 you should _never_ pass an arbitrary string as the first argument.  This
 is particularly true when the string is generated by some Lisp code.
 Unless the string is _known_ to never include any ‘%’ characters, pass
 ‘"%s"’, described below, as the first argument, and the string as the
 second, like this:
 
        (format "%s" ARBITRARY-STRING)
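
    The following sketch shows why this matters.  The helper name
 ‘my-safe-echo’ and the sample text are hypothetical; the point is only
 that a stray ‘%’ in the first argument is parsed as a format
 specification.

```elisp
;; Sketch only: `my-safe-echo' is a made-up helper for illustration.
(defun my-safe-echo (text)
  "Return TEXT unchanged, even if it contains percent signs."
  (format "%s" text))

(my-safe-echo "50% done")   ; ⇒ "50% done"

;; Using the same text as the format string signals an error:
;; "% d" is read as a specification that has no value to consume.
(condition-case nil
    (format "50% done")
  (error 'format-failed))   ; ⇒ format-failed
```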
 
    If STRING contains more than one format specification, the format
 specifications correspond to successive values from OBJECTS.  Thus, the
 first format specification in STRING uses the first such value, the
 second format specification uses the second such value, and so on.  Any
 extra format specifications (those for which there are no corresponding
 values) cause an error.  Any extra values to be formatted are ignored.
 
    Certain format specifications require values of particular types.  If
 you supply a value that doesn’t fit the requirements, an error is
 signaled.
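
    A short sketch of these rules, using arbitrary sample values:
 specifications consume values in order, surplus values are dropped, and
 a missing or unsuitable value signals an error.  The symbols returned
 by the ‘condition-case’ handlers are placeholders chosen for this
 example.

```elisp
;; Values are matched to specifications from left to right.
(format "%s has %d items" "cart" 3)   ; ⇒ "cart has 3 items"

;; The extra value 3 is silently ignored.
(format "%d and %d" 1 2 3)            ; ⇒ "1 and 2"

;; Too few values: an error is signaled.
(condition-case nil
    (format "%d and %d" 1)
  (error 'not-enough-values))         ; ⇒ not-enough-values

;; Wrong type: `%d' does not accept a string.
(condition-case nil
    (format "%d" "seven")
  (error 'wrong-value-type))          ; ⇒ wrong-value-type
```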
 
    Here is a table of valid format specifications:
 
 ‘%s’
      Replace the specification with the printed representation of the
      object, made without quoting (that is, using ‘princ’, not
      ‘prin1’; see Output Functions).  Thus, strings are represented
      by their contents alone, with no ‘"’ characters, and symbols appear
      without ‘\’ characters.
 
      If the object is a string, its text properties are copied into the
      output.  The text properties of the ‘%s’ itself are also copied,
      but those of the object take priority.
 
 ‘%S’
      Replace the specification with the printed representation of the
      object, made with quoting (that is, using ‘prin1’; see Output
      Functions).  Thus, strings are enclosed in ‘"’ characters, and
      ‘\’ characters appear where necessary before special characters.
 
 ‘%o’
      Replace the specification with the base-eight representation of an
      unsigned integer.
 
 ‘%d’
      Replace the specification with the base-ten representation of a
      signed integer.
 
 ‘%x’
 ‘%X’
      Replace the specification with the base-sixteen representation of
      an unsigned integer.  ‘%x’ uses lower case and ‘%X’ uses upper
      case.
 
 ‘%c’
      Replace the specification with the character which is the value
      given.
 
 ‘%e’
      Replace the specification with the exponential notation for a
      floating-point number.
 
 ‘%f’
      Replace the specification with the decimal-point notation for a
      floating-point number.
 
 ‘%g’
      Replace the specification with notation for a floating-point
      number, using either exponential notation or decimal-point
      notation.  The exponential notation is used if the exponent would
      be less than -4 or greater than or equal to the precision (default:
      6).  By default, trailing zeros are removed from the fractional
      portion of the result and a decimal-point character appears only if
      it is followed by a digit.
 
 ‘%%’
      Replace the specification with a single ‘%’.  This format
      specification is unusual in that it does not use a value.  For
      example, ‘(format "%% %d" 30)’ returns ‘"% 30"’.
 
    Any other format character results in an ‘Invalid format operation’
 error.
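
    The table above can be exercised with a handful of calls.  This is
 an illustrative sketch with arbitrary sample values; the
 floating-point results assume the usual C library ‘printf’ behavior
 with a two-digit exponent.

```elisp
;; `%s' prints like `princ', `%S' like `prin1'.
(format "%s" "hi")          ; ⇒ "hi"
(format "%S" "hi")          ; ⇒ "\"hi\""
(format "%s" 'foo\ bar)     ; ⇒ "foo bar"
(format "%S" 'foo\ bar)     ; ⇒ "foo\\ bar"

;; The same integer in base ten, eight and sixteen.
(format "%d %o %x %X" 255 255 255 255)   ; ⇒ "255 377 ff FF"

;; A character code becomes the character itself.
(format "%c" ?A)            ; ⇒ "A"

;; Floating-point notations.
(format "%e" 1234.5)        ; ⇒ "1.234500e+03"
(format "%f" 1234.5)        ; ⇒ "1234.500000"
(format "%g" 1234.5)        ; ⇒ "1234.5"
(format "%g" 0.00001234)    ; ⇒ "1.234e-05"   (exponent below -4)
(format "%g" 1234567.0)     ; ⇒ "1.23457e+06" (exponent >= precision 6)
```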
 
    Here are several examples, which assume the typical
 ‘text-quoting-style’ settings:
 
      (format "The octal value of %d is %o,
               and the hex value is %x." 18 18 18)
           ⇒ "The octal value of 18 is 22,
               and the hex value is 12."
 
      (format-message
       "The name of this buffer is ‘%s’." (buffer-name))
           ⇒ "The name of this buffer is ‘strings.texi’."
 
      (format-message
       "The buffer object prints as `%s'." (current-buffer))
           ⇒ "The buffer object prints as ‘strings.texi’."
 
    A specification can have a “width”, which is a decimal number between
 the ‘%’ and the specification character.  If the printed representation
 of the object contains fewer characters than this width, ‘format’
 extends it with padding.  The width specifier is ignored for the ‘%%’
 specification.  Any padding introduced by the width specifier normally
 consists of spaces inserted on the left:
 
      (format "%5d is padded on the left with spaces" 123)
           ⇒ "  123 is padded on the left with spaces"
 
 If the width is too small, ‘format’ does not truncate the object’s
 printed representation.  Thus, you can use a width to specify a minimum
 spacing between columns with no risk of losing information.  In the
 following two examples, ‘%7s’ specifies a minimum width of 7.  In the
 first case, the string inserted in place of ‘%7s’ has only 3 letters,
 and needs 4 blank spaces as padding.  In the second case, the string
 ‘"specification"’ is 13 letters wide but is not truncated.
 
      (format "The word '%7s' has %d letters in it."
              "foo" (length "foo"))
           ⇒ "The word '    foo' has 3 letters in it."
      (format "The word '%7s' has %d letters in it."
              "specification" (length "specification"))
           ⇒ "The word 'specification' has 13 letters in it."
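
    One common use of a minimum width is lining up columns.  The sketch
 below assumes a hypothetical helper, ‘my-format-row’, and arbitrary
 column widths of 8 and 5; oversized fields push the row wider rather
 than being cut off.

```elisp
;; Sketch only: `my-format-row' is a made-up helper for illustration.
(defun my-format-row (name count)
  "Format NAME and COUNT as one table row.
NAME is right-aligned in at least 8 columns, COUNT in at least 5."
  (format "%8s %5d" name count))

(my-format-row "apples" 12)            ; ⇒ "  apples    12"
(my-format-row "watermelons" 1234567)  ; ⇒ "watermelons 1234567"
```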
 
    Immediately after the ‘%’ and before the optional width specifier,
 you can also put certain “flag characters”.
 
    The flag ‘+’ inserts a plus sign before a positive number, so that it
 always has a sign.  A space character as flag inserts a space before a
 positive number.  (Otherwise, positive numbers start with the first
 digit.)  These flags are useful for ensuring that positive numbers and
 negative numbers use the same number of columns.  They are ignored
 except for ‘%d’, ‘%e’, ‘%f’ and ‘%g’.  If both flags are used, ‘+’
 takes precedence.
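
    For instance, with arbitrary sample values (a sketch of the sign
 flags only):

```elisp
(format "%+d" 42)     ; ⇒ "+42"
(format "% d" 42)     ; ⇒ " 42"
(format "%+d" -42)    ; ⇒ "-42"
(format "% d" -42)    ; ⇒ "-42"
(format "%+ d" 42)    ; ⇒ "+42"  (`+' wins over the space flag)
(format "%+.1f" 2.5)  ; ⇒ "+2.5"
(format "%+s" "abc")  ; ⇒ "abc"  (the flag is ignored for `%s')
```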
 
    The flag ‘#’ specifies an alternate form which depends on the format
 in use.  For ‘%o’, it ensures that the result begins with a ‘0’.  For
 ‘%x’ and ‘%X’, it prefixes the result with ‘0x’ or ‘0X’.  For ‘%e’ and
 ‘%f’, the ‘#’ flag means include a decimal point even if the precision
 is zero.  For ‘%g’, it always includes a decimal point, and also forces
 any trailing zeros after the decimal point to be left in place where
 they would otherwise be removed.
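
    A side-by-side sketch of each specification with and without the
 ‘#’ flag, using arbitrary sample values:

```elisp
(format "%o vs %#o" 8 8)          ; ⇒ "10 vs 010"
(format "%x vs %#x" 255 255)      ; ⇒ "ff vs 0xff"
(format "%X vs %#X" 255 255)      ; ⇒ "FF vs 0XFF"
(format "%.0f vs %#.0f" 3.0 3.0)  ; ⇒ "3 vs 3."
(format "%g vs %#g" 1.5 1.5)      ; ⇒ "1.5 vs 1.50000"
```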
 
    The flag ‘0’ ensures that the padding consists of ‘0’ characters
 instead of spaces.  This flag is ignored for non-numerical specification
 characters like ‘%s’, ‘%S’ and ‘%c’.  These specification characters
 accept the ‘0’ flag, but still pad with _spaces_.
 
    The flag ‘-’ causes the padding inserted by the width specifier, if
 any, to be inserted on the right rather than the left.  If both ‘-’ and
 ‘0’ are present, the ‘0’ flag is ignored.
 
      (format "%06d is padded on the left with zeros" 123)
           ⇒ "000123 is padded on the left with zeros"
 
      (format "'%-6d' is padded on the right" 123)
           ⇒ "'123   ' is padded on the right"
 
      (format "The word '%-7s' actually has %d letters in it."
              "foo" (length "foo"))
           ⇒ "The word 'foo    ' actually has 3 letters in it."
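
    A few further cases, sketched with arbitrary values, show how the
 ‘0’ and ‘-’ flags interact with each other and with non-numeric
 specifications.  The trailing ‘|’ is only there to make the padding
 visible.

```elisp
(format "%05d" 42)         ; ⇒ "00042"
(format "%05d" -42)        ; ⇒ "-0042"  (zeros go after the sign)
(format "%05s" "ab")       ; ⇒ "   ab"  (spaces, not zeros)
(format "%-05d|" 42)       ; ⇒ "42   |" (`-' overrides `0')
(format "%08.2f" 3.14159)  ; ⇒ "00003.14"
```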
 
    All the specification characters allow an optional “precision” before
 the character (after the width, if present).  The precision is a
 decimal-point ‘.’ followed by a digit-string.  For the floating-point
 specifications (‘%e’ and ‘%f’), the precision specifies how many digits
 following the decimal point to show; if zero, the decimal-point itself
 is also omitted.  For ‘%g’, the precision specifies how many significant
 digits to show (significant digits are the first digit before the
 decimal point and all the digits after it).  If the precision of ‘%g’ is
 zero, it is treated as 1; if unspecified, it defaults to 6.  For ‘%s’
 and ‘%S’, the
 precision truncates the string to the given width, so ‘%.3s’ shows only
 the first three characters of the representation for OBJECT.  For other
 specification characters, the effect of precision is what the local
 library functions of the ‘printf’ family produce.
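
    The effect of the precision on the most common specifications can
 be sketched as follows, with arbitrary sample values.  The trailing
 ‘|’ in the last call only marks the end of the field.

```elisp
;; Digits after the decimal point for `%f' and `%e'.
(format "%.2f" 3.14159)     ; ⇒ "3.14"
(format "%.0f" 3.7)         ; ⇒ "4"    (no decimal point at all)
(format "%.3e" 12345.678)   ; ⇒ "1.235e+04"

;; Significant digits for `%g'.
(format "%.3g" 3.14159)     ; ⇒ "3.14"

;; Truncation for `%s'.
(format "%.3s" "abcdef")    ; ⇒ "abc"

;; Width and precision combined: 8 columns, 3 decimals.
(format "%8.3f|" 3.14159)   ; ⇒ "   3.142|"
```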