4.7 Formatting Strings
======================
“Formatting” means constructing a string by substituting computed values
at various places in a constant string. This constant string controls
how the other values are printed, as well as where they appear; it is
called a “format string”.
Formatting is often useful for computing messages to be displayed.
In fact, the functions ‘message’ and ‘error’ provide the same formatting
feature described here; they differ from ‘format-message’ only in how
they use the result of formatting.
-- Function: format string &rest objects
This function returns a new string that is made by copying STRING
and then replacing any format specification in the copy with
encodings of the corresponding OBJECTS. The arguments OBJECTS are
the computed values to be formatted.
The characters in STRING, other than the format specifications, are
copied directly into the output, including their text properties,
if any.
-- Function: format-message string &rest objects
This function acts like ‘format’, except it also converts any
curved single quotes in STRING as per the value of
‘text-quoting-style’, and treats grave accent (`) and apostrophe
(') as if they were curved single quotes.
A format that quotes with grave accents and apostrophes `like this'
typically generates curved quotes ‘like this’. In contrast, a
format that quotes with only apostrophes 'like this' typically
generates two closing curved quotes ’like this’, an unusual style
in English. See Keys in Documentation, for how the
‘text-quoting-style’ variable affects generated quotes.
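As an illustration of the conversion described above, the following sketch binds ‘text-quoting-style’ explicitly so that the result does not depend on the user's settings. The helper name ‘my-quote-demo’ and the sample file name are hypothetical, chosen only for this example.

```elisp
;; Hypothetical helper: format one template under a given quoting style.
(defun my-quote-demo (style)
  "Return a sample message with `text-quoting-style' bound to STYLE."
  (let ((text-quoting-style style))
    (format-message "Visit `%s' first." "foo.txt")))

(my-quote-demo 'curve)    ; ⇒ "Visit ‘foo.txt’ first."
(my-quote-demo 'straight) ; ⇒ "Visit 'foo.txt' first."
(my-quote-demo 'grave)    ; ⇒ "Visit `foo.txt' first."
```

Only the quotes in the format string are converted; quote characters inside the formatted values are left alone.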
A format specification is a sequence of characters beginning with a
‘%’. Thus, if there is a ‘%d’ in STRING, the ‘format’ function replaces
it with the printed representation of one of the values to be formatted
(one of the arguments OBJECTS). For example:
(format "The value of fill-column is %d." fill-column)
⇒ "The value of fill-column is 72."
Since ‘format’ interprets ‘%’ characters as format specifications,
you should _never_ pass an arbitrary string as the first argument. This
is particularly true when the string is generated by some Lisp code.
Unless the string is _known_ to never include any ‘%’ characters, pass
‘"%s"’, described below, as the first argument, and the string as the
second, like this:
(format "%s" ARBITRARY-STRING)
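The difference can be seen in a small sketch. The variable ‘my-arbitrary-string’ is a hypothetical stand-in for text that comes from outside the program, and ‘condition-case’ is used only so that the failing call returns a symbol instead of signaling.

```elisp
;; Hypothetical stand-in for a string obtained from elsewhere.
(defvar my-arbitrary-string "50% done")

;; Unsafe: "% d" is parsed as a format specification, and there is
;; no value for it, so `format' signals an error.
(condition-case nil
    (format my-arbitrary-string)
  (error 'format-error))
;; ⇒ format-error

;; Safe: the string is passed as a value, so its ‘%’ is copied as is.
(format "%s" my-arbitrary-string)
;; ⇒ "50% done"
```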
If STRING contains more than one format specification, the format
specifications correspond to successive values from OBJECTS. Thus, the
first format specification in STRING uses the first such value, the
second format specification uses the second such value, and so on. Any
extra format specifications (those for which there are no corresponding
values) cause an error. Any extra values to be formatted are ignored.
Certain format specifications require values of particular types. If
you supply a value that doesn’t fit the requirements, an error is
signaled.
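The following sketch illustrates these three rules. The symbols returned by the ‘condition-case’ handlers (‘too-few’ and ‘wrong-type’) are made up for this example; they only mark that an error was signaled.

```elisp
;; An extra value is ignored.
(format "%d and %d" 1 2 3)
;; ⇒ "1 and 2"

;; A specification without a corresponding value signals an error.
(condition-case nil
    (format "%d and %d" 1)
  (error 'too-few))
;; ⇒ too-few

;; ‘%d’ requires a number, so a string value signals an error.
(condition-case nil
    (format "%d" "seven")
  (error 'wrong-type))
;; ⇒ wrong-type
```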
Here is a table of valid format specifications:
‘%s’
Replace the specification with the printed representation of the
object, made without quoting (that is, using ‘princ’, not
‘prin1’; see Output Functions). Thus, strings are represented
by their contents alone, with no ‘"’ characters, and symbols appear
without ‘\’ characters.
If the object is a string, its text properties are copied into the
output. The text properties of the ‘%s’ itself are also copied,
but those of the object take priority.
‘%S’
Replace the specification with the printed representation of the
object, made with quoting (that is, using ‘prin1’; see Output
Functions). Thus, strings are enclosed in ‘"’ characters, and
‘\’ characters appear where necessary before special characters.
‘%o’
Replace the specification with the base-eight representation of an
unsigned integer.
‘%d’
Replace the specification with the base-ten representation of a
signed integer.
‘%x’
‘%X’
Replace the specification with the base-sixteen representation of
an unsigned integer. ‘%x’ uses lower case and ‘%X’ uses upper
case.
‘%c’
Replace the specification with the character which is the value
given.
‘%e’
Replace the specification with the exponential notation for a
floating-point number.
‘%f’
Replace the specification with the decimal-point notation for a
floating-point number.
‘%g’
Replace the specification with notation for a floating-point
number, using either exponential notation or decimal-point
notation. The exponential notation is used if the exponent would
be less than -4 or greater than or equal to the precision (default:
6). By default, trailing zeros are removed from the fractional
portion of the result and a decimal-point character appears only if
it is followed by a digit.
‘%%’
Replace the specification with a single ‘%’. This format
specification is unusual in that it does not use a value. For
example, ‘(format "%% %d" 30)’ returns ‘"% 30"’.
Any other format character results in an ‘Invalid format operation’
error.
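As a quick reference, here is a sketch that exercises each specification in the table once. The sample values are arbitrary, and the symbol ‘invalid’ returned by the last form is made up for this example.

```elisp
(format "%s and %S" "hi" "hi")   ; ⇒ "hi and \"hi\""
(format "%o" 8)                  ; ⇒ "10"
(format "%d" -42)                ; ⇒ "-42"
(format "%x %X" 255 255)         ; ⇒ "ff FF"
(format "%c" ?A)                 ; ⇒ "A"
(format "%e" 1234.5)             ; ⇒ "1.234500e+03"
(format "%f" 1234.5)             ; ⇒ "1234.500000"
(format "%g" 1234.5)             ; ⇒ "1234.5"
(format "%g" 0.00001234)         ; ⇒ "1.234e-05"
(format "%d%%" 30)               ; ⇒ "30%"

;; ‘%y’ is not a valid specification, so this signals an error.
(condition-case nil
    (format "%y" 1)
  (error 'invalid))
;; ⇒ invalid
```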
Here are several examples, which assume the typical
‘text-quoting-style’ settings:
(format "The octal value of %d is %o,
and the hex value is %x." 18 18 18)
⇒ "The octal value of 18 is 22,
and the hex value is 12."
(format-message
"The name of this buffer is ‘%s’." (buffer-name))
⇒ "The name of this buffer is ‘strings.texi’."
(format-message
"The buffer object prints as `%s'." (current-buffer))
⇒ "The buffer object prints as ‘strings.texi’."
A specification can have a “width”, which is a decimal number between
the ‘%’ and the specification character. If the printed representation
of the object contains fewer characters than this width, ‘format’
extends it with padding. The width specifier is ignored for the ‘%%’
specification. Any padding introduced by the width specifier normally
consists of spaces inserted on the left:
(format "%5d is padded on the left with spaces" 123)
⇒ "  123 is padded on the left with spaces"
If the width is too small, ‘format’ does not truncate the object’s
printed representation. Thus, you can use a width to specify a minimum
spacing between columns with no risk of losing information. In the
following two examples, ‘%7s’ specifies a minimum width of 7. In the
first case, the string inserted in place of ‘%7s’ has only 3 letters,
and needs 4 blank spaces as padding. In the second case, the string
‘"specification"’ is 13 letters wide but is not truncated.
(format "The word '%7s' has %d letters in it."
"foo" (length "foo"))
⇒ "The word '    foo' has 3 letters in it."
(format "The word '%7s' has %d letters in it."
"specification" (length "specification"))
⇒ "The word 'specification' has 13 letters in it."
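The use of widths for column spacing can be sketched as follows. The helper ‘my-format-row’ and the column widths of 8 and 5 are hypothetical choices for this example. Note that the long name in the second call is not truncated; it simply pushes the second column to the right.

```elisp
;; Hypothetical helper: one row of a two-column report.
(defun my-format-row (name count)
  "Return NAME and COUNT as a row with minimum column widths 8 and 5."
  (format "%8s %5d" name count))

(my-format-row "apple" 3)       ; ⇒ "   apple     3"
(my-format-row "watermelon" 12) ; ⇒ "watermelon    12"
```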
Immediately after the ‘%’ and before the optional width specifier,
you can also put certain “flag characters”.
The flag ‘+’ inserts a plus sign before a positive number, so that it
always has a sign. A space character as flag inserts a space before a
positive number. (Otherwise, positive numbers start with the first
digit.) These flags are useful for ensuring that positive numbers and
negative numbers use the same number of columns. They are ignored
except for ‘%d’, ‘%e’, ‘%f’, ‘%g’, and if both flags are used, ‘+’ takes
precedence.
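For instance, the following sketch shows the two sign flags applied to positive and negative values; the sample values are arbitrary.

```elisp
(format "%+d" 5)      ; ⇒ "+5"
(format "%+d" -5)     ; ⇒ "-5"
(format "% d" 5)      ; ⇒ " 5"
(format "% d" -5)     ; ⇒ "-5"
(format "%+.1f" 2.5)  ; ⇒ "+2.5"

;; With both flags, ‘+’ takes precedence over the space.
(format "%+ d" 5)     ; ⇒ "+5"
```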
The flag ‘#’ specifies an alternate form which depends on the format
in use. For ‘%o’, it ensures that the result begins with a ‘0’. For
‘%x’ and ‘%X’, it prefixes the result with ‘0x’ or ‘0X’. For ‘%e’ and
‘%f’, the ‘#’ flag means include a decimal point even if the precision
is zero. For ‘%g’, it always includes a decimal point, and also forces
any trailing zeros after the decimal point to be left in place where
they would otherwise be removed.
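The following sketch compares the alternate form with the ordinary form for each of these specifications; the sample values are arbitrary.

```elisp
(format "%o %#o" 8 8)          ; ⇒ "10 010"
(format "%x %#x" 255 255)      ; ⇒ "ff 0xff"
(format "%X %#X" 255 255)      ; ⇒ "FF 0XFF"

;; With a precision of zero, ‘#’ keeps the decimal point.
(format "%.0f %#.0f" 3.0 3.0)  ; ⇒ "3 3."
(format "%.0e %#.0e" 3.0 3.0)  ; ⇒ "3e+00 3.e+00"

;; For ‘%g’, ‘#’ keeps trailing zeros that would otherwise be removed.
(format "%g %#g" 1.5 1.5)      ; ⇒ "1.5 1.50000"
```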
The flag ‘0’ ensures that the padding consists of ‘0’ characters
instead of spaces. This flag is ignored for non-numerical specification
characters like ‘%s’, ‘%S’ and ‘%c’. These specification characters
accept the ‘0’ flag, but still pad with _spaces_.
The flag ‘-’ causes the padding inserted by the width specifier, if
any, to be inserted on the right rather than the left. If both ‘-’ and
‘0’ are present, the ‘0’ flag is ignored.
(format "%06d is padded on the left with zeros" 123)
⇒ "000123 is padded on the left with zeros"
(format "'%-6d' is padded on the right" 123)
⇒ "'123   ' is padded on the right"
(format "The word '%-7s' actually has %d letters in it."
"foo" (length "foo"))
⇒ "The word 'foo    ' actually has 3 letters in it."
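A few further cases, given here as a sketch with arbitrary sample values, illustrate how the ‘0’ and ‘-’ flags interact with each other, with signs, and with non-numerical specifications:

```elisp
;; Zero padding goes after the sign.
(format "%05d" -42)    ; ⇒ "-0042"
(format "%05.1f" 2.5)  ; ⇒ "002.5"

;; With both ‘-’ and ‘0’, the ‘0’ flag is ignored.
(format "%-06d" 123)   ; ⇒ "123   "

;; ‘%s’ accepts the ‘0’ flag but still pads with spaces.
(format "%05s" "ab")   ; ⇒ "   ab"
```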
All the specification characters allow an optional “precision” before
the character (after the width, if present). The precision is a
decimal-point ‘.’ followed by a digit-string. For the floating-point
specifications (‘%e’ and ‘%f’), the precision specifies how many digits
following the decimal point to show; if zero, the decimal-point itself
is also omitted. For ‘%g’, the precision specifies how many significant
digits to show (significant digits are the first digit before the
decimal point and all the digits after it). If the precision of ‘%g’ is
zero, it is treated as 1; if it is unspecified, it defaults to 6. For
‘%s’ and ‘%S’, the
precision truncates the string to the given width, so ‘%.3s’ shows only
the first three characters of the representation for OBJECT. For other
specification characters, the effect of precision is what the local
library functions of the ‘printf’ family produce.
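The following sketch shows the effect of a precision on the floating-point and string specifications, alone and combined with a width; the sample values are arbitrary.

```elisp
(format "%.2f" 3.14159)    ; ⇒ "3.14"
(format "%.0f" 3.14159)    ; ⇒ "3"
(format "%.3e" 12345.678)  ; ⇒ "1.235e+04"
(format "%.3g" 3.14159)    ; ⇒ "3.14"

;; A precision of zero for ‘%g’ is treated as 1.
(format "%.0g" 1234.5)     ; ⇒ "1e+03"

;; For ‘%s’, the precision truncates the string.
(format "%.3s" "abcdef")   ; ⇒ "abc"

;; The width comes first, then the precision.
(format "%8.3f" 3.14159)   ; ⇒ "   3.142"
```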