Info: (ed) Regular expressions

Info Catalog
ed: Line addressing
ed: Top
ed: Commands
ed: Regular expressions

 
 5 Regular expressions
 *********************
 
 Regular expressions are patterns used in selecting text. For example,
 the 'ed' command
 
      g/STRING/
 
 prints all lines containing STRING. Regular expressions are also used
 by the 's' command for selecting old text to be replaced with new text.
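
 As a rough illustration only, the two uses above can be imitated in
 Python. This is a sketch, not 'ed' itself: Python's 're' module uses a
 different pattern syntax from the one described in this chapter, and
 'global_print' and 'substitute_first' are hypothetical helper names
 invented for the example.

```python
import re

def global_print(lines, pattern):
    """Imitate g/PATTERN/ by returning every line that contains a match."""
    return [line for line in lines if re.search(pattern, line)]

def substitute_first(line, pattern, replacement):
    """Imitate s/PATTERN/REPLACEMENT/ by replacing only the first match."""
    return re.sub(pattern, replacement, line, count=1)

buffer = ["alpha beta", "gamma", "beta beta"]
selected = global_print(buffer, "beta")
changed = substitute_first("beta beta", "beta", "delta")
```

 Here 'selected' holds the first and third lines, and 'changed' has only
 its first 'beta' replaced.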
 
    In addition to specifying string literals, regular expressions can
 represent classes of strings. Strings thus represented are said to be
 matched by the corresponding regular expression. If it is possible for a
 regular expression to match several strings in a line, then the
 left-most longest match is the one selected.
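
 The selection rule can be seen in a small Python sketch. Python's 're'
 engine is a greedy backtracking matcher rather than a POSIX left-most
 longest matcher, so it is assumed here only because the two agree for
 a simple pattern such as this one; they can differ for other patterns.

```python
import re

line = "ab abcdef"
# Both "ab" (offset 0) and "abcdef" (offset 3) are candidates for a[a-z]*.
# The left-most candidate is selected even though a longer one comes later.
leftmost = re.search(r"a[a-z]*", line)

# From a given starting point the longest candidate is taken:
# the whole of "abcdef" rather than just "a" or "ab".
longest = re.search(r"a[a-z]*", "abcdef")
```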
 
    The following symbols are used in constructing regular expressions:
 
 'C'
      Any character C not listed below, including '{', '}', '(', ')',
      '<' and '>', matches itself.
 
 '\C'
      Any backslash-escaped character C, other than '{', '}', '(', ')',
      '<', '>', 'b', 'B', 'w', 'W', '+' and '?', matches itself.
 
 '.'
      Matches any single character.
 
 '[CHAR-CLASS]'
      Matches any single character in CHAR-CLASS. To include a ']' in
      CHAR-CLASS, it must be the first character. A range of characters
      may be specified by separating the end characters of the range
      with a '-', e.g., 'a-z' specifies the lower case characters. The
      following literal expressions can also be used in CHAR-CLASS to
      specify sets of characters:
 
           [:alnum:] [:cntrl:] [:lower:] [:space:]
           [:alpha:] [:digit:] [:print:] [:upper:]
           [:blank:] [:graph:] [:punct:] [:xdigit:]
 
      If '-' appears as the first or last character of CHAR-CLASS, then
      it matches itself. All other characters in CHAR-CLASS match
      themselves.
 
      Patterns in CHAR-CLASS of the form:
           [.COL-ELM.]
           [=COL-ELM=]
 
      where COL-ELM is a "collating element", are interpreted according
      to 'locale (5)'. See 'regex (3)' for an explanation of these
      constructs.
 
 '[^CHAR-CLASS]'
      Matches any single character, other than newline, not in
      CHAR-CLASS.  CHAR-CLASS is defined as above.
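
      The placement rules for ']', '-' and '^' can be tried out with the
      following Python sketch. Python's 're' module happens to share
      this part of the bracket syntax, but it does not support the
      named sets such as [:alpha:] or collating elements, so those are
      not shown.

```python
import re

# A ']' placed first in the class is an ordinary member of the class.
bracket_or_a = re.compile(r"[]a]")

# 'a-z' is a range; a '-' placed last matches itself.
lower_or_dash = re.compile(r"[a-z-]")

# A leading '^' negates the class.
not_digit = re.compile(r"[^0-9]")
```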
 
 '^'
      If '^' is the first character of a regular expression, then it
      anchors the regular expression to the beginning of a line.
      Otherwise, it matches itself.
 
 '$'
      If '$' is the last character of a regular expression, it anchors
      the regular expression to the end of a line. Otherwise, it matches
      itself.
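
      A short Python sketch of both anchors follows. Note one
      difference, stated here as a caveat of the illustration: Python
      treats '^' and '$' as special wherever they appear, so the 'ed'
      pattern 'a^b$c', in which both characters match themselves, has
      to be written with escapes.

```python
import re

# '^' first and '$' last act as anchors.
starts_with_ab = re.compile(r"^ab")
ends_with_ab = re.compile(r"ab$")

# Python equivalent of the ed pattern 'a^b$c', where '^' and '$' are
# ordinary characters because they are not first or last.
literal = re.compile(r"a\^b\$c")
```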
 
 '\(RE\)'
      Defines a (possibly null) subexpression RE. Subexpressions may be
      nested. A subsequent backreference of the form '\N', where N is a
      number in the range [1,9], expands to the text matched by the Nth
      subexpression. For example, the regular expression '\(a.c\)\1'
      matches the string 'abcabc', but not 'abcadc'. Subexpressions are
      ordered relative to their left delimiter.
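
      The backreference example and the numbering rule can be checked
      with a Python sketch. Python writes subexpressions with bare
      parentheses, so '\(a.c\)\1' becomes '(a.c)\1'; the behaviour
      shown is otherwise assumed to correspond.

```python
import re

# ed's '\(a.c\)\1' in Python syntax.
repeated = re.compile(r"(a.c)\1")
matches_abcabc = repeated.fullmatch("abcabc") is not None
matches_abcadc = repeated.search("abcadc") is not None

# Nested subexpressions are numbered by the position of their opening
# delimiter: the outer group is 1, then the inner groups in order.
nested = re.fullmatch(r"((a)(b))", "ab")
```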
 
 '*'
      Matches the single character regular expression or subexpression
      immediately preceding it zero or more times. If '*' is the first
      character of a regular expression or subexpression, then it matches
      itself. The '*' operator sometimes yields unexpected results. For
      example, the regular expression 'b*' matches the beginning of the
      string 'abbb', as opposed to the substring 'bbb', since a null
      match is the only left-most match.
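
      The null match described above can be observed in Python, whose
      '*' operator behaves the same way in this case. This is a sketch
      for illustration, not 'ed' output.

```python
import re

# 'b*' succeeds at offset 0 by matching zero characters, so the search
# never reaches the run of b's further along the string.
null_match = re.search(r"b*", "abbb")

# Requiring at least one 'b' moves the match to the substring "bbb".
at_least_one = re.search(r"bb*", "abbb")
```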
 
 '\{N,M\}'
 '\{N,\}'
 '\{N\}'
      Matches the single character regular expression or subexpression
      immediately preceding it at least N and at most M times. If M is
      omitted, then it matches at least N times. If the comma is also
      omitted, then it matches exactly N times. If any of these forms
      occurs first in a regular expression or subexpression, then it is
      interpreted literally (i.e., the regular expression '\{2\}'
      matches the string '{2}', and so on).
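
      The three forms can be compared in a Python sketch. Python writes
      the braces without backslashes, so each line notes the 'ed'
      pattern it is assumed to stand for.

```python
import re

between = re.search(r"a{2,3}", "aaaa")      # ed: a\{2,3\}
at_least = re.search(r"a{2,}", "aaaa")      # ed: a\{2,\}
exactly_two = re.fullmatch(r"a{2}", "aa")   # ed: a\{2\}
not_three = re.fullmatch(r"a{2}", "aaa")    # ed: a\{2\}
```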
 
 '\<'
 '\>'
      Anchors the single character regular expression or subexpression
      immediately following it to the beginning (in the case of '\<') or
      ending (in the case of '\>') of a "word", i.e., in ASCII, a
      maximal string of alphanumeric characters, including the
      underscore (_).
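
      Python's 're' module has no '\<' or '\>', so the sketch below
      approximates them with a boundary plus a lookaround. The names
      WORD_START and WORD_END are hypothetical, and the flag re.ASCII
      is used so that "word" means letters, digits and underscore, as
      described above.

```python
import re

WORD_START = r"\b(?=\w)"   # stands in for ed's '\<'
WORD_END = r"(?<=\w)\b"    # stands in for ed's '\>'

# Roughly ed's '\<cat\>': "cat" only when it is a whole word.
whole_word = re.compile(WORD_START + "cat" + WORD_END, re.ASCII)

found_alone = whole_word.search("a cat sat") is not None
found_in_concat = whole_word.search("concat") is not None
found_in_cat_food = whole_word.search("cat_food") is not None
```

      The last case fails because the underscore counts as part of the
      word, so 'cat' is not followed by the end of a word.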
 
 
    The following extended operators are preceded by a backslash '\' to
 distinguish them from traditional 'ed' syntax.
 
 '\`'
 '\''
      Unconditionally matches the beginning ('\`') or ending ('\'') of a
      line.
 
 '\?'
      Optionally matches the single character regular expression or
      subexpression immediately preceding it. For example, the regular
      expression 'a[bd]\?c' matches the strings 'abc', 'adc' and 'ac'.
      If '\?' occurs at the beginning of a regular expression or
      subexpression, then it matches a literal '?'.
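
      The example above can be checked with a Python sketch, in which
      the operator is written '?' without the backslash.

```python
import re

optional = re.compile(r"a[bd]?c")   # ed: a[bd]\?c

candidates = ["abc", "adc", "ac", "abdc"]
accepted = [s for s in candidates if optional.fullmatch(s)]
```

      The string 'abdc' is rejected because the bracket expression may
      be used at most once.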
 
 '\+'
      Matches the single character regular expression or subexpression
      immediately preceding it one or more times. So the regular
      expression 'a\+' is shorthand for 'aa*'. If '\+' occurs at the
      beginning of a regular expression or subexpression, then it
      matches a literal '+'.
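
      The equivalence with 'aa*' can be spot-checked in Python, where
      the operator is written '+' without the backslash. The helper
      'span_or_none' is a hypothetical name used only in this sketch.

```python
import re

one_or_more = re.compile(r"a+")   # ed: a\+
longhand = re.compile(r"aa*")     # ed: aa*

def span_or_none(regex, text):
    """Return the (start, end) of the first match, or None if no match."""
    m = regex.search(text)
    return m.span() if m else None

samples = ["", "a", "aaa", "baab", "xyz"]
agree = all(
    span_or_none(one_or_more, s) == span_or_none(longhand, s)
    for s in samples
)
```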
 
 '\b'
      Matches the beginning or ending (null string) of a word. Thus the
      regular expression '\bhello\b' is equivalent to '\<hello\>'.
      However, '\b\b' is a valid regular expression whereas '\<\>' is
      not.
 
 '\B'
      Matches (a null string) inside a word.
 
 '\w'
      Matches any character in a word.
 
 '\W'
      Matches any character not in a word.
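
      Python's 're' module provides operators with the same names, so
      the last four entries can be tried out directly. This is a sketch
      under the assumption that the two implementations agree for these
      simple cases; re.ASCII limits word characters to letters, digits
      and underscore.

```python
import re

text = "say hello, othello"

# '\b' on both sides finds "hello" only where it is a whole word.
whole = [m.start() for m in re.finditer(r"\bhello\b", text, re.ASCII)]

# A leading '\B' requires the match to begin inside a word.
inside = [m.start() for m in re.finditer(r"\Bhello", text, re.ASCII)]

# '\w' and '\W' divide characters into word and non-word characters.
word_chars = re.findall(r"\w", "a_1 !", re.ASCII)
other_chars = re.findall(r"\W", "a_1 !", re.ASCII)
```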