emacs: Lax Search

 
 15.9 Lax Matching During Searching
 ==================================
 
 Normally, you’d want search commands to disregard certain minor
 differences between the search string you type and the text being
 searched.  For example, sequences of whitespace characters of different
 length are usually perceived as equivalent; letter-case differences
 usually don’t matter; etc.  This is known as “character equivalence”.
 
    This section describes the Emacs lax search features, and how to
 tailor them to your needs.
 
    By default, search commands perform “lax space matching”: each space,
 or sequence of spaces, matches any sequence of one or more whitespace
 characters in the text.  (Incremental regexp search has a separate
 default; see Regexp Search.)  Hence, ‘foo bar’ matches ‘foo
 bar’, ‘foo  bar’, ‘foo   bar’, and so on (but not ‘foobar’).  More
 precisely, Emacs matches each sequence of space characters in the search
 string to a regular expression specified by the variable
 ‘search-whitespace-regexp’.  For example, to make spaces match sequences
 of newlines as well as spaces, set it to ‘"[[:space:]\n]+"’.  The
 default value of this variable depends on the buffer’s major mode; most
 major modes classify spaces, tabs, and formfeed characters as
 whitespace.
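
    As a minimal init-file sketch of the setting just described, the
 following uses the regexp quoted above so that a run of spaces in the
 search string also matches newlines.  Putting it in your init file is
 an assumption about where you keep such settings.

```elisp
;; Let a space (or run of spaces) typed in a search string match any
;; run of whitespace characters, including newlines.
(setq search-whitespace-regexp "[[:space:]\n]+")
```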
 
    If you want whitespace characters to match exactly, you can turn lax
 space matching off by typing ‘M-s <SPC>’
 (‘isearch-toggle-lax-whitespace’) within an incremental search.  Another
 ‘M-s <SPC>’ turns lax space matching back on.  To disable lax whitespace
 matching for all searches, change ‘search-whitespace-regexp’ to ‘nil’;
 then each space in the search string matches exactly one space.
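
    A sketch of the global setting mentioned above, assuming it goes in
 your init file:

```elisp
;; Turn off lax space matching for all searches: each space in the
;; search string then matches exactly one space character.
(setq search-whitespace-regexp nil)
```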
 
    Searches in Emacs by default ignore the case of the text they are
 searching through, if you specify the search string in lower case.
 Thus, if you search for ‘foo’, then ‘foo’, ‘Foo’, and ‘FOO’ all
 match.  Regexps, and in particular character sets, behave likewise:
 ‘[ab]’ matches ‘a’ or ‘A’ or ‘b’ or ‘B’.  This feature is known as “case
 folding”, and it is supported in both incremental and non-incremental
 search modes.
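
    To see case folding applied to a character set outside of a search
 command, you can evaluate forms like the following (for example with
 ‘M-:’).  This is only an illustration: it uses the Lisp function
 ‘string-match-p’, which honors ‘case-fold-search’, rather than one of
 the interactive search commands.

```elisp
;; With case folding on, the character set also matches upper case.
(let ((case-fold-search t))
  (string-match-p "[ab]" "A"))    ; non-nil: a match is found

;; With case folding off, letters must match exactly.
(let ((case-fold-search nil))
  (string-match-p "[ab]" "A"))    ; nil: no match
```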
 
    An upper-case letter anywhere in the search string makes the search
 case-sensitive.  Thus, searching for ‘Foo’ does not find ‘foo’ or ‘FOO’.
 This applies to regular expression search as well as to literal string
 search.  The effect ceases if you delete the upper-case letter from the
 search string.  The variable ‘search-upper-case’ controls this: if it is
 non-‘nil’ (the default), an upper-case character in the search string
 makes the search case-sensitive; setting it to ‘nil’ disables this effect
 of upper-case characters.
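
    A short sketch of the second case, assuming the setting is placed
 in your init file:

```elisp
;; Upper-case letters in the search string no longer make the search
;; case-sensitive; case sensitivity then depends on `case-fold-search'.
(setq search-upper-case nil)
```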
 
    If you set the variable ‘case-fold-search’ to ‘nil’, then all letters
 must match exactly, including case.  This is a per-buffer variable;
 altering the variable normally affects only the current buffer, unless
 you change its default value.  See Locals.  This variable applies to
 nonincremental searches also, including those performed by the replace
 commands (see Replace) and the minibuffer history matching commands
 (see Minibuffer History).
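
    The difference between changing the default value and changing the
 value in one buffer can be sketched as follows.  The choice of
 ‘text-mode-hook’ is purely illustrative; any major mode hook would do.

```elisp
;; Change the default value: searches become case-sensitive in every
;; buffer that has not set its own buffer-local value.
(setq-default case-fold-search nil)

;; Alternatively, make searches case-sensitive only in buffers of one
;; major mode by setting the buffer-local value from its hook.
(add-hook 'text-mode-hook
          (lambda ()
            (setq-local case-fold-search nil)))
```

    Use one form or the other, depending on how widely you want the
 change to apply.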
 
    Typing ‘M-c’ or ‘M-s c’ (‘isearch-toggle-case-fold’) within an
 incremental search toggles the case sensitivity of that search.  The
 effect does not extend beyond the current incremental search, but it
 does override the effect of adding or removing an upper-case letter in
 the current search.
 
    Several related variables control case-sensitivity of searching and
 matching for specific commands or activities.  For instance,
 ‘tags-case-fold-search’ controls case sensitivity for ‘find-tag’.  To
 find these variables, do ‘M-x apropos-variable <RET> case-fold-search
 <RET>’.
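
    For instance, a sketch that makes tag lookups case-sensitive
 regardless of the buffer’s ‘case-fold-search’ setting might look like
 this.  Check the variable’s documentation with ‘C-h v’ for the values
 your Emacs version accepts before relying on it.

```elisp
;; Make tag searches case-sensitive, independently of the value of
;; `case-fold-search' in the current buffer.
(setq tags-case-fold-search nil)
```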
 
    Case folding disregards case distinctions among characters, making
 upper-case characters match lower-case variants, and vice versa.  A
 generalization of case folding is “character folding”, which disregards
 wider classes of distinctions among similar characters.  For instance,
 under character folding the letter ‘a’ matches all of its accented
 cousins like ‘ä’ and ‘á’, i.e., the match disregards the diacritics that
 distinguish these variants.  In addition, ‘a’ matches other characters
 that resemble it, or have it as part of their graphical representation,
 such as U+249C PARENTHESIZED LATIN SMALL LETTER A and U+2100 ACCOUNT OF
 (which looks like a small ‘a’ over ‘c’).  Similarly, the ASCII
 double-quote character ‘"’ matches all the other variants of double
 quotes defined by the Unicode standard.  Finally, character folding can
 make a sequence of one or more characters match another sequence of a
 different length: for example, the sequence of two characters ‘ff’
 matches U+FB00 LATIN SMALL LIGATURE FF.  Character sequences that are
 not identical, but match under character folding are known as
 “equivalent character sequences”.
 
    Generally, search commands in Emacs do not by default perform
 character folding in order to match equivalent character sequences.  You
 can enable this behavior by customizing the variable
 ‘search-default-mode’ to ‘char-fold-to-regexp’.  See Search
 Customizations.  Within an incremental search, typing ‘M-s '’
 (‘isearch-toggle-char-fold’) toggles character folding, but only for
 that search.  (Replace commands have a different default, controlled by
 a separate option; see Replacement and Lax Matches.)
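
    If you prefer to set the option from your init file rather than
 through the Customize interface, a minimal sketch is shown below.  It
 assumes an Emacs version in which the function is named
 ‘char-fold-to-regexp’, as in this section.

```elisp
;; Make character folding the default mode for search commands.
(setq search-default-mode #'char-fold-to-regexp)
```

    With this setting, ‘M-s '’ still toggles character folding off and
 on within an individual incremental search.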
 
    As with case folding, typing an explicit variant of a character,
 such as ‘ä’, as part of the search string disables character folding for
 that search.  If you delete such a character from the search string,
 this effect ceases.