gawk: Locales

 
 6.6 Where You Are Makes a Difference
 ====================================
 
 Modern systems support the notion of "locales": a way to tell the system
 about the local character set and language.  The ISO C standard defines
 a default '"C"' locale, which is an environment that is typical of what
 many C programmers are used to.
 
    Once upon a time, the locale setting affected regexp matching,
 but this is no longer true (see Ranges and Locales).
 
    Locales can affect record splitting.  For the normal case of 'RS =
 "\n"', the locale is largely irrelevant.  For other single-character
 record separators, setting 'LC_ALL=C' in the environment will give you
 much better performance when reading records.  Otherwise, 'gawk' has to
 make several function calls, _per input character_, to find the record
 terminator.
 
    Locales can affect how dates and times are formatted (see Time
 Functions).  For example, a common way to abbreviate the date
 September 4, 2015, in the United States is "9/4/15."  In many countries
 in Europe, however, it is abbreviated "4.9.15."  Thus, the '%x'
 specification in a '"US"' locale might produce '9/4/15', while in a
 '"EUROPE"' locale, it might produce '4.9.15'.
 
    According to POSIX, string comparison is also affected by locales
 (similar to regular expressions).  The details are presented in POSIX
 String Comparison.
 
    Finally, the locale affects the value of the decimal point character
 used when 'gawk' parses input data.  This is discussed in detail in
 Conversion.