gawk: Input Summary

 
 4.14 Summary
 ============
 
    * Input is split into records based on the value of 'RS'.  The
      possibilities are as follows:
 
      Value of 'RS'      Records are split on ...  'awk' / 'gawk'
      ---------------------------------------------------------------------------
      Any single         That character            'awk'
      character
      The empty string   Runs of two or more       'awk'
      ('""')             newlines
      A regexp           Text that matches the     'gawk'
                         regexp
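
      As a minimal sketch of the two portable rows of the table, the
      commands below split a made-up input stream first on commas and
      then in paragraph mode.  The sample text and the shell variable
      names are illustrative only.

```shell
# RS set to a single character: each comma ends a record.
by_comma=$(printf 'a,b,c' | awk 'BEGIN { RS = "," } { print NR ": " $0 }')
printf '%s\n' "$by_comma"

# RS set to the empty string: blank lines separate records
# (and newlines then also separate fields).
by_para=$(printf 'one\ntwo\n\nthree\n' | awk 'BEGIN { RS = "" } { print NR ": " $1 }')
printf '%s\n' "$by_para"
```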
 
    * 'FNR' indicates how many records have been read from the current
      input file; 'NR' indicates how many records have been read in
      total.
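
      The difference shows up only with more than one input file.  The
      following sketch assumes 'mktemp' is available and uses two
      throwaway files whose names are arbitrary:

```shell
# Two small scratch files in a temporary directory.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first"
printf 'c\n' > "$dir/second"

# FNR restarts at 1 for each file; NR keeps counting across files.
counts=$(awk '{ print FNR, NR, $0 }' "$dir/first" "$dir/second")
printf '%s\n' "$counts"
rm -r "$dir"
```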
 
    * 'gawk' sets 'RT' to the text matched by 'RS'.
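
      A small illustration, assuming 'gawk' is installed (the guard
      skips the demonstration otherwise).  The input string is made up;
      the final record has no terminator, so its 'RT' is empty:

```shell
# RT is a gawk extension, so do nothing if gawk is absent.
if command -v gawk >/dev/null 2>&1; then
    # Runs of digits separate the records; RT holds the digits matched.
    rt_demo=$(printf 'a1b22c' | gawk 'BEGIN { RS = "[0-9]+" } { print $0 " [" RT "]" }')
    printf '%s\n' "$rt_demo"
fi
```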
 
    * After splitting the input into records, 'awk' further splits the
      records into individual fields, named '$1', '$2', and so on.  '$0'
      is the whole record, and 'NF' indicates how many fields there are.
      By default, fields are separated by runs of spaces, TABs, and
      newlines.
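
      For instance, with default splitting on an invented line that has
      leading, trailing, and mixed whitespace:

```shell
# Print the field count, the first field, and the last field.
fields=$(printf '  alpha  beta\tgamma \n' | awk '{ print NF; print $1; print $NF }')
printf '%s\n' "$fields"
```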
 
    * Fields may be referenced using a variable, as in '$NF'.  Fields may
      also be assigned values, which causes the value of '$0' to be
      recomputed when it is later referenced.  Assigning to a field with
      a number greater than 'NF' creates the field and rebuilds the
      record, using 'OFS' to separate the fields.  Incrementing 'NF' does
      the same thing.  Decrementing 'NF' throws away fields and rebuilds
      the record.
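
      The sketch below shows both directions, with 'OFS' set to '-' so
      that the rebuilt record is easy to spot.  It assumes an 'awk'
      that rebuilds '$0' when 'NF' is decremented, as 'gawk' does; some
      other implementations may not.

```shell
# Assigning past NF creates the intervening empty field and rebuilds
# $0 with OFS between the fields.
grown=$(printf 'a b c\n' | awk 'BEGIN { OFS = "-" } { $5 = "e"; print; print NF }')
printf '%s\n' "$grown"

# Decrementing NF discards the trailing field and rebuilds $0.
shrunk=$(printf 'a b c\n' | awk 'BEGIN { OFS = "-" } { NF = 2; print }')
printf '%s\n' "$shrunk"
```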
 
    * Field splitting is more complicated than record splitting:
 
      Field separator value         Fields are split ...          'awk' /
                                                                  'gawk'
      ---------------------------------------------------------------------------
      'FS == " "'                   On runs of whitespace         'awk'
      'FS == ANY SINGLE             On that character             'awk'
      CHARACTER'
      'FS == REGEXP'                On text matching the regexp   'awk'
      'FS == ""'                    Such that each individual     'gawk'
                                    character is a separate
                                    field
      'FIELDWIDTHS == LIST OF       Based on character position   'gawk'
      COLUMNS'
      'FPAT == REGEXP'              Such that each field is       'gawk'
                                    text matching the regexp
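
      As a rough tour of the table, the commands below try one made-up
      input per row.  The first two should work with any 'awk'; the
      remaining three are guarded because they rely on 'gawk'
      extensions.

```shell
# Single-character FS: fields are split on that character.
single=$(printf 'a:b:c\n' | awk 'BEGIN { FS = ":" } { print $2 }')
# Regexp FS: runs of digits separate the fields.
regexp=$(printf 'a1b22c\n' | awk 'BEGIN { FS = "[0-9]+" } { print $1, $2, $3 }')
printf '%s\n' "$single" "$regexp"

if command -v gawk >/dev/null 2>&1; then
    # Empty FS: every character is its own field.
    chars=$(printf 'abc\n' | gawk 'BEGIN { FS = "" } { print NF, $2 }')
    # FIELDWIDTHS: a 2-character field followed by a 3-character field.
    widths=$(printf 'abcdefg\n' | gawk 'BEGIN { FIELDWIDTHS = "2 3" } { print $1, $2 }')
    # FPAT: the regexp describes the fields themselves.
    fpat=$(printf 'x12y345\n' | gawk 'BEGIN { FPAT = "[0-9]+" } { print $1, $2 }')
    printf '%s\n' "$chars" "$widths" "$fpat"
fi
```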
 
    * Using 'FS = "\n"' causes the entire record to be a single field
      (assuming that newlines separate records).
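
      A quick check with an invented one-line record:

```shell
# With FS set to newline and the default RS, the record contains no
# separator, so the whole line becomes $1.
whole=$(printf 'a b c\n' | awk 'BEGIN { FS = "\n" } { print NF ": " $1 }')
printf '%s\n' "$whole"
```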
 
    * 'FS' may be set from the command line using the '-F' option.  This
      can also be done using command-line variable assignment.
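
      The three spellings below are intended to be equivalent for this
      made-up input.  The last one passes the assignment as an operand
      and names standard input explicitly with '-':

```shell
# -F option.
with_F=$(printf 'a:b\n' | awk -F: '{ print $2 }')
# -v assignment, processed before the program starts.
with_v=$(printf 'a:b\n' | awk -v FS=: '{ print $2 }')
# Assignment operand, processed just before the following input file.
with_operand=$(printf 'a:b\n' | awk '{ print $2 }' FS=: -)
printf '%s\n' "$with_F" "$with_v" "$with_operand"
```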
 
    * Use 'PROCINFO["FS"]' to see how fields are being split.
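
      For example, assuming 'gawk' is installed, the value follows
      whichever splitting variable was assigned most recently:

```shell
if command -v gawk >/dev/null 2>&1; then
    # Report the splitting mode at startup and after each assignment.
    modes=$(gawk 'BEGIN {
        print PROCINFO["FS"]
        FIELDWIDTHS = "2 3"; print PROCINFO["FS"]
        FPAT = "[0-9]+";     print PROCINFO["FS"]
    }')
    printf '%s\n' "$modes"
fi
```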
 
    * Use 'getline' in its various forms to read additional records from
      the default input stream, from a file, or from a pipe or coprocess.
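
      The following sketch exercises three of the portable forms: plain
      'getline', 'getline VAR < FILE', and 'COMMAND | getline VAR'.
      The scratch file, the 'echo hello' command, and the variable
      names are all invented for illustration; coprocesses are not
      shown.

```shell
# Plain getline: read the next record from the main input into $0.
pairs=$(printf '1\n2\n3\n4\n' | awk '{ first = $0; if ((getline) > 0) print first "+" $0 }')

# getline VAR < FILE: count the lines of a scratch file.
dir=$(mktemp -d)
printf 'x\ny\n' > "$dir/data"
lines=$(awk -v file="$dir/data" 'BEGIN {
    while ((getline line < file) > 0) n++
    close(file)
    print n
}')
rm -r "$dir"

# COMMAND | getline VAR: read one line of a command's output.
greeting=$(awk 'BEGIN { cmd = "echo hello"; cmd | getline word; close(cmd); print word }')
printf '%s\n' "$pairs" "$lines" "$greeting"
```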
 
    * Use 'PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to time out for
      FILE.
 
    * Naming a directory on the command line is a fatal error in
      standard 'awk'; 'gawk' ignores it unless in POSIX mode.