gawk: Input Summary
4.14 Summary
============
* Input is split into records based on the value of 'RS'. The
possibilities are as follows:
Value of 'RS'       Records are split on ...      'awk' / 'gawk'
---------------------------------------------------------------------------
Any single          That character                'awk'
character
The empty string    Runs of two or more           'awk'
('""')              newlines
A regexp            Text that matches the         'gawk'
                    regexp
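The first two kinds of record splitting can be tried from the shell. This is a minimal sketch, assuming a POSIX 'awk' is on the path; the sample input and the shell variable names ('single', 'para') are made up for illustration.

```shell
# RS is a single character: each ";" ends a record.
single=$(printf 'a;b;c' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')

# RS is the empty string: blank lines separate records (paragraph mode).
para=$(printf 'l1\nl2\n\nl3\n' | awk 'BEGIN { RS = "" } { print NR ": " $1 }')

printf '%s\n' "$single" "$para"
```

The first command should report three records and the second two, since the blank line is what separates 'l1'/'l2' from 'l3'.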
* 'FNR' indicates how many records have been read from the current
input file; 'NR' indicates how many records have been read in
total.
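The difference between 'FNR' and 'NR' only shows up with more than one input file. The sketch below assumes 'mktemp -d' is available and uses two throwaway files named 'one' and 'two' (hypothetical names).

```shell
# Create two small input files in a scratch directory.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one"
printf 'c\n' > "$dir/two"

# FNR restarts at 1 for each file; NR keeps counting across files.
counts=$(awk '{ print FNR, NR }' "$dir/one" "$dir/two")

rm -r "$dir"
printf '%s\n' "$counts"
```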
* 'gawk' sets 'RT' to the text matched by 'RS'.
* After splitting the input into records, 'awk' further splits the
records into individual fields, named '$1', '$2', and so on. '$0'
is the whole record, and 'NF' indicates how many fields there are.
By default, fields are separated by runs of whitespace.
* Fields may be referenced using a variable, as in '$NF'. Fields may
also be assigned values, which causes the value of '$0' to be
recomputed when it is later referenced. Assigning to a field with
a number greater than 'NF' creates the field and rebuilds the
record, using 'OFS' to separate the fields. Incrementing 'NF' does
the same thing. Decrementing 'NF' throws away fields and rebuilds
the record.
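The effects of field references and assignments described above can be checked with a few one-liners. This is only a sketch: the input line and the variable names are illustrative, and the 'NF = 2' case assumes an 'awk' that rebuilds the record when 'NF' changes (as 'gawk' does; some older implementations may not).

```shell
# $NF refers to the last field, whatever its number is.
last=$(echo 'a b c' | awk '{ print $NF }')

# Assigning to a field rebuilds $0 with OFS between the fields.
rebuilt=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } { $2 = "X"; print }')

# Assigning past NF creates the field, plus any empty ones in between.
extended=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } { $5 = "e"; print NF ": " $0 }')

# Decrementing NF discards trailing fields and rebuilds the record.
trimmed=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } { NF = 2; print }')

printf '%s\n' "$last" "$rebuilt" "$extended" "$trimmed"
```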
* Field splitting is more complicated than record splitting:
Field separator value       Fields are split ...          'awk' / 'gawk'
---------------------------------------------------------------------------
'FS == " "'                 On runs of whitespace         'awk'
'FS == ANY SINGLE           On that character             'awk'
CHARACTER'
'FS == REGEXP'              On text matching the regexp   'awk'
'FS == ""'                  Such that each individual     'gawk'
                            character is a separate
                            field
'FIELDWIDTHS == LIST OF     Based on character position   'gawk'
COLUMNS'
'FPAT == REGEXP'            On the text surrounding       'gawk'
                            text matching the regexp
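Each row of the field-splitting table can be exercised with a one-line program. The following is a sketch rather than a definitive test: the inputs are invented, the first three commands assume any POSIX 'awk', and the last three run only if a 'gawk' binary is found on the path.

```shell
# Portable cases (any POSIX awk).
default=$(echo '  a   b  ' | awk '{ print NF }')
single=$(echo 'a:b::c' | awk 'BEGIN { FS = ":" } { print NF }')
regexp=$(echo 'a1b22c' | awk 'BEGIN { FS = "[0-9]+" } { print $1, $2, $3 }')

# gawk-only cases, skipped when gawk is not installed.
if command -v gawk >/dev/null 2>&1; then
    # Empty FS: every character is its own field.
    chars=$(echo 'abc' | gawk 'BEGIN { FS = "" } { print NF }')
    # FIELDWIDTHS: fields are taken by column width (3, then 2, then 3).
    widths=$(echo 'abcdefgh' | gawk 'BEGIN { FIELDWIDTHS = "3 2 3" } { print $1, $2, $3 }')
    # FPAT: the regexp describes the fields themselves, not the separators.
    fpat=$(echo 'x=1, y=22' | gawk 'BEGIN { FPAT = "[0-9]+" } { print NF, $1, $2 }')
fi
```

Note that a single-character 'FS' such as ':' does not merge adjacent separators, so 'a:b::c' has an empty third field, whereas the default 'FS' ignores leading, trailing, and repeated whitespace.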
* Using 'FS = "\n"' causes the entire record to be a single field
(assuming that newlines separate records).
* 'FS' may be set from the command line using the '-F' option. This
can also be done using command-line variable assignment.
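The ways of setting 'FS' from the command line can be compared side by side. In this sketch the comma-separated input is made up, and the third form relies on an assignment operand being processed before standard input is read.

```shell
# Set FS with the -F option.
with_F=$(echo 'a,b,c' | awk -F, '{ print $2 }')

# Set FS with a -v assignment, done before the program starts.
with_v=$(echo 'a,b,c' | awk -v FS=, '{ print $2 }')

# Set FS with an assignment operand after the program text.
with_arg=$(echo 'a,b,c' | awk '{ print $2 }' FS=,)

printf '%s\n' "$with_F" "$with_v" "$with_arg"
```

All three should print the second comma-separated field.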
* Use 'PROCINFO["FS"]' to see how fields are being split.
* Use 'getline' in its various forms to read additional records from
the default input stream, from a file, or from a pipe or coprocess.
* Use 'PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to time out for
FILE.
* Naming a directory on the command line is a fatal error in standard
'awk'; 'gawk' ignores directories unless it is in POSIX mode.
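Two of the 'getline' forms and the 'PROCINFO["FS"]' check mentioned above can be sketched as follows. The input data and the 'echo hello' command are placeholders chosen for illustration, and the last command runs only if 'gawk' is installed.

```shell
# Plain getline reads the next record from the main input into $0.
pairs=$(printf 'k1\nv1\nk2\nv2\n' |
    awk '{ key = $0; if ((getline) > 0) print key "=" $0 }')

# "command | getline var" reads one record from a pipe into var.
piped=$(awk 'BEGIN { cmd = "echo hello"; cmd | getline line; close(cmd); print line }')

# gawk only: PROCINFO["FS"] names the field-splitting method in effect.
if command -v gawk >/dev/null 2>&1; then
    method=$(gawk 'BEGIN { FPAT = "[0-9]+"; print PROCINFO["FS"] }')
fi

printf '%s\n' "$pairs" "$piped"
```

Checking that 'getline' returned a value greater than zero guards against end of input and read errors before the record is used.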