gawk: Getline Notes
4.10.9 Points to Remember About 'getline'
-----------------------------------------
Here are some miscellaneous points about 'getline' that you should bear
in mind:
* When 'getline' changes the value of '$0' and 'NF', 'awk' does _not_
automatically jump to the start of the program and start testing
the new record against every pattern. However, the new record is
tested against any subsequent rules.
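The following is a small sketch of this behavior, not taken from any
real program. The function name 'demo_no_restart' and the two-line
input are made up for illustration. The rule placed before the
'getline' never sees the record that 'getline' reads, while the rule
placed after it does:

```shell
# Hypothetical demo: feed two records, "a" and "b", to a three-rule program.
demo_no_restart() {
    printf 'a\nb\n' | awk '
        /b/ { print "early rule:", $0 }            # never fires: "b" arrives via getline below
        /a/ { getline; print "after getline:", $0 } # replaces $0 ("a") with the next record ("b")
        /b/ { print "later rule:", $0 }            # tested against the new $0
    '
}

demo_no_restart
```

The record 'b' is consumed by 'getline', so the main input loop finds
no further records and the first rule is never tried against it.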
* Some very old 'awk' implementations limit the number of pipelines
that an 'awk' program may have open to just one. In 'gawk', there
is no such limit. You can open as many pipelines (and coprocesses)
as the underlying operating system permits.
* An interesting side effect occurs if you use 'getline' without a
redirection inside a 'BEGIN' rule. Because an unredirected
'getline' reads from the command-line data files, the first
'getline' command causes 'awk' to set the value of 'FILENAME'.
Normally, 'FILENAME' does not have a value inside 'BEGIN' rules,
because you have not yet started to process the command-line data
files. (d.c.) (See BEGIN/END; also Auto-set.)
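A minimal sketch of this side effect follows. The function name
'demo_filename_in_begin' and the temporary file are assumptions made
for the example. It records 'FILENAME' before and after an
unredirected 'getline' inside 'BEGIN'. With 'gawk', the first report
is expected to show that no name is set yet; other 'awk' versions may
differ on that point:

```shell
# Hypothetical demo: watch FILENAME change inside a BEGIN rule.
demo_filename_in_begin() {
    tmp=$(mktemp) || return 1
    echo "first line" > "$tmp"
    awk 'BEGIN {
        before = FILENAME      # no data file has been opened yet
        getline                # unredirected: reads from the command-line data file
        print (before == "" ? "before: unset" : "before: set")
        print (FILENAME != "" ? "after: set" : "after: unset")
    }' "$tmp"
    rm -f "$tmp"
}

demo_filename_in_begin
```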
* Using 'FILENAME' with 'getline' ('getline < FILENAME') is likely to
be a source of confusion. 'awk' opens a separate input stream from
the current input file. However, because no variable is used, '$0'
and 'NF' are still updated. If you're doing this, it's probably by
accident, and you should reconsider what it is you're trying to
accomplish.
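To see why this is confusing, consider the following sketch. The
function name 'demo_getline_filename' and the three-line data file are
invented for the example. While the main input is positioned at the
second record, 'getline < FILENAME' reads the _first_ line of the same
file through a second stream and overwrites '$0' with it:

```shell
# Hypothetical demo: a second stream on the current input file clobbers $0.
demo_getline_filename() {
    tmp=$(mktemp) || return 1
    printf 'one\ntwo\nthree\n' > "$tmp"
    awk 'NR == 2 {
        saved = $0              # the record from the main input ("two")
        getline < FILENAME      # separate stream, starts at the top of the file
        print "main input was:", saved
        print "$0 is now:", $0
    }' "$tmp"
    rm -f "$tmp"
}

demo_getline_filename
```

Reading into a variable instead ('getline line < FILENAME') leaves
'$0' and 'NF' untouched.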
* Getline Summary presents a table summarizing the 'getline'
variants and which variables they can affect. It is worth noting
that those variants that do not use redirection can cause
'FILENAME' to be updated if they cause 'awk' to start reading a new
input file.
* If the variable being assigned is an expression with side effects,
different versions of 'awk' behave differently upon encountering
end-of-file. Some versions don't evaluate the expression; many
versions (including 'gawk') do. Here is an example, courtesy of
Duncan Moore:
     BEGIN {
         system("echo 1 > f")
         while ((getline a[++c] < "f") > 0) { }
         print c
     }
Here, the side effect is the '++c'. Is 'c' incremented if
end-of-file is encountered before the element in 'a' is assigned?
'gawk' treats 'getline' like a function call, and evaluates the
expression 'a[++c]' before attempting to read from 'f'. However,
some versions of 'awk' only evaluate the expression once they know
that there is a string value to be assigned.
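One way to sidestep the difference, sketched here under the assumption
that only the count of lines actually read matters, is to read into a
plain variable and perform the side effect in a separate statement.
The function name 'portable_read_count' and the variable 'line' are
made up for the example:

```shell
# Hypothetical demo: keep the side effect out of the getline target.
portable_read_count() {
    tmp=$(mktemp) || return 1
    echo 1 > "$tmp"
    awk -v f="$tmp" 'BEGIN {
        while ((getline line < f) > 0)
            a[++c] = line      # runs only after a successful read
        close(f)
        print c
    }'
    rm -f "$tmp"
}

portable_read_count
```

Because '++c' is evaluated only when a line was read, 'c' ends up
equal to the number of lines in the file, whichever order an 'awk'
version uses for the original form.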