gawk: I/O And BEGIN/END
7.1.4.2 Input/Output from 'BEGIN' and 'END' Rules
.................................................
There are several (sometimes subtle) points to be aware of when doing
I/O from a 'BEGIN' or 'END' rule. The first has to do with the value of
'$0' in a 'BEGIN' rule. Because 'BEGIN' rules are executed before any
input is read, there simply is no input record, and therefore no fields,
when executing 'BEGIN' rules. References to '$0' and the fields yield a
null string or zero, depending upon the context. One way to give '$0' a
real value is to execute a 'getline' command without a variable (see
Getline). Another way is simply to assign a value to '$0'.
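The following is a minimal sketch of the first point, not a definitive
test. The shell variable names ('before', 'assigned', 'read_in') and the
sample data are made up for illustration. It shows '$0' and 'NF' being
empty in a 'BEGIN' rule, and then the two ways of giving '$0' a value:

```sh
# Nothing has been read yet, so $0 is null and NF is zero.
before=$(awk 'BEGIN { printf "[%s] NF=%d\n", $0, NF }' </dev/null)

# Assigning to $0 splits the new value into fields, as for any record.
assigned=$(awk 'BEGIN { $0 = "alpha beta gamma"
                        printf "[%s] NF=%d $2=%s\n", $0, NF, $2 }' </dev/null)

# A plain getline reads the first input record into $0.
read_in=$(printf 'one two\nthree\n' |
          awk 'BEGIN { getline; printf "[%s] NF=%d\n", $0, NF }')

echo "$before"      # [] NF=0
echo "$assigned"    # [alpha beta gamma] NF=3 $2=beta
echo "$read_in"     # [one two] NF=2
```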
The second point is similar to the first, but from the other
direction. Traditionally, due largely to implementation issues, '$0'
and 'NF' were _undefined_ inside an 'END' rule. The POSIX standard
specifies that 'NF' is available in an 'END' rule. It contains the
number of fields from the last input record. Most probably due to an
oversight, the standard does not say that '$0' is also preserved,
although logically one would think that it should be. In fact, all of
BWK 'awk', 'mawk', and 'gawk' preserve the value of '$0' for use in
'END' rules. Be aware, however, that some other implementations and
many older versions of Unix 'awk' do not.
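As a sketch of the second point, the first command below relies on '$0'
surviving into the 'END' rule, which holds for BWK 'awk', 'mawk', and
'gawk' but is not promised by POSIX. The second command is one possible
portable alternative: it saves the record and field count in ordinary
variables. The names 'last', 'nf', 'relied', and 'saved' are
hypothetical, chosen only for this example:

```sh
# Relies on $0 and NF still describing the last record inside END.
relied=$(printf 'a b c\nd e\n' |
         awk 'END { printf "NF=%d $0=[%s]\n", NF, $0 }')

# Portable variant: remember the last record explicitly.
saved=$(printf 'a b c\nd e\n' |
        awk '{ last = $0; nf = NF }
             END { printf "NF=%d last=[%s]\n", nf, last }')

echo "$relied"   # with gawk, mawk, or BWK awk: NF=2 $0=[d e]
echo "$saved"    # NF=2 last=[d e]
```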
The third point follows from the first two. The meaning of 'print'
inside a 'BEGIN' or 'END' rule is the same as always: 'print $0'. If
'$0' is the null string, then this prints an empty record. Many
longtime 'awk' programmers use an unadorned 'print' in 'BEGIN' and 'END'
rules, to mean 'print ""', relying on '$0' being null. Although one
might generally get away with this in 'BEGIN' rules, it is a very bad
idea in 'END' rules, at least in 'gawk', where '$0' still holds the
last input record and an unadorned 'print' therefore prints that
record. It is also poor style, because if an empty line is needed in
the output, the program should print one explicitly.
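To illustrate the third point, here is a small sketch comparing an
unadorned 'print' with an explicit 'print ""' in an 'END' rule. The
input lines and the variable names 'bare' and 'explicit' are invented
for the example, and the first result assumes an 'awk' that preserves
'$0' in 'END' rules, as 'gawk' does:

```sh
# Unadorned print in END: prints $0, which is the last record here.
bare=$(printf 'first\nlast\n' |
       awk 'END { print; print "total:", NR }')

# Explicit empty string: always prints an empty line.
explicit=$(printf 'first\nlast\n' |
           awk 'END { print ""; print "total:", NR }')

echo "$bare"       # with gawk: "last", then "total: 2"
echo "$explicit"   # an empty line, then "total: 2"
```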
Finally, the 'next' and 'nextfile' statements are not allowed in a
'BEGIN' rule, because the implicit
read-a-record-and-match-against-the-rules loop has not started yet.
Similarly, those statements are not valid in an 'END' rule, because all
the input has been read. (See Next Statement and Nextfile
Statement.)
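The restriction can be checked with a short sketch such as the
following. It only looks at whether 'awk' exits successfully. The
wording of the diagnostic, the exit status, and whether the error is
reported when the program is parsed or when it runs all differ between
implementations, so none of those are shown. The variable names
'begin_status' and 'end_status' are hypothetical:

```sh
# 'next' in a BEGIN rule: expected to be rejected (nonzero exit).
if awk 'BEGIN { next }' </dev/null 2>/dev/null; then
    begin_status=accepted
else
    begin_status=rejected
fi

# 'next' in an END rule: expected to be rejected as well.
if printf 'x\n' | awk 'END { next }' 2>/dev/null; then
    end_status=accepted
else
    end_status=rejected
fi

echo "BEGIN: $begin_status, END: $end_status"
```

With 'gawk', both cases should report 'rejected'.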