gawk: More Complex
1.5 A More Complex Example
==========================
Now that we've mastered some simple tasks, let's look at what typical
'awk' programs do. This example shows how 'awk' can be used to
summarize, select, and rearrange the output of another utility. It uses
features that haven't been covered yet, so don't worry if you don't
understand all the details:
ls -l | awk '$6 == "Nov" { sum += $5 }
END { print sum }'
This command prints the total number of bytes in all the files in the
current directory that were last modified in November (of any year).
The 'ls -l' part of this example is a system command that gives you a
listing of the files in a directory, including each file's size and the
date the file was last modified. Its output looks like this:
-rw-r--r-- 1 arnold user 1933 Nov 7 13:05 Makefile
-rw-r--r-- 1 arnold user 10809 Nov 7 13:03 awk.h
-rw-r--r-- 1 arnold user 983 Apr 13 12:14 awk.tab.h
-rw-r--r-- 1 arnold user 31869 Jun 15 12:20 awkgram.y
-rw-r--r-- 1 arnold user 22414 Nov 7 13:03 awk1.c
-rw-r--r-- 1 arnold user 37455 Nov 7 13:03 awk2.c
-rw-r--r-- 1 arnold user 27511 Dec 9 13:07 awk3.c
-rw-r--r-- 1 arnold user 7989 Nov 7 13:03 awk4.c
The first field contains read-write permissions, the second field
contains the number of links to the file, and the third field identifies
the file's owner. The fourth field identifies the file's group. The
fifth field contains the file's size in bytes. The sixth, seventh, and
eighth fields contain the month, day, and time, respectively, that the
file was last modified. Finally, the ninth field contains the file
name.
The '$6 == "Nov"' in our 'awk' program is an expression that tests
whether the sixth field of the output from 'ls -l' matches the string
'Nov'. Each time a line has the string 'Nov' for its sixth field, 'awk'
performs the action 'sum += $5'. This adds the fifth field (the file's
size) to the variable 'sum'. As a result, when 'awk' has finished
reading all the input lines, 'sum' is the total of the sizes of the
files whose lines matched the pattern. (This works because 'awk'
variables are automatically initialized to zero.)
After the last line of output from 'ls' has been processed, the 'END'
rule executes and prints the value of 'sum'. In this example, the value
of 'sum' is 80600.
These more advanced 'awk' techniques are covered in later minor nodes
(Action Overview). Before you can move on to more advanced
'awk' programming, you have to know how 'awk' interprets your input and
displays your output. By manipulating fields and using 'print'
statements, you can produce some very useful and impressive-looking
reports.