gawk: More Complex

 
 1.5 A More Complex Example
 ==========================
 
 Now that we've mastered some simple tasks, let's look at what typical
 'awk' programs do.  This example shows how 'awk' can be used to
 summarize, select, and rearrange the output of another utility.  It uses
 features that haven't been covered yet, so don't worry if you don't
 understand all the details:
 
      ls -l | awk '$6 == "Nov" { sum += $5 }
                   END { print sum }'
 
    This command prints the total number of bytes in all the files in the
 current directory that were last modified in November (of any year).
 The 'ls -l' part of this example is a system command that gives you a
 listing of the files in a directory, including each file's size and the
 date the file was last modified.  Its output looks like this:
 
      -rw-r--r--  1 arnold   user   1933 Nov  7 13:05 Makefile
      -rw-r--r--  1 arnold   user  10809 Nov  7 13:03 awk.h
      -rw-r--r--  1 arnold   user    983 Apr 13 12:14 awk.tab.h
      -rw-r--r--  1 arnold   user  31869 Jun 15 12:20 awkgram.y
      -rw-r--r--  1 arnold   user  22414 Nov  7 13:03 awk1.c
      -rw-r--r--  1 arnold   user  37455 Nov  7 13:03 awk2.c
      -rw-r--r--  1 arnold   user  27511 Dec  9 13:07 awk3.c
      -rw-r--r--  1 arnold   user   7989 Nov  7 13:03 awk4.c
 
 The first field contains the file's access permissions, the second field
 contains the number of links to the file, and the third field identifies
 the file's owner.  The fourth field identifies the file's group.  The
 fifth field contains the file's size in bytes.  The sixth, seventh, and
 eighth fields contain the month, day, and time, respectively, that the
 file was last modified.  Finally, the ninth field contains the file
 name.
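
    To make the field numbering concrete, here is a small sketch that
 feeds one line of the sample listing to 'awk' and prints three of its
 fields in a different order.  The shell variable names 'line' and
 'fields' are only illustrative, and a POSIX-style shell is assumed:

```shell
# One sample line from the listing above, fed to awk on standard input.
line='-rw-r--r--  1 arnold   user   1933 Nov  7 13:05 Makefile'

# Print the file name ($9), the size ($5), and the month ($6), in that order.
fields=$(printf '%s\n' "$line" | awk '{ print $9, $5, $6 }')
printf '%s\n' "$fields"
```

    For this line the command prints 'Makefile 1933 Nov'.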
 
    The '$6 == "Nov"' in our 'awk' program is an expression that tests
 whether the sixth field of the output from 'ls -l' matches the string
 'Nov'.  Each time a line has the string 'Nov' for its sixth field, 'awk'
 performs the action 'sum += $5'.  This adds the fifth field (the file's
 size) to the variable 'sum'.  As a result, when 'awk' has finished
 reading all the input lines, 'sum' is the total of the sizes of the
 files whose lines matched the pattern.  (This works because 'awk'
 variables are automatically initialized to zero.)
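
    The automatic initialization can be checked directly.  The
 following sketch uses a variable that is never assigned, hypothetically
 named 'count', first in a numeric context and then in a string context:

```shell
# 'count' is never assigned before it is used.
# In a numeric context it acts as 0; in a string context it acts as "".
result=$(awk 'BEGIN { print count + 0; print "[" count "]" }')
printf '%s\n' "$result"
```

    This prints '0' followed by '[]'.  One consequence worth keeping in
 mind: if no input line matches the pattern, 'print sum' prints an empty
 line rather than '0'; writing 'print sum + 0' forces a numeric result.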
 
    After the last line of output from 'ls' has been processed, the 'END'
 rule executes and prints the value of 'sum'.  In this example, the value
 of 'sum' is 80600.
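
    The total can be reproduced without depending on the contents of a
 real directory.  The sketch below stores the sample listing in a shell
 variable (the names 'listing' and 'total' are illustrative) and runs the
 same 'awk' program on it.  It uses the saved text rather than live
 'ls -l' output because the column layout of 'ls -l' can differ between
 systems and locales:

```shell
# The sample 'ls -l' listing from above, saved so the result is reproducible.
listing=$(cat <<'EOF'
-rw-r--r--  1 arnold   user   1933 Nov  7 13:05 Makefile
-rw-r--r--  1 arnold   user  10809 Nov  7 13:03 awk.h
-rw-r--r--  1 arnold   user    983 Apr 13 12:14 awk.tab.h
-rw-r--r--  1 arnold   user  31869 Jun 15 12:20 awkgram.y
-rw-r--r--  1 arnold   user  22414 Nov  7 13:03 awk1.c
-rw-r--r--  1 arnold   user  37455 Nov  7 13:03 awk2.c
-rw-r--r--  1 arnold   user  27511 Dec  9 13:07 awk3.c
-rw-r--r--  1 arnold   user   7989 Nov  7 13:03 awk4.c
EOF
)

# Same program as in the text: add up $5 for every line whose $6 is "Nov".
total=$(printf '%s\n' "$listing" | awk '$6 == "Nov" { sum += $5 }
                                        END { print sum }')
printf '%s\n' "$total"
```

    The five November files contribute 1933 + 10809 + 22414 + 37455 +
 7989 bytes, so the command prints '80600'.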
 
    These more advanced 'awk' techniques are covered in later sections
 (see Action Overview).  Before you can move on to more advanced
 'awk' programming, you have to know how 'awk' interprets your input and
 displays your output.  By manipulating fields and using 'print'
 statements, you can produce some very useful and impressive-looking
 reports.
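
    As a rough illustration of such a report, the sketch below prints a
 header, one row per November file, and a total.  It runs on three lines
 taken from the sample listing.  The 'BEGIN' rule is a feature not yet
 covered at this point, and the column labels and the shell variable
 names 'listing' and 'report' are assumptions made for this example:

```shell
# Three lines taken from the sample listing above.
listing=$(cat <<'EOF'
-rw-r--r--  1 arnold   user   1933 Nov  7 13:05 Makefile
-rw-r--r--  1 arnold   user    983 Apr 13 12:14 awk.tab.h
-rw-r--r--  1 arnold   user   7989 Nov  7 13:03 awk4.c
EOF
)

report=$(printf '%s\n' "$listing" | awk '
    BEGIN        { print "File", "Bytes" }      # header, printed before any input
    $6 == "Nov"  { print $9, $5; sum += $5 }    # one row per November file
    END          { print "Total", sum }         # summary line after all input
')
printf '%s\n' "$report"
```

    With this input the report has four lines: 'File Bytes',
 'Makefile 1933', 'awk4.c 7989', and 'Total 9922'.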