gawk: Arrays of Arrays

 
 8.6 Arrays of Arrays
 ====================
 
 'gawk' goes beyond standard 'awk''s multidimensional array access and
 provides true arrays of arrays.  Elements of a subarray are referred to
 by their own indices enclosed in square brackets, just like the elements
 of the main array.  For example, the following creates a two-element
 subarray at index '1' of the main array 'a':
 
      a[1][1] = 1
      a[1][2] = 2
 
    This simulates a true two-dimensional array.  Each subarray element
 can contain another subarray as a value, which in turn can hold other
 arrays as well.  In this way, you can create arrays of three or more
 dimensions.  The indices can be any 'awk' expressions, including scalars
 separated by commas (i.e., a regular 'awk' simulated multidimensional
 subscript).  So the following is valid in 'gawk':
 
      a[1][3][1, "name"] = "barney"
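
    As a minimal sketch of what such a subscript looks like from the
 inside, the following program assumes that a comma-separated subscript
 in a subarray is stored the same way as in regular 'awk': as a single
 string joined by 'SUBSEP'.  The variable names 'key', 'parts', 'first',
 and 'second' are chosen only for illustration:

```awk
# Sketch: recover the two components of a comma-separated subscript
# stored inside a subarray.
BEGIN {
    a[1][3][1, "name"] = "barney"

    for (key in a[1][3]) {
        split(key, parts, SUBSEP)    # key is "1" SUBSEP "name"
        first = parts[1]
        second = parts[2]
        print first, second, a[1][3][key]
    }
}
```

 Run with 'gawk -f', this should print '1 name barney'.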
 
    Each subarray and the main array can be of different length.  In
 fact, the elements of an array or its subarray do not all have to have
 the same type.  This means that the main array and any of its subarrays
 can be nonrectangular, or jagged in structure.  You can assign a scalar
 value to the index '4' of the main array 'a', even though 'a[1]' is
 itself an array and not a scalar:
 
      a[4] = "An element in a jagged array"
 
    The terms "dimension", "row", and "column" are meaningless when
 applied to such an array, but we will use "dimension" henceforth to
 mean the maximum number of indices needed to refer to an existing
 element.  The type of any element that has already been assigned cannot
 be changed by assigning a value of a different type.  You have to first
 delete the current element, which effectively makes 'gawk' forget about
 the element at that index:
 
      delete a[4]
      a[4][5][6][7] = "An element in a four-dimensional array"
 
 This removes the scalar value from index '4' and then inserts a
 three-level nested subarray containing a scalar.  You can also delete an
 entire subarray or subarray of subarrays:
 
      delete a[4][5]
      a[4][5] = "An element in subarray a[4]"
 
    But recall that you cannot delete the main array 'a' and then use it
 as a scalar.
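
    The delete-then-reassign rule can be checked with 'isarray()'.  The
 following is only a sketch; the variables 'was_array' and 'is_array' are
 hypothetical names used to record the type of 'a[4]' before and after
 the change:

```awk
# Sketch: an element's type changes only after the element is deleted.
BEGIN {
    a[4] = "An element in a jagged array"
    was_array = isarray(a[4])     # 0: a[4] holds a scalar

    # Assigning to a[4][5] at this point would be a fatal error,
    # because a[4] is still a scalar.
    delete a[4]                   # make gawk forget the scalar
    a[4][5] = "An element in subarray a[4]"
    is_array = isarray(a[4])      # 1: a[4] is now a subarray

    print was_array, is_array
}
```

 This should print '0 1'.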
 
    The built-in functions that take array arguments can also be used
 with subarrays.  For example, the following code fragment uses
 'length()' (see String Functions) to determine the number of
 elements in the main array 'a' and its subarrays:
 
      print length(a), length(a[1]), length(a[1][3])
 
 This results in the following output for our main array 'a':
 
      2 3 1
 
 The 'SUBSCRIPT in ARRAY' expression (see Reference to Elements)
 works similarly for both regular 'awk'-style arrays and arrays of
 arrays.  For example, the tests '1 in a', '3 in a[1]', and '(1, "name")
 in a[1][3]' all evaluate to one (true) for our array 'a'.
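
    To try the 'length()' and 'in' tests in one place, the following
 sketch rebuilds what the array 'a' looks like after the assignments and
 deletions shown earlier in this section.  The result variables ('n_main',
 'has_1', and so on) are names invented for this example:

```awk
# Sketch: rebuild the final state of 'a' and query it.
BEGIN {
    a[1][1] = 1
    a[1][2] = 2
    a[1][3][1, "name"] = "barney"
    a[4][5] = "An element in subarray a[4]"

    n_main = length(a)           # indices 1 and 4
    n_sub  = length(a[1])        # indices 1, 2, and 3
    n_leaf = length(a[1][3])     # the single (1, "name") index
    print n_main, n_sub, n_leaf

    has_1    = (1 in a)
    has_3    = (3 in a[1])
    has_name = ((1, "name") in a[1][3])
    has_2    = (2 in a)          # no element was ever stored at a[2]
    print has_1, has_3, has_name, has_2
}
```

 This should print '2 3 1' followed by '1 1 1 0'.  The fields are
 separated by spaces because 'print' joins its arguments with 'OFS'.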
 
    The 'for (item in array)' statement (see Scanning an Array) can
 be nested to scan all the elements of an array of arrays if it is
 rectangular in structure.  In order to print the contents (scalar
 values) of a two-dimensional array of arrays (i.e., in which each
 first-level element is itself an array, not necessarily of the same
 length), you could use the following code:
 
      for (i in array)
          for (j in array[i])
              print array[i][j]
 
    The 'isarray()' function (see Type Functions) lets you test if an
 array element is itself an array:
 
      for (i in array) {
          if (isarray(array[i])) {
              for (j in array[i]) {
                  print array[i][j]
              }
          }
          else
              print array[i]
      }
 
    If the structure of a jagged array of arrays is known in advance, you
 can often devise workarounds using control statements.  For example, the
 following code prints the elements of our main array 'a':
 
      for (i in a) {
          for (j in a[i]) {
              if (j == 3) {
                  for (k in a[i][j])
                      print a[i][j][k]
              } else
                  print a[i][j]
          }
      }
 
 See Walking Arrays for a user-defined function that "walks" an
 arbitrarily dimensioned array of arrays.
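
    When the structure is not known in advance, recursion combined with
 'isarray()' is one way to visit every element.  The following is a
 minimal sketch of that idea and is not the function from the section
 referenced above; 'count_leaves()' is a hypothetical helper that counts
 the scalar values stored anywhere in an array of arrays:

```awk
# count_leaves() is a hypothetical helper, not a gawk built-in.
# i and n are local variables, declared as extra parameters.
function count_leaves(arr,    i, n)
{
    n = 0
    for (i in arr) {
        if (isarray(arr[i]))
            n += count_leaves(arr[i])    # descend into the subarray
        else
            n++                          # arr[i] is a scalar value
    }
    return n
}

BEGIN {
    a[1][1] = 1
    a[1][2] = 2
    a[1][3][1, "name"] = "barney"
    a[4][5] = "An element in subarray a[4]"

    total = count_leaves(a)
    print total
}
```

 For the array 'a' as built here, this should print '4'.  Counting was
 chosen over printing so that the result does not depend on the order in
 which 'for (i in arr)' visits the indices.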
 
    Recall that a reference to an uninitialized array element yields a
 value of '""', the null string.  This has one important implication when
 you intend to use a subarray as an argument to a function, as
 illustrated by the following example:
 
      $ gawk 'BEGIN { split("a b c d", b[1]); print b[1][1] }'
      error-> gawk: cmd. line:1: fatal: split: second argument is not an array
 
    The way to work around this is to first force 'b[1]' to be an array
 by creating an arbitrary index:
 
      $ gawk 'BEGIN { b[1][1] = ""; split("a b c d", b[1]); print b[1][1] }'
      -| a
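
    If this pattern is needed in several places, it can be wrapped in a
 small function.  The following is a sketch only; 'split_into()' is a
 hypothetical helper, and it assumes that 'arr[key]' is either unused or
 already a subarray (it would fail if 'arr[key]' held a scalar):

```awk
# split_into() is a hypothetical helper, not a gawk built-in.
function split_into(str, arr, key)
{
    arr[key][1] = ""                 # force arr[key] to be an array
    return split(str, arr[key])      # split() clears arr[key] first
}

BEGIN {
    n = split_into("a b c d", b, 1)
    print n, b[1][1], b[1][4]
}
```

 This should print '4 a d'.  The placeholder element does not survive,
 because 'split()' deletes any existing elements of its target array
 before storing the pieces.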