gawk: Flattening Arrays

 
 16.4.11.3 Working With All The Elements of an Array
 ...................................................
 
 To "flatten" an array is to create a structure that represents the full
 array in a fashion that makes it easy for C code to traverse the entire
 array.  Some of the code in 'extension/testext.c' does this, and also
 serves as a nice example showing how to use the APIs.
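 
    As a rough illustration of what flattening means, and not of how
 'gawk' does it internally, the following sketch copies a toy linked
 list of index/value pairs into one contiguous block that carries its
 own element count.  All of the names here ('toy_node', 'toy_flat',
 'toy_flatten()') are hypothetical and are not part of the extension
 API:
 
```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy types; these are NOT gawk's data structures. */
struct toy_node {
    const char *index;
    const char *value;
    struct toy_node *next;
};

struct toy_element {
    const char *index;
    const char *value;
};

struct toy_flat {
    size_t count;
    struct toy_element elements[];   /* C99 flexible array member */
};

/*
 * Copy every node of the list into one contiguous block.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct toy_flat *
toy_flatten(const struct toy_node *head)
{
    const struct toy_node *p;
    struct toy_flat *flat;
    size_t count = 0, i = 0;

    /* first pass: count the nodes */
    for (p = head; p != NULL; p = p->next)
        count++;

    flat = malloc(sizeof(*flat) + count * sizeof(flat->elements[0]));
    if (flat == NULL)
        return NULL;

    /* second pass: copy the index/value pointers */
    flat->count = count;
    for (p = head; p != NULL; p = p->next) {
        flat->elements[i].index = p->index;
        flat->elements[i].value = p->value;
        i++;
    }
    assert(i == count);
    return flat;
}
```
 
    Once the data is in this form, every element can be visited with a
 simple indexed loop from zero up to 'count', which is the same shape of
 loop that the test extension uses on the real flattened array.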
 
    We walk through that part of the code one step at a time.  First, the
 'gawk' script that drives the test extension:
 
      @load "testext"
      BEGIN {
          n = split("blacky rusty sophie raincloud lucky", pets)
          printf("pets has %d elements\n", length(pets))
          ret = dump_array_and_delete("pets", "3")
          printf("dump_array_and_delete(pets) returned %d\n", ret)
          if ("3" in pets)
              printf("dump_array_and_delete() did NOT remove index \"3\"!\n")
          else
              printf("dump_array_and_delete() did remove index \"3\"!\n")
          print ""
      }
 
 This code creates an array with 'split()' (see String Functions) and
 then calls 'dump_array_and_delete()'.  That function looks up the array
 whose name is passed as the first argument, and deletes the element at
 the index passed in the second argument.  The 'awk' code then prints the
 return value and checks if the element was indeed deleted.  Here is the
 C code that implements 'dump_array_and_delete()'.  It has been edited
 slightly for presentation.
 
    The first part declares variables, sets up the default return value
 in 'result', and checks that the function was called with the correct
 number of arguments:
 
      static awk_value_t *
      dump_array_and_delete(int nargs, awk_value_t *result)
      {
          awk_value_t value, value2, value3;
          awk_flat_array_t *flat_array;
          size_t count;
          char *name;
          size_t i;
 
          assert(result != NULL);
          make_number(0.0, result);
 
          if (nargs != 2) {
              printf("dump_array_and_delete: nargs not right "
                     "(%d should be 2)\n", nargs);
              goto out;
          }
 
    The function then proceeds in steps, as follows.  First, retrieve the
 name of the array, passed as the first argument, followed by the array
 itself.  If either operation fails, print an error message and return:
 
          /* get argument named array as flat array and print it */
          if (get_argument(0, AWK_STRING, & value)) {
              name = value.str_value.str;
              if (sym_lookup(name, AWK_ARRAY, & value2))
                  printf("dump_array_and_delete: sym_lookup of %s passed\n",
                         name);
              else {
                  printf("dump_array_and_delete: sym_lookup of %s failed\n",
                         name);
                  goto out;
              }
          } else {
              printf("dump_array_and_delete: get_argument(0) failed\n");
              goto out;
          }
 
    For testing purposes and to make sure that the C code sees the same
 number of elements as the 'awk' code, the second step is to get the
 count of elements in the array and print it:
 
          if (! get_element_count(value2.array_cookie, & count)) {
              printf("dump_array_and_delete: get_element_count failed\n");
              goto out;
          }
 
          printf("dump_array_and_delete: incoming size is %lu\n",
                 (unsigned long) count);
 
    The third step is to actually flatten the array, and then to
 double-check that the count in the 'awk_flat_array_t' is the same as the
 count just retrieved:
 
          if (! flatten_array_typed(value2.array_cookie, & flat_array,
                                    AWK_STRING, AWK_UNDEFINED)) {
              printf("dump_array_and_delete: could not flatten array\n");
              goto out;
          }
 
          if (flat_array->count != count) {
              printf("dump_array_and_delete: flat_array->count (%lu)"
                     " != count (%lu)\n",
                      (unsigned long) flat_array->count,
                      (unsigned long) count);
              goto out;
          }
 
    The fourth step is to retrieve the index of the element to be
 deleted, which was passed as the second argument.  Remember that
 argument numbers passed to 'get_argument()' are zero-based, and thus the
 second argument is numbered one:
 
          if (! get_argument(1, AWK_STRING, & value3)) {
              printf("dump_array_and_delete: get_argument(1) failed\n");
              goto out;
          }
 
    The fifth step is where the "real work" is done.  The function loops
 over every element in the array, printing the index and element values.
 In addition, upon finding the element with the index that is supposed to
 be deleted, the function sets the 'AWK_ELEMENT_DELETE' bit in the
 'flags' field of the element.  When the array is released, 'gawk'
 traverses the flattened array, and deletes any elements that have this
 flag bit set:
 
          for (i = 0; i < flat_array->count; i++) {
              printf("\t%s[\"%.*s\"] = %s\n",
                  name,
                  (int) flat_array->elements[i].index.str_value.len,
                  flat_array->elements[i].index.str_value.str,
                  valrep2str(& flat_array->elements[i].value));
 
              if (strcmp(value3.str_value.str,
                         flat_array->elements[i].index.str_value.str) == 0) {
                  flat_array->elements[i].flags |= AWK_ELEMENT_DELETE;
                  printf("dump_array_and_delete: marking element \"%s\" "
                         "for deletion\n",
                      flat_array->elements[i].index.str_value.str);
              }
          }
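 
    The loop above compares indices with 'strcmp()', which stops at the
 first zero byte.  That is sufficient for the indices used in this test.
 If an index might contain embedded zero bytes, one alternative is to
 compare the lengths first and then the bytes.  The following is a
 minimal sketch under that assumption; 'toy_string' is a hypothetical
 stand-in for a pointer-plus-length string, and 'toy_string_equal()' is
 not an API function:
 
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length-counted string (pointer plus length). */
struct toy_string {
    const char *str;
    size_t len;
};

/* Return 1 if both strings have the same length and the same bytes, else 0. */
static int
toy_string_equal(const struct toy_string *a, const struct toy_string *b)
{
    assert(a != NULL && b != NULL);
    return a->len == b->len
        && memcmp(a->str, b->str, a->len) == 0;
}
```
 
    With this helper, two indices that share a prefix up to a zero byte
 but differ afterwards compare as different, whereas 'strcmp()' would
 report them as equal.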
 
    The sixth step is to release the flattened array.  This tells 'gawk'
 that the extension is no longer using the array, and that it should
 delete any elements marked for deletion.  'gawk' also frees any storage
 that was allocated, so you should not use the pointer ('flat_array' in
 this code) once you have called 'release_flattened_array()':
 
          if (! release_flattened_array(value2.array_cookie, flat_array)) {
              printf("dump_array_and_delete: could not release flattened array\n");
              goto out;
          }
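 
    The idea of marking elements during the traversal and removing them
 only at release time can be modeled outside of 'gawk' in a few lines.
 The following sketch is a toy model and not 'gawk''s implementation:
 it compacts a plain C array in place, whereas 'gawk' deletes the marked
 elements from the underlying 'awk' array.  'TOY_ELEMENT_DELETE',
 'toy_slot', and 'toy_release()' are hypothetical names modeled on the
 listing above:
 
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit, modeled on AWK_ELEMENT_DELETE. */
#define TOY_ELEMENT_DELETE 0x1

struct toy_slot {
    int flags;
    int index;
    const char *value;
};

/*
 * Compact the array in place, dropping every slot whose
 * TOY_ELEMENT_DELETE bit is set.  Returns the number of
 * slots that remain.
 */
static size_t
toy_release(struct toy_slot *slots, size_t count)
{
    size_t in, out = 0;

    assert(slots != NULL || count == 0);
    for (in = 0; in < count; in++) {
        if ((slots[in].flags & TOY_ELEMENT_DELETE) != 0)
            continue;               /* marked: do not keep it */
        slots[out++] = slots[in];   /* unmarked: keep, preserving order */
    }
    return out;
}
```
 
    In this model the traversal code only sets a bit, and all removal
 happens in one later pass, so the loop that walks the elements never
 sees the array change underneath it.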
 
    Finally, because everything was successful, the function sets the
 return value to success, and returns:
 
          make_number(1.0, result);
      out:
          return result;
      }
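 
    The overall shape of the function (set a failure value first, jump
 to a single exit label when any check fails, and overwrite the value
 with success only at the very end) can be reduced to the following
 sketch.  The function 'toy_all_nonnegative()' is a hypothetical example
 and has nothing to do with the extension API; it only reuses the
 control flow:
 
```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical validator: sets *result to 1 only if every value is
 * non-negative.  Any failed check jumps to the single exit, leaving
 * the default failure value in place.
 */
static int *
toy_all_nonnegative(const int *values, size_t n, int *result)
{
    size_t i;

    assert(result != NULL);
    *result = 0;                    /* default return value: failure */

    if (values == NULL)
        goto out;

    for (i = 0; i < n; i++) {
        if (values[i] < 0)
            goto out;
    }

    *result = 1;                    /* every check passed */
out:
    return result;
}
```
 
    Because the failure value is stored before any check runs, each
 error path needs only a 'goto out' and cannot forget to set the result.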
 
    Here is the output from running this part of the test:
 
      pets has 5 elements
      dump_array_and_delete: sym_lookup of pets passed
      dump_array_and_delete: incoming size is 5
              pets["1"] = "blacky"
              pets["2"] = "rusty"
              pets["3"] = "sophie"
      dump_array_and_delete: marking element "3" for deletion
              pets["4"] = "raincloud"
              pets["5"] = "lucky"
      dump_array_and_delete(pets) returned 1
      dump_array_and_delete() did remove index "3"!