gawk: Library Names

 
 10.1 Naming Library Function Global Variables
 =============================================
 
 Due to the way the 'awk' language evolved, variables are either "global"
 (usable by the entire program) or "local" (usable just by a specific
 function).  There is no intermediate state analogous to 'static'
 variables in C.
 
    Library functions often need to have global variables that they can
 use to preserve state information between calls to the function--for
 example, 'getopt()''s variable '_opti' (see Getopt Function).  Such
 variables are called "private", as the only functions that need to use
 them are the ones in the library.
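 
    As a minimal sketch of such a private variable, consider the
 following function.  Both 'next_id()' and '_nid_last' are hypothetical
 names invented for this illustration; they are not part of the 'gawk'
 distribution.  The variable must be global so that its value survives
 from one call to the next:

```awk
# next_id() --- return 1, 2, 3, ... on successive calls
# (hypothetical library function, used only for illustration)

function next_id()
{
    # _nid_last is deliberately global: it must survive between calls
    return ++_nid_last
}

BEGIN {
    print next_id()    # first call
    print next_id()    # second call; the state was kept in _nid_last
}
```

    Only 'next_id()' ever needs to touch '_nid_last', which is what makes
 the variable "private" in the sense used here.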
 
    When writing a library function, you should try to choose names for
 your private variables that will not conflict with any variables used by
 either another library function or a user's main program.  For example,
 a name like 'i' or 'j' is not a good choice, because user programs often
 use variable names like these for their own purposes.
 
    The example programs shown in this major node all start the names of
 their private variables with an underscore ('_').  Users generally don't
 use leading underscores in their variable names, so this convention
 immediately decreases the chances that the variable names will be
 accidentally shared with the user's program.
 
    In addition, several of the library functions use a prefix that helps
 indicate what function or set of functions use the variables--for
 example, the '_pw_byname' array in the user database routines (see
 Passwd Functions).  This convention is recommended, as it even further
 decreases the chance of inadvertent conflict among variable names.  Note
 that this convention works equally well for variable names and for
 private function names.(1)
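 
    The following sketch shows the underscore-plus-prefix convention
 applied to both variables and a private helper function.  The stack
 library and all of its names ('stack_push()', 'stack_pop()',
 '_stk_check()', '_stk_items', and '_stk_depth') are assumptions made up
 for this example:

```awk
# Hypothetical stack library; every private name starts with "_stk_"

function _stk_check()    # private helper, not meant for user programs
{
    if (_stk_depth < 1) {
        print "stack: pop from empty stack" > "/dev/stderr"
        exit 1
    }
}

function stack_push(val)
{
    _stk_items[++_stk_depth] = val
}

function stack_pop(    val)
{
    _stk_check()
    val = _stk_items[_stk_depth]
    delete _stk_items[_stk_depth]
    _stk_depth--
    return val
}

BEGIN {
    stack_push("a")
    stack_push("b")
    print stack_pop()    # most recently pushed value comes back first
    print stack_pop()
}
```

    A user program is unlikely to define anything named '_stk_depth' or
 '_stk_check()' by accident, and the shared prefix makes it clear which
 library owns each name.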
 
    As a final note on variable naming, if a function makes global
 variables available for use by a main program, it is a good convention
 to start those variables' names with a capital letter--for example,
 'getopt()''s 'Opterr' and 'Optind' variables (see Getopt Function).
 The leading capital letter indicates that the variable is global, while
 the fact that the name is not all capital letters indicates that the
 variable is not one of 'awk''s predefined variables, such as 'FS'.
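 
    A small sketch of this convention follows.  The logging function
 'log_msg()' and the variables 'Log_level' and '_log_count' are
 hypothetical names chosen for illustration.  'Log_level' is meant to be
 set by the main program, so it starts with a capital letter;
 '_log_count' is private to the library:

```awk
# Hypothetical logging library.
#   Log_level   public: the main program may set it
#   _log_count  private: number of messages printed so far

function log_msg(level, text)
{
    if (level > Log_level)
        return 0             # message suppressed
    _log_count++
    printf "[%d] %s\n", _log_count, text
    return 1                 # message printed
}

BEGIN {
    Log_level = 1            # set by the main program
    log_msg(1, "shown")
    log_msg(2, "suppressed, because 2 > Log_level")
}
```

    The spelling alone tells a reader of the main program that
 'Log_level' belongs to a library and is not a built-in such as 'NR'.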
 
    It is also important that _all_ variables in library functions that
 do not need to save state are, in fact, declared local.(2)  If this is
 not done, the variables could accidentally be used in the user's
 program, leading to bugs that are very difficult to track down:
 
      function lib_func(x, y,    l1, l2)
      {
          # ...
          # some_var should be local but by oversight is not
          some_var = x + y    # stands in for any use of some_var
          # ...
      }
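 
    To make the danger concrete, here is a sketch in which the
 overlooked variable is a loop counter.  The function 'sum_to()' is a
 hypothetical example; its variable 'i' is missing from the parameter
 list and is therefore global:

```awk
# Hypothetical library function with the oversight described above:
# "i" is not declared in the parameter list, so it is global.
function sum_to(n,    total)
{
    total = 0
    for (i = 1; i <= n; i++)
        total += i
    return total
}

BEGIN {
    for (i = 1; i <= 3; i++) {
        count++
        sum_to(10)      # silently leaves the caller's i at 11
    }
    print "loop body ran", count, "time(s)"
}
```

    The main program's loop runs once instead of three times, because
 the call to 'sum_to()' overwrites the caller's 'i'.  Declaring the
 function as 'sum_to(n,    total, i)' makes 'i' local and removes the
 interference.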
 
    A different convention, common in the Tcl community, is to use a
 single associative array to hold the values needed by the library
 function(s), or "package."  This significantly decreases the number of
 actual global names in use.  For example, the user database functions
 (see Passwd Functions) might have used array elements
 'PW_data["inited"]', 'PW_data["awklib"]', 'PW_data["total"]', and
 'PW_data["count"]', instead of '_pw_inited', '_pw_awklib', '_pw_total',
 and '_pw_count'.
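 
    The single-array approach might look like the following sketch.  The
 package shown here, with its functions 'cnt_init()', 'cnt_add()', and
 'cnt_mean()' and its array 'Cnt_data', is entirely hypothetical; it is
 not how the user database routines are written:

```awk
# Sketch only: a hypothetical package that keeps all of its state in
# one global array, Cnt_data, instead of in several global variables.

function cnt_init()
{
    if (Cnt_data["inited"])
        return
    Cnt_data["total"] = 0
    Cnt_data["count"] = 0
    Cnt_data["inited"] = 1
}

function cnt_add(n)
{
    cnt_init()
    Cnt_data["total"] += n
    Cnt_data["count"]++
}

function cnt_mean()
{
    cnt_init()
    if (Cnt_data["count"] == 0)
        return 0
    return Cnt_data["total"] / Cnt_data["count"]
}

BEGIN {
    cnt_add(4)
    cnt_add(8)
    print cnt_mean()    # mean of 4 and 8
}
```

    Only one variable name, 'Cnt_data', is added to the global namespace,
 no matter how many pieces of state the package needs.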
 
    The conventions presented in this minor node are exactly that:
 conventions.  You are not required to write your programs this way--we
 merely recommend that you do so.
 
    ---------- Footnotes ----------
 
    (1) Although all the library routines could have been rewritten to
 use this convention, this was not done, in order to show how our own
 'awk' programming style has evolved and to provide some basis for this
 discussion.
 
    (2) 'gawk''s '--dump-variables' command-line option is useful for
 verifying this.