gawk: Extension New Mechanism Goals

 
 C.5.2 Goals For A New Mechanism
 -------------------------------
 
 Some goals for the new API were:
 
    * The API should be independent of 'gawk' internals.  Changes in
      'gawk' internals should not be visible to the writer of an
      extension function.
 
    * The API should provide _binary_ compatibility across 'gawk'
      releases as long as the API itself does not change.
 
    * The API should enable extensions written in C or C++ to have
      roughly the same "appearance" to 'awk'-level code as 'awk'
      functions do.  This means that extensions should have:
 
         - The ability to access function parameters.
 
         - The ability to turn an undefined parameter into an array (call
           by reference).
 
         - The ability to create, access and update global variables.
 
         - Easy access to all the elements of an array at once ("array
           flattening") in order to loop over all the elements in an easy
           fashion for C code.
 
         - The ability to create arrays (including 'gawk''s true arrays
           of arrays).
 
    Some additional important goals were:
 
    * The API should use only features in ISO C 90, so that extensions
      can be written using the widest range of C and C++ compilers.  The
      header should include the appropriate '#ifdef __cplusplus' and
      'extern "C"' magic so that a C++ compiler could be used.  (If using
      C++, the runtime system has to be smart enough to call any
      constructors and destructors, as 'gawk' is a C program.  As of this
      writing, this has not been tested.)
 
    * The API mechanism should not require access to 'gawk''s symbols(1)
      by the compile-time or dynamic linker, in order to enable creation
      of extensions that also work on MS-Windows.
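
    The header arrangement described in the first goal above can be
 sketched as follows.  This is only an illustration of the
 '#ifdef __cplusplus' and 'extern "C"' idiom restricted to ISO C 90
 constructs; the names 'my_api_value_t' and 'my_api_is_number' are
 hypothetical and are not part of 'gawk''s actual header.

```c
/* Sketch of a C90-only header that a C++ compiler can also consume.
 * All identifiers here are hypothetical, chosen for illustration. */
#ifndef MY_API_H
#define MY_API_H

#ifdef __cplusplus
extern "C" {            /* give the declarations C linkage under C++ */
#endif

/* Only ISO C 90 features: no // comments, no inline, no bool. */
typedef struct my_api_value {
    int is_number;          /* nonzero if num_value is meaningful */
    double num_value;
    const char *str_value;
} my_api_value_t;

int my_api_is_number(const my_api_value_t *v);

#ifdef __cplusplus
}
#endif

#endif /* MY_API_H */

/* A definition matching the declaration above. */
int my_api_is_number(const my_api_value_t *v)
{
    return v != 0 && v->is_number != 0;
}
```

    Because the declarations sit inside the 'extern "C"' block, a C++
 translation unit that includes the header refers to the same unmangled
 symbol names that a C translation unit does.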
 
    During development, it became clear that there were other features
 that should be available to extensions, which were also subsequently
 provided:
 
    * Extensions should have the ability to hook into 'gawk''s I/O
      redirection mechanism.  In particular, the 'xgawk' developers
      provided a so-called "open hook" to take over reading records.
      During development, this was generalized to allow extensions to
      hook into input processing, output processing, and two-way I/O.
 
    * An extension should be able to provide a "call back" function to
      perform cleanup actions when 'gawk' exits.
 
    * An extension should be able to provide a version string so that
      'gawk''s '--version' option can provide information about
      extensions as well.
 
    The requirement to avoid access to 'gawk''s symbols is, at first
 glance, a difficult one to meet.
 
    One design, apparently used by Perl and Ruby and maybe others, would
 be to make the mainline 'gawk' code into a library, with the 'gawk'
 utility a small C 'main()' function linked against the library.
 
    This seemed like the tail wagging the dog, complicating build and
 installation and making a simple copy of the 'gawk' executable from one
 system to another (or one place to another on the same system!)  into a
 chancy operation.
 
    Pat Rankin suggested the solution that was adopted.  See Extension
 Mechanism Outline, for the details.
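
    One general way to avoid linker access to a host program's symbols
 is for the host to hand the extension a table of function pointers
 when the extension is loaded.  The following is a minimal sketch of
 that technique under stated assumptions: it is not taken from 'gawk',
 and every name in it ('host_api_t', 'host_table', 'ext_load', and so
 on) is hypothetical.

```c
#include <string.h>

/* Hypothetical table of services the host offers to extensions. */
typedef struct host_api {
    int major_version;
    int (*set_global)(const char *name, double value);
    double (*get_global)(const char *name);
} host_api_t;

/* ---- Host side: the real functions stay private to the host. ---- */
static char g_name[32];     /* a one-variable "symbol table" */
static double g_value;

static int host_set_global(const char *name, double value)
{
    if (strlen(name) >= sizeof(g_name))
        return 0;           /* name too long for this toy table */
    strcpy(g_name, name);
    g_value = value;
    return 1;
}

static double host_get_global(const char *name)
{
    return strcmp(name, g_name) == 0 ? g_value : 0.0;
}

static const host_api_t host_table = {
    1, host_set_global, host_get_global
};

/* ---- Extension side: it sees only the table it is given. ---- */
static const host_api_t *api;

int ext_load(const host_api_t *host, int expected_major)
{
    if (host == NULL || host->major_version != expected_major)
        return 0;           /* refuse an incompatible host */
    api = host;
    return api->set_global("ext_loaded", 1.0);
}
```

    In this arrangement the extension never names a host function
 directly, so neither the compile-time nor the dynamic linker has to
 resolve host symbols on its behalf.  The version check illustrates how
 binary compatibility could be guarded as long as the table layout does
 not change.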
 
    ---------- Footnotes ----------
 
    (1) The "symbols" are the variables and functions defined inside
 'gawk'.  Access to these symbols by code external to 'gawk' loaded
 dynamically at runtime is problematic on MS-Windows.