wisent: Grammar format

 
 2.1 Grammar format
 ==================
 
 To be accepted by Wisent, a context-free grammar must follow a
 particular format: it must be represented as an Emacs Lisp list of the
 form:
 
    ‘(TERMINALS ASSOCS . NON-TERMINALS)’
 
 TERMINALS
      Is the list of terminal symbols used in the grammar.
 
 ASSOCS
      Specify the associativity of TERMINALS.  It is ‘nil’ when there is
      no associativity defined, or an alist of
      ‘(ASSOC-TYPE . ASSOC-VALUE)’ elements.
 
      ASSOC-TYPE must be one of the ‘default-prec’, ‘nonassoc’, ‘left’ or
      ‘right’ symbols.  When ASSOC-TYPE is ‘default-prec’, ASSOC-VALUE
      must be ‘nil’ or ‘t’ (the default).  Otherwise it is a list of
      tokens which must have been previously declared in TERMINALS.
 
      For details, see (bison)Contextual Precedence in the Bison
      manual.
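
      As a rough sketch, assuming a small calculator language with a
      hypothetical ‘NUM’ token and four operator characters, the first
      two elements of the grammar list could be written as follows (the
      ‘my-calc-’ names are made up for this example):

      ```elisp
      ;; Hypothetical TERMINALS for a small calculator grammar.
      ;; NUM is an assumed token name; the operators are character tokens.
      (defconst my-calc-terminals
        '(NUM ?+ ?- ?* ?/))

      ;; Hypothetical ASSOCS: every token listed here is declared above.
      (defconst my-calc-assocs
        '((left ?+ ?-)     ; first association
          (left ?* ?/)))   ; second association
      ```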
 
 NON-TERMINALS
      Is the list of nonterminal definitions.  Each definition has the
      form:
 
      ‘(NONTERM . RULES)’
 
      Where NONTERM is the nonterminal symbol defined and RULES the list
      of rules that describe this nonterminal.  Each rule is a list:
 
      ‘(COMPONENTS [PRECEDENCE] [ACTION])’
 
      Where:
 
      COMPONENTS
           Is a list of various terminals and nonterminals that are put
           together by this rule.
 
           For example,
 
                (exp ((exp ?+ exp))          ;; exp: exp '+' exp
                     )                       ;;    ;
 
           Says that two groupings of type ‘exp’, with a ‘+’ token in
           between, can be combined into a larger grouping of type ‘exp’.
 
           By convention, a nonterminal symbol should be in lower case,
           such as ‘exp’, ‘stmt’ or ‘declaration’.  Terminal symbols
           should be upper case to distinguish them from nonterminals:
           for example, ‘INTEGER’, ‘IDENTIFIER’, ‘IF’ or ‘RETURN’.  A
           terminal symbol that represents a particular keyword in the
           language is conventionally the same as that keyword converted
           to upper case.  The terminal symbol ‘error’ is reserved for
           error recovery.
 
           Scattered among the components can be “middle-rule” actions.
           Usually only ACTION is provided (see ACTION below).
 
           If COMPONENTS in a rule is ‘nil’, it means that the rule can
           match the empty string.  For example, here is how to define a
           comma-separated sequence of zero or more ‘exp’ groupings:
 
                (expseq  (nil)               ;; expseq: ;; empty
                         ((expseq1))         ;;       | expseq1
                         )                   ;;       ;
 
                (expseq1 ((exp))             ;; expseq1: exp
                         ((expseq1 ?, exp))  ;;        | expseq1 ',' exp
                         )                   ;;        ;
 
      PRECEDENCE
           Assigns the rule the precedence of the given terminal item,
           overriding the precedence that would otherwise be deduced for
           it, namely that of the last terminal in it.  Notice that only
           terminals declared in ASSOCS have a precedence level.  The
           altered rule precedence then affects how conflicts involving
           that rule are resolved.
 
           PRECEDENCE is an optional vector of one terminal item.
 
           Here is how PRECEDENCE solves the problem of unary minus.
           First, declare a precedence for a fictitious terminal symbol
           named ‘UMINUS’.  There are no tokens of this type, but the
           symbol serves to stand for its precedence:
 
                ...
                ((default-prec t) ;; This is the default
                 (left ?+ ?-)
                 (left ?*)
                 (left UMINUS))
 
           Now the precedence of ‘UMINUS’ can be used in specific rules:
 
                (exp    ...                  ;; exp:    ...
                         ((exp ?- exp))      ;;         | exp '-' exp
                        ...                  ;;         ...
                         ((?- exp) [UMINUS]) ;;         | '-' exp %prec UMINUS
                        ...                  ;;         ...
                        )                    ;;         ;
 
           If you forget to append ‘[UMINUS]’ to the rule for unary
           minus, Wisent silently assumes that minus has its usual
           precedence.  This kind of problem can be tricky to debug,
           since one typically discovers the mistake only by testing the
           code.
 
           Using a ‘(default-prec nil)’ declaration makes it easier to
           discover this kind of problem systematically.  It causes rules
           that lack a PRECEDENCE modifier to have no precedence, even if
           the last terminal symbol mentioned in their components has a
           declared precedence.
 
           If ‘(default-prec nil)’ is in effect, you must specify
           PRECEDENCE for all rules that participate in precedence
           conflict resolution.  Then you will see any shift/reduce
           conflict until you tell Wisent how to resolve it, either by
           changing your grammar or by adding an explicit precedence.
           This will probably add declarations to the grammar, but it
           helps to protect against incorrect rule precedences.
 
           The effect of ‘(default-prec nil)’ can be reversed by giving
           ‘(default-prec t)’, which is the default.
 
           For more details, see (bison)Contextual Precedence in the
           Bison manual.
 
           It is important to understand that ASSOCS declarations not
           only define associativity but also assign a precedence level
           to terminals.  All terminals declared in the same ‘left’,
           ‘right’ or ‘nonassoc’ association get the same precedence
           level.  The precedence level is increased at each new
           association.
 
           On the other hand, PRECEDENCE explicitly assigns the precedence
           level of the given terminal to a rule.
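
           The way levels follow from the order of associations can be
           pictured with a few lines of Emacs Lisp.  This is only an
           illustration of the rule, not Wisent's implementation; the
           function name ‘my-assoc-levels’ and the numbering from 1 are
           assumptions made for the example:

           ```elisp
           (defun my-assoc-levels (assocs)
             "Return an alist of (TOKEN . LEVEL) derived from ASSOCS.
           Illustration only; this is not part of Wisent."
             (let ((level 0)
                   result)
               (dolist (def assocs (nreverse result))
                 ;; `default-prec' is a setting, not an association.
                 (unless (eq (car def) 'default-prec)
                   ;; Each new association raises the level by one.
                   (setq level (1+ level))
                   ;; All tokens of one association share that level.
                   (dolist (tok (cdr def))
                     (push (cons tok level) result))))))

           ;; With the UMINUS declarations, ?+ and ?- share the lowest
           ;; level, ?* comes next, and UMINUS gets the highest level.
           (my-assoc-levels '((default-prec t)
                              (left ?+ ?-)
                              (left ?*)
                              (left UMINUS)))
           ```

           Only the relative order of the levels matters here; the
           absolute numbers are arbitrary.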
 
      ACTION
           An action is an optional Emacs Lisp function call, like this:
 
           ‘(identity $1)’
 
           The result of an action determines the semantic value of a
           rule.
 
           From an implementation standpoint, the function call will be
           embedded in a lambda expression, and several useful local
           variables will be defined:
 
           ‘$N’
                Where N is a positive integer.  As in Bison, the value
                of ‘$N’ is the semantic value of the Nth element of
                COMPONENTS, starting from 1.  It can be of any Lisp data
                type.
 
           ‘$regionN’
                Where N is a positive integer.  For each ‘$N’ variable
                defined there is a corresponding ‘$regionN’ variable.
                Its value is a pair ‘(START-POS . END-POS)’ that
                represents the start and end positions (in the lexical
                input stream) of the ‘$N’ value.  It can be ‘nil’ when
                the component positions are not available, for example
                for a component that matched the empty string.
 
           ‘$region’
                Its value is the leftmost and rightmost positions of
                input data matched by all COMPONENTS in the rule.  This
                is a pair ‘(LEFTMOST-POS . RIGHTMOST-POS)’.  It can be
                ‘nil’ when component positions are not available.
 
           ‘$nterm’
                This variable is initialized with the nonterminal symbol
                (NONTERM) the rule belongs to.  It could be useful to
                improve error reporting or debugging.  It is also used to
                automatically provide incremental re-parse entry points
                for Semantic tags (see Wisent Semantic).
 
           ‘$action’
                The value of ‘$action’ is the symbolic name of the
                current semantic action (see Debugging actions).
 
           When an action is not specified, a default one is supplied:
           ‘(identity $1)’.  This means that the default semantic value
           of a rule is the value of its first component.  The exception
           is a rule matching the empty string, for which the default
           action is to return ‘nil’.
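
 To show how the pieces described in this section fit together, here is
 a sketch of a complete grammar list.  It has not been run through the
 Wisent compiler; the ‘NUM’ token and the name ‘my-calc-grammar’ are
 assumptions made for the example:

 ```elisp
 ;; A grammar list has the shape (TERMINALS ASSOCS . NON-TERMINALS).
 (defconst my-calc-grammar
   '(;; TERMINALS
     (NUM ?+ ?* ?,)
     ;; ASSOCS
     ((left ?+)
      (left ?*))
     ;; NON-TERMINALS: each remaining element is (NONTERM . RULES).
     (expseq
      (nil)                          ; empty rule, default action gives nil
      ((expseq1)))                   ; no ACTION, defaults to (identity $1)
     (expseq1
      ((exp) (list $1))
      ((expseq1 ?, exp) (append $1 (list $3))))
     (exp
      ((NUM))                        ; default action, the value of NUM
      ((exp ?+ exp) (+ $1 $3))
      ((exp ?* exp) (* $1 $3)))))

 ;; Taking the list apart again:
 (car my-calc-grammar)                  ; TERMINALS
 (cadr my-calc-grammar)                 ; ASSOCS
 (mapcar #'car (cddr my-calc-grammar))  ; the NONTERM symbols
 ```

 Because NON-TERMINALS is the tail of the list rather than a nested
 element, each nonterminal definition appears directly after ASSOCS.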