3.4 Error recovery
==================
The error recovery mechanism of the Wisent parser conforms to the one
Bison uses. See (bison)Error Recovery in the Bison manual for
details.
To recover from a syntax error you must write rules to recognize the
special token ‘error’. This is a terminal symbol that is automatically
defined and reserved for error handling.
When the parser encounters a syntax error, it pops the state stack
until it finds a state that allows shifting the ‘error’ token. After
‘error’ has been shifted, if the old lookahead token cannot be shifted
next, the parser reads and discards tokens until it finds one that is
acceptable.
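The procedure above can be modeled with a small, self-contained Emacs
Lisp sketch. It is a toy model, not Wisent's actual implementation: the
function ‘my-toy-recover’, its predicate arguments, and the state and
token names are all hypothetical, chosen only to show the order of the
steps.

```elisp
;; Toy model of the recovery steps; NOT Wisent's real implementation.
;; All names here (my-toy-recover, error-state, s0..s3) are hypothetical.
(defun my-toy-recover (stack can-shift-error-p acceptable-p tokens)
  "Model error recovery; return (NEW-STACK . REMAINING-TOKENS) or nil.
STACK is a list of states, top of stack first.
CAN-SHIFT-ERROR-P tells whether a state allows shifting `error'.
ACCEPTABLE-P tells whether a token can be shifted after `error'.
TOKENS is the pending input, starting with the old lookahead token."
  ;; Step 1: pop states until one allows shifting the `error' token.
  (while (and stack (not (funcall can-shift-error-p (car stack))))
    (setq stack (cdr stack)))
  (when stack
    ;; Step 2: shift `error', modeled here by pushing a marker state.
    (push 'error-state stack)
    ;; Step 3: discard tokens until an acceptable one is found.
    (while (and tokens (not (funcall acceptable-p (car tokens))))
      (setq tokens (cdr tokens)))
    (cons stack tokens)))

;; Example: only state s1 can shift `error', and only ";" is acceptable.
(my-toy-recover '(s3 s2 s1 s0)
                (lambda (state) (eq state 's1))
                (lambda (token) (equal token ";"))
                '("+" "1" ";" "x"))
;; => ((error-state s1 s0) ";" "x")
```

If no state on the stack can shift ‘error’, the sketch returns ‘nil’,
which models a recovery failure.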
Strategies for error recovery depend on the choice of error rules in
the grammar. A simple and useful strategy is to skip the rest of
the current statement when an error is detected:
     (statement (( error ?; )) ;; on error, skip until ';' is read
                )
It is also useful to recover to the matching close-delimiter of an
opening-delimiter that has already been parsed:
     (primary (( ?{ expr  ?} ))
              (( ?{ error ?} ))
              ...
              )
Note that error recovery rules may have actions, just as any other
rules can. Here are some predefined hooks, variables, functions or
macros, useful in such actions:
-- Variable: wisent-nerrs
The number of parse errors encountered so far.
-- Variable: wisent-recovering
Non-‘nil’ means that the parser is recovering. This variable only
has meaning in the scope of ‘wisent-parse’.
-- Function: wisent-error msg
Call the user-supplied error reporting function with message MSG
(see Report errors).
For an example of use, see ‘wisent-skip-token’.
-- Function: wisent-errok
Resume generating error messages immediately for subsequent syntax
errors.
The parser suppresses error messages for syntax errors that happen
shortly after the first, until three consecutive input tokens have
been successfully shifted.
Calling ‘wisent-errok’ in an action makes error messages resume
immediately. No error messages will be suppressed if you call it
in an error rule’s action.
For an example of use, see ‘wisent-skip-token’.
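As an illustration, an error rule might call ‘wisent-errok’ once it
has resynchronized on a statement terminator. The fragment below is
only a sketch, not taken from an actual grammar: the nonterminals
‘statement’ and ‘expression’ are hypothetical, and ‘?\;’ is the
escaped spelling of the ‘;’ character token.

```elisp
;; Hypothetical grammar fragment; `statement' and `expression' are
;; illustrative names, not part of Wisent.
(statement ((expression ?\;))
           ;; On error, skip until ';' is read, then report any
           ;; subsequent syntax error immediately.
           ((error ?\;)
            (wisent-errok)))
```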
-- Function: wisent-clearin
Discard the current lookahead token. This will cause a new lexical
token to be read.
In an error rule’s action the previous lookahead token is
reanalyzed immediately. ‘wisent-clearin’ may be called to clear
this token.
For example, suppose that on a parse error, an error handling
routine is called that advances the input stream to some point
where parsing should once again commence. The next symbol returned
by the lexical scanner is probably correct. The previous lookahead
token ought to be discarded with ‘wisent-clearin’.
For an example of use, see ‘wisent-skip-token’.
-- Function: wisent-abort
Abort parsing and save the lookahead token.
-- Function: wisent-set-region start end
Change the region of text matched by the current nonterminal.
START and END are respectively the beginning and end positions of
the region occupied by the group of components associated with this
nonterminal. If START or END is not a valid position, the
region is set to ‘nil’.
For an example of use, see ‘wisent-skip-token’.
-- Variable: wisent-discarding-token-functions
List of functions to be called when discarding a lexical token.
These functions receive the discarded lexical token. When the
parser encounters unexpected tokens, it can discard them, as
directed by the error recovery rules: either when the parser
reads tokens until one is found that can be shifted, or when a
semantic action calls the function ‘wisent-skip-token’ or
‘wisent-skip-block’. For language-specific hooks, make sure you
define this as a local hook.
For example, in Semantic, this hook is set to the function
‘wisent-collect-unmatched-syntax’ to collect unmatched lexical
tokens (see Useful functions).
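A language mode could install its own function on this hook in a
similar way. The following is a minimal sketch, assuming the Wisent
library bundled with Emacs is loaded through the ‘semantic/wisent’
feature; the names ‘my-discarded-tokens’, ‘my-record-discarded-token’
and ‘my-language-setup’ are hypothetical.

```elisp
;; Assumption: Wisent is available as the `semantic/wisent' feature.
(require 'semantic/wisent)

(defvar-local my-discarded-tokens nil
  "Hypothetical buffer-local list of tokens discarded by the parser.")

(defun my-record-discarded-token (token)
  "Record TOKEN, a lexical token discarded during error recovery."
  (push token my-discarded-tokens))

(defun my-language-setup ()
  "Hypothetical mode setup installing the hook in the current buffer."
  ;; The final t makes the addition buffer-local, so other languages
  ;; using Wisent are not affected.
  (add-hook 'wisent-discarding-token-functions
            #'my-record-discarded-token nil t))
```

Passing a non-‘nil’ LOCAL argument to ‘add-hook’ keeps the function out
of the global value of the hook.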
-- Function: wisent-skip-token
Skip the lookahead token in order to resume parsing. Return ‘nil’.
Must be used in error recovery semantic actions.
It typically looks like this:
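     ;; Report which token is being skipped.
     (wisent-message "%s: skip %s" $action
                     (wisent-token-to-string wisent-input))
     ;; Let the discard hooks see the token.
     (run-hook-with-args
      'wisent-discarding-token-functions wisent-input)
     ;; Drop the lookahead token and resume error messages.
     (wisent-clearin)
     (wisent-errok)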
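In a grammar, ‘wisent-skip-token’ can serve as the whole action of an
error rule. A minimal sketch, assuming hypothetical nonterminals
‘declaration’, ‘type-spec’ and ‘identifier’ (‘?\;’ is the escaped
spelling of the ‘;’ character token):

```elisp
;; Hypothetical grammar fragment; only `error' and
;; `wisent-skip-token' come from Wisent itself.
(declaration ((type-spec identifier ?\;))
             ;; On error, drop the offending lookahead token and
             ;; let the parser try again with the next one.
             ((error)
              (wisent-skip-token)))
```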
-- Function: wisent-skip-block
Safely skip a block in order to resume parsing. Return ‘nil’.
Must be used in error recovery semantic actions.
A block is data between an open-delimiter (syntax class ‘(’) and a
matching close-delimiter (syntax class ‘)’):
     (a parenthesized block)
     [a block between brackets]
     {a block between braces}
The following example uses ‘wisent-skip-block’ to safely skip a
block delimited by ‘LBRACE’ (‘{’) and ‘RBRACE’ (‘}’) tokens, when a
syntax error occurs in ‘other-components’:
     (block ((LBRACE other-components RBRACE))
            ((LBRACE RBRACE))
            ((LBRACE error)
             (wisent-skip-block))
            )