ccmode: Syntactic Analysis
10.1 Syntactic Analysis
=======================
The first thing CC Mode does when indenting a line of code is to
analyze the line by calling ‘c-guess-basic-syntax’, which determines the
syntactic context of the (first) construct on that line. Although this
function is mainly used internally, it can sometimes be useful in
line-up functions (see Custom Line-Up) or in functions on
‘c-special-indent-hook’ (see Other Indentation).
-- Function: c-guess-basic-syntax
Determine the syntactic context of the current line.
The “syntactic context” is a list of “syntactic elements”, where each
syntactic element in turn is a list(1). Here is a brief and typical
example:
((defun-block-intro 1959))
The first thing inside each syntactic element is always a “syntactic
symbol”. It describes the kind of construct that was recognized, e.g.,
‘statement’, ‘substatement’, ‘class-open’, ‘class-close’, etc.
See Syntactic Symbols for a complete list of currently recognized
syntactic symbols and their semantics. The remaining entries are
various data associated with the recognized construct; there might be
zero or more.
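Since a syntactic context is an ordinary Lisp list, it can be taken
apart with the usual list functions. The following is a minimal sketch,
not part of CC Mode; the helper names ‘my-syntactic-symbols’ and
‘my-syntactic-data’ are hypothetical and chosen only for illustration.

```elisp
;; Hypothetical helpers (not part of CC Mode) that pick apart a
;; syntactic context using only ordinary list functions.

(defun my-syntactic-symbols (context)
  "Return the syntactic symbol of each syntactic element in CONTEXT."
  (mapcar #'car context))

(defun my-syntactic-data (element)
  "Return the entries that follow the syntactic symbol in ELEMENT.
The result may be empty, since there might be zero entries."
  (cdr element))

(my-syntactic-symbols '((defun-block-intro 1959)))  ; => (defun-block-intro)
(my-syntactic-data '(defun-block-intro 1959))       ; => (1959)
```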
Conceptually, a line of code is always indented relative to some
position higher up in the buffer (typically the indentation of the
previous line). That position is the “anchor position” in the syntactic
element. If there is an entry after the syntactic symbol in the
syntactic element list then it’s either ‘nil’ or that anchor position.
Here is an example. Suppose we had the following code as the only
thing in a C++ buffer (2):
1: void swap( int& a, int& b )
2: {
3:     int tmp = a;
4:     a = b;
5:     b = tmp;
6: }
We can use ‘C-c C-s’ (‘c-show-syntactic-information’) to report what the
syntactic analysis is for the current line:
‘C-c C-s’ (‘c-show-syntactic-information’)
This command calculates the syntactic analysis of the current line
and displays it in the echo area. The command also highlights the
anchor position(s).
Running this command on line 4 of this example, we’d see in the echo
area(3):
((statement 35))
and the ‘i’ of ‘int’ on line 3 would be highlighted. This tells us that
the line is a statement and it is indented relative to buffer position
35, the highlighted position. If you were to move point to line 3 and
hit ‘C-c C-s’, you would see:
((defun-block-intro 29))
This indicates that the ‘int’ line is the first statement in a top level
function block, and is indented relative to buffer position 29, which is
the brace just after the function header.
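The buffer positions quoted above can be checked by counting characters
from 1 at the start of the buffer. The sketch below does this in a
temporary buffer. It assumes the body lines of the example are indented
by four spaces; the names ‘my-position-of’ and ‘my-swap-source’ are
hypothetical and not part of CC Mode.

```elisp
;; Hypothetical helper (not part of CC Mode): find where a piece of
;; text starts once the source has been inserted into an empty buffer.
(defun my-position-of (text needle)
  "Insert TEXT into a temporary buffer and return where NEEDLE starts.
Return nil if NEEDLE does not occur in TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (search-forward needle nil t)
      (match-beginning 0))))

;; The swap example, assuming four-space indentation of the body.
(defvar my-swap-source
  (concat "void swap( int& a, int& b )\n"
          "{\n"
          "    int tmp = a;\n"
          "    a = b;\n"
          "    b = tmp;\n"
          "}\n"))

(my-position-of my-swap-source "{")        ; => 29
(my-position-of my-swap-source "int tmp")  ; => 35
```

The header line occupies positions 1 to 27 and its newline is position
28, so the opening brace falls on 29. The brace's newline and the four
spaces of indentation take up positions 30 to 34, which puts the ‘i’ of
‘int’ on 35.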
Here’s another example:
1: int add( int val, int incr, int doit )
2: {
3:     if( doit )
4:         {
5:             return( val + incr );
6:         }
7:     return( val );
8: }
Hitting ‘C-c C-s’ on line 4 gives us:
((substatement-open 46))
which tells us that this is a brace that _opens_ a substatement
block.(4)
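The same analysis can be requested from Lisp by calling
‘c-guess-basic-syntax’ directly. The following is a sketch under stated
assumptions, not a recommended interface: the helper ‘my-syntax-at-line’
and the variable ‘my-add-source’ are hypothetical, the source text
assumes the indentation of a typical rendering of the example, and
calling the function in a temporary buffer outside of normal indentation
is assumed to behave as it does for ‘C-c C-s’.

```elisp
(require 'cc-mode)

;; Hypothetical helper (not part of CC Mode): run the syntactic
;; analysis on one line of a source string.
(defun my-syntax-at-line (source line)
  "Insert SOURCE into a temporary C++ buffer and analyze line LINE.
LINE is counted from 1.  Return the syntactic context of that line."
  (with-temp-buffer
    (insert source)
    (c++-mode)
    (goto-char (point-min))
    (forward-line (1- line))
    (c-guess-basic-syntax)))

;; The add example, with assumed indentation.
(defvar my-add-source
  (concat "int add( int val, int incr, int doit )\n"
          "{\n"
          "    if( doit )\n"
          "        {\n"
          "            return( val + incr );\n"
          "        }\n"
          "    return( val );\n"
          "}\n"))

;; Analyze line 4, the brace that opens the substatement block.
(my-syntax-at-line my-add-source 4)
```

The result is a syntactic context in the list form described above, so
its first element's first entry is the syntactic symbol for line 4.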
Syntactic contexts can contain more than one element, and syntactic
elements need not have anchor positions. The most common example of
this is a “comment-only line”:
1: void draw_list( List<Drawables>& drawables )
2: {
3:     // call the virtual draw() method on each element in list
4:     for( int i=0; i < drawables.count(); ++i )
5:     {
6:         drawables[i].draw();
7:     }
8: }
Hitting ‘C-c C-s’ on line 3 of this example gives:
((comment-intro) (defun-block-intro 46))
and you can see that the syntactic context contains two syntactic
elements. Notice that the first element, ‘(comment-intro)’, has no
anchor position.
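Code that inspects a syntactic context therefore has to cope with
several elements and with elements that lack an anchor position. The
following is a minimal sketch using ordinary list functions; the helper
names ‘my-comment-only-line-p’ and ‘my-anchor-positions’ are
hypothetical and not part of CC Mode.

```elisp
;; Hypothetical helpers (not part of CC Mode) for contexts with
;; several elements, some of which have no anchor position.

(defun my-comment-only-line-p (context)
  "Return t if CONTEXT contains a `comment-intro' element."
  (and (assq 'comment-intro context) t))

(defun my-anchor-positions (context)
  "Return the anchor positions found in CONTEXT.
Elements with no entry after the symbol, or with nil there, are skipped."
  (delq nil (mapcar #'cadr context)))

(my-comment-only-line-p '((comment-intro) (defun-block-intro 46)))  ; => t
(my-anchor-positions '((comment-intro) (defun-block-intro 46)))     ; => (46)
```

‘assq’ works here because every syntactic element is a list whose first
entry is the syntactic symbol, so the context can be searched like an
association list.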
---------- Footnotes ----------
(1) In CC Mode 5.28 and earlier, a syntactic element was a dotted
pair; the car was the syntactic symbol and the cdr was the anchor
position. For compatibility’s sake, the parameter passed to a line-up
function still has this dotted pair form (see Custom Line-Up).
(2) The line numbers in this and future examples don’t actually
appear in the buffer, of course!
(3) With a universal argument (i.e., ‘C-u C-c C-s’) the analysis is
inserted into the buffer as a comment on the current line.
(4) A “substatement” is the line after a conditional statement, such
as ‘if’, ‘else’, ‘while’, ‘do’, ‘switch’, etc. A “substatement block”
is a brace block following one of these conditional statements.