Appendix B Performance Issues
*****************************
C and its derivative languages are highly complex creatures. Often,
ambiguous code situations arise that require CC Mode to scan large
portions of the buffer to determine syntactic context. Such
pathological code can cause CC Mode to perform fairly badly. This
section gives some insight into how CC Mode operates, how that interacts
with some coding styles, and what you can use to improve performance.
The overall goal is that CC Mode shouldn’t be overly slow (i.e., take
more than a fraction of a second) in any interactive operation. That is,
it’s tuned to limit the maximum response time in single operations,
which is sometimes at the expense of batch-like operations like
reindenting whole blocks. If you find that CC Mode gradually gets
slower and slower in certain situations, perhaps as the file grows in
size or as the macro or comment you’re editing gets bigger, then chances
are that something isn’t working right. You should consider reporting
it, unless it’s something that’s mentioned in this section.
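Before reporting a slowdown, it can help to put a number on it. The following is a minimal sketch, not part of CC Mode: the command name ‘my-c-time-indent-line’ is hypothetical, and it assumes the standard ‘benchmark’ library is available.

```elisp
(require 'benchmark)

;; Hypothetical helper, not part of CC Mode.
(defun my-c-time-indent-line ()
  "Report how long CC Mode takes to indent the current line."
  (interactive)
  ;; `benchmark-run' returns a list whose first element is the
  ;; elapsed time in seconds.
  (let ((elapsed (car (benchmark-run 1 (c-indent-line)))))
    (message "c-indent-line took %.3f seconds" elapsed)))
```

Running it twice on the same line is instructive, since the second run may benefit from the information CC Mode cached during the first.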
Because CC Mode has to scan the buffer backwards from the current
insertion point, and because C’s syntax is fairly difficult to parse in
the backwards direction, CC Mode often tries to find the nearest
position higher up in the buffer from which to begin a forward scan
(it’s typically an opening or closing parenthesis of some kind). The
farther this position is from the current insertion point, the slower
the scan gets.
In earlier versions of CC Mode, we used to recommend putting the
opening brace of a top-level construct(1) into the leftmost column.
Earlier still, this used to be a rigid Emacs constraint, as embodied in
the ‘beginning-of-defun’ function. CC Mode now caches syntactic
information much better, so that the delay caused by searching for such
a brace when it’s not in column 0 is minimal, except perhaps when you’ve
just moved a long way inside the file.
A special note about ‘defun-prompt-regexp’ in Java mode: The common
style is to hang the opening braces of functions and classes on the
right side of the line, and that doesn’t work well with the Emacs
approach. CC Mode comes with a constant ‘c-Java-defun-prompt-regexp’
which tries to define a regular expression usable for this style, but
there are problems with it. In some cases it can cause
‘beginning-of-defun’ to hang(2). For this reason, it is not used by
default, but if you feel adventurous, you can set ‘defun-prompt-regexp’
to it in your mode hook. In any event, setting and relying on
‘defun-prompt-regexp’ will definitely slow things down because (X)Emacs
will be doing regular expression searches a lot, so you’ll probably be
taking a hit either way!
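If you do want to experiment with it, the setting could look roughly like this. It is a sketch only: the function name ‘my-java-use-defun-prompt-regexp’ is hypothetical, and the caveats above about hangs and slowdowns still apply.

```elisp
;; Hypothetical hook function, not part of CC Mode.
(defun my-java-use-defun-prompt-regexp ()
  "Use CC Mode's Java regexp for `defun-prompt-regexp' in this buffer."
  (set (make-local-variable 'defun-prompt-regexp)
       c-Java-defun-prompt-regexp))

(add-hook 'java-mode-hook 'my-java-use-defun-prompt-regexp)
```

Removing the function from the hook with ‘remove-hook’ prevents the setting from being applied to Java buffers opened afterwards.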
CC Mode maintains a cache of the opening parentheses of the blocks
surrounding the point, and it adapts that cache as the point is moved
around. That means that in bad cases it can take noticeable time to
indent a line in new surroundings, but after that it is fast as long
as the point isn’t moved far off. The farther the point is moved, the
less useful the cache becomes. Since editing is typically done in “chunks”
rather than on single lines far apart from each other, the cache
typically gives good performance even when the code doesn’t fit the
Emacs approach to finding the defun starts.
XEmacs users can set the variable
‘c-enable-xemacs-performance-kludge-p’ to non-‘nil’. This tells CC Mode
to use XEmacs-specific built-in functions which, in some circumstances,
can locate the top-most opening brace much more quickly than
‘beginning-of-defun’. Preliminary testing has shown that for styles
where these braces are hung (e.g., most JDK-derived Java styles), this
hack can improve performance of the core syntax parsing routines by a
factor of 3 to 60. However, for styles which _do_ conform to Emacs’s
recommended style of putting top-level braces in column zero, this hack
can degrade performance by about as much. Thus this variable is set to
‘nil’ by default, since the Emacs-friendly styles should be more common
(and encouraged!). Note that this variable has no effect in Emacs since
the necessary built-in functions don’t exist (in Emacs 22.1 as of this
writing in February 2007).
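For XEmacs users with hung braces, enabling the variable in an init file might look like the sketch below. The ‘featurep’ guard is an addition for illustration, so that the same init file can be shared with Emacs, where the variable has no effect.

```elisp
;; Only meaningful under XEmacs; harmless to skip elsewhere.
(when (featurep 'xemacs)
  (setq c-enable-xemacs-performance-kludge-p t))
```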
Text properties are used to speed up skipping over syntactic
whitespace, i.e., comments and preprocessor directives. Indenting a
line after a huge macro definition can be slow the first time, but after
that the text properties are in place and it should be fast (even after
you’ve edited other parts of the file and then moved back).
Font locking can be a CPU hog, especially the font locking done on
decoration level 3, which tries to be very accurate. Note that this
level is designed to be used with a font lock support mode that only
fontifies the text that’s actually shown, i.e., Lazy Lock or
Just-in-time Lock mode, so make sure you use one of them. Fontification
of a whole buffer with some thousand lines can often take over a minute.
That is a known weakness; the idea is that it never should happen.
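In Emacs, the support mode is selected through the variable ‘font-lock-support-mode’. A minimal sketch, assuming an Emacs that provides Just-in-time Lock mode (where this is usually already the default, so the setting may be redundant):

```elisp
;; Ask Font Lock to fontify lazily via Just-in-time Lock mode.
(setq font-lock-support-mode 'jit-lock-mode)
```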
The most effective way to speed up font locking is to reduce the
decoration level to 2 by setting ‘font-lock-maximum-decoration’
appropriately. That level is designed to be as pretty as possible
without sacrificing performance. See Font Locking Preliminaries
for more info.
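As an illustration, the decoration level can be lowered for the CC Mode languages only, leaving other modes untouched. The choice of modes below is an assumption made for the example; adjust it to the languages you use.

```elisp
;; Level 2 for C, C++ and Java buffers; maximum decoration elsewhere.
(setq font-lock-maximum-decoration
      '((c-mode . 2)
        (c++-mode . 2)
        (java-mode . 2)
        (t . t)))
```

The setting takes effect when Font Lock is next turned on in a buffer, so buffers that are already fontified need to be revisited or have Font Lock toggled.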
---------- Footnotes ----------
(1) E.g., a function in C, or outermost class definition in C++ or
Java.
(2) This has been observed in Emacs 19.34 and XEmacs 19.15.