mh-e: Scan Line Formats

 
 Appendix A Scan Line Formats
 ****************************
 
 This appendix discusses how MH-E creates, parses, and manipulates scan
 lines.  If you have your own MH scan or inc format files, you *can*
 teach MH-E how to handle them, but it isn’t easy, as you’ll see.
 
    This table lists the options in the ‘mh-scan-line-formats’
 customization group.
 
 ‘mh-adaptive-cmd-note-flag’
      On means that the message number width is determined dynamically
      (default: ‘on’).
 ‘mh-scan-format-file’
      Specifies the format file to pass to the scan program (default:
      ‘Use MH-E scan Format’).
 ‘mh-scan-prog’
      Program used to scan messages (default: ‘"scan"’).
 
    There are a couple of caveats when creating your own scan format
 file.  First, MH-E will not work if your scan lines do not include
 message numbers.  It will work poorly if you don’t dedicate a column for
 showing the current message and notations.  It is also best to keep the
 first column empty to make room for the cursor and so that text isn’t
 obscured by the current message’s overlay arrow when running in a
 terminal.  With your own format file, you won’t be able to use the option
 ‘mh-adaptive-cmd-note-flag’ or the threading features (see
 Threading).
 
    If you’ve created your own format to handle long message numbers,
 you’ll be pleased to know you no longer need it since MH-E adapts its
 internal format based upon the largest message number if
 ‘mh-adaptive-cmd-note-flag’ is on (the default).  If you prefer
 fixed-width message numbers, turn off ‘mh-adaptive-cmd-note-flag’ and
 call ‘mh-set-cmd-note’ with the width specified by your format file (see
 ‘mh-scan-format-file’).  For example, the default width is 4, so you
 would use ‘(mh-set-cmd-note 4)’.
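
    As a minimal sketch of the fixed-width setup just described, the
 following forms could be evaluated once MH-E is loaded.  It assumes that
 setting the option with ‘setq’ is acceptable in your setup; the option
 can equally be turned off in a customization buffer.

```elisp
;; Sketch only: use fixed-width message numbers instead of adaptive ones.
;; Assumes MH-E is already loaded so that `mh-set-cmd-note' is defined.
(setq mh-adaptive-cmd-note-flag nil)   ; turn off adaptive widths
(mh-set-cmd-note 4)                    ; width from the format file (default: 4)
```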
 
    The default setting for ‘mh-scan-format-file’ is ‘Use MH-E scan
 Format’.  This means that the format string will be taken from either
 ‘mh-scan-format-mh’ or ‘mh-scan-format-nmh’ depending on whether
 MH or nmh (or GNU mailutils MH) is in use.  This setting also enables
 you to turn on the option ‘mh-adaptive-cmd-note-flag’.  You can also set
 this option to ‘Use Default scan Format’ to get the same output as you
 would get if you ran ‘scan’ from the shell.  If you have a format file
 that you want MH-E to use but not MH, you can set this option to
 ‘Specify a scan Format File’ and enter the name of your format file.
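
    The three choices can also be sketched in Lisp.  The mapping of the
 menu labels to Lisp values shown here is an assumption rather than
 something stated in this appendix, and the file name is hypothetical;
 check ‘C-h v mh-scan-format-file <RET>’ before relying on it.

```elisp
;; Assumed Lisp values behind the customization menu labels.
(setq mh-scan-format-file t)        ; "Use MH-E scan Format" (assumed: t)
;; (setq mh-scan-format-file nil)   ; "Use Default scan Format" (assumed: nil)
;; (setq mh-scan-format-file "~/Mail/scan.mhe") ; hypothetical format file
```

    Customizing the option is the safer route, since a customization
 buffer only offers values the option accepts.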
 
    The scan format that MH-E uses when ‘mh-scan-format-file’ is set to
 its default of ‘Use MH-E scan Format’ is held in the variable
 ‘mh-scan-format-nmh’ or ‘mh-scan-format-mh’, depending on whether you
 are using nmh (or GNU mailutils MH) or not.  Typically, you create your
 own format files rather than modifying these variables.  The value of
 ‘mh-scan-format-nmh’ is:
 
      (concat
       "%4(msg)"
       "%<(cur)+%| %>"
       "%<{replied}-"
       "%?(nonnull(comp{to}))%<(mymbox{to})t%>"
       "%?(nonnull(comp{cc}))%<(mymbox{cc})c%>"
       "%?(nonnull(comp{bcc}))%<(mymbox{bcc})b%>"
       "%?(nonnull(comp{newsgroups}))n%>"
       "%<(zero) %>"
       "%02(mon{date})/%02(mday{date})%<{date} %|*%>"
       "%<(mymbox{from})%<{to}To:%14(decode(friendly{to}))%>%>"
       "%<(zero)%17(decode(friendly{from}))%>  "
       "%(decode{subject})%<{body}<<%{body}%>")
 
    The setting for ‘mh-scan-format-mh’ is similar, except that MH
 doesn’t have the function ‘decode’ (which is used to decode RFC 2047
 encodings).
 
    These strings are passed to the ‘scan’ program via the ‘-format’
 argument.  The formats are identical to the defaults except that
 additional hints for fontification have been added to the existing
 notations in the fifth column (remember that in Emacs, the columns start
 at 0).  The values of the fifth column, in priority order, are: ‘-’ if
 the message has been replied to, ‘t’ if an address in the ‘To:’ field
 matches one of the mailboxes of the current user, ‘c’ if the ‘Cc:’ field
 matches, ‘b’ if the ‘Bcc:’ field matches, and ‘n’ if a non-empty
 ‘Newsgroups:’ field is present.
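
    To see the hint column in action, the following sketch applies the
 default value of ‘mh-scan-sent-to-me-sender-regexp’ (described later in
 this appendix) to two made-up scan lines.  The ‘my-’ names and the
 sample lines are illustrative assumptions; the lines are only laid out
 like the default format and were not produced by ‘scan’.

```elisp
;; Default value of `mh-scan-sent-to-me-sender-regexp', copied from this
;; appendix.  Group 1 is the fontification hint, group 2 the sender field.
(defconst my-sent-to-me-regexp
  "^ *[0-9]+.\\([bct]\\).....[ ]*\\(..................\\)")

(defun my-scan-line-hint (line)
  "Return the b, c, or t hint in scan LINE, or nil if there is none."
  (when (string-match my-sent-to-me-regexp line)
    (match-string 1 line)))

;; Hypothetical scan lines: number, current-message marker, hint,
;; date, date flag, sender, subject.
(defconst my-sample-to-me
  (format "%4d%s%s%s %-17s  %s" 123 "+" "t" "03/14" "Alice Example" "Lunch"))
(defconst my-sample-replied
  (format "%4d%s%s%s %-17s  %s" 124 " " "-" "03/15" "Bob Example" "Re: Lunch"))

(my-scan-line-hint my-sample-to-me)    ; → "t"
(my-scan-line-hint my-sample-replied)  ; → nil
```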
 
    The name of the program that generates a listing of one line per
 message is held in ‘mh-scan-prog’ (default: ‘"scan"’).  Unless this
 variable contains an absolute pathname, it is assumed to be in the
 ‘mh-progs’ directory (see Getting Started).  You may link another
 program to ‘scan’ (see ‘mh-profile’(5)) to produce a different type of
 listing(1).
 
    If you change the format of the scan lines you’ll need to tell MH-E
 how to parse the new format.  As you will see, quite a few variables
 are involved.  Use ‘M-x apropos <RET> mh-scan.*regexp <RET>’
 to obtain a list of these variables.  You will also have to call
 ‘mh-set-cmd-note’ if your notations are not in column 4 (columns in
 Emacs start with 0).  Note that unlike most of the user options
 described in this manual, these are variables and must be set with
 ‘setq’ instead of in a customization buffer.  For help with regular
 expressions, see Syntax of Regular Expressions: (emacs)Regexps.
 
    The first variable has to do with pruning out garbage.
 
 ‘mh-scan-valid-regexp’
      This regular expression describes a valid scan line.  This is used
      to eliminate error messages that are occasionally produced by
      ‘inc’(2) or ‘scan’ (default: ‘"^ *[0-9]"’).
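
    A quick way to get a feel for this expression is to try it on a few
 strings.  The sketch below uses the default value; the ‘my-’ names, the
 sample scan line, and the error text are made up for illustration.

```elisp
;; Default value of `mh-scan-valid-regexp', copied from above.
(defconst my-valid-regexp "^ *[0-9]")

(defun my-valid-scan-line-p (line)
  "Return t if LINE looks like a scan line rather than an error message."
  (and (string-match-p my-valid-regexp line) t))

(my-valid-scan-line-p " 123+t03/14 Alice Example      Lunch") ; → t
(my-valid-scan-line-p "scan: no messages in inbox")             ; → nil
```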
 
    Next, many variables control how the scan lines are parsed.
 
 ‘mh-scan-body-regexp’
      This regular expression matches the message body fragment.  Note
      that the default setting of ‘mh-folder-font-lock-keywords’ expects
      this expression to contain at least one parenthesized expression
      which matches the body text as in the default of
      ‘"\\(<<\\([^\n]+\\)?\\)"’.  If this regular expression is not
      correct, the body fragment will not be highlighted with the face
      ‘mh-folder-body’.
 ‘mh-scan-cur-msg-number-regexp’
      This regular expression matches the current message.  It must match
      from the beginning of the line.  Note that the default setting of
      ‘mh-folder-font-lock-keywords’ expects this expression to contain
      at least one parenthesized expression which matches the message
      number as in the default of ‘"^\\( *[0-9]+\\+\\).*"’.  This
      expression includes the leading space and current message marker
      ‘+’ within the parentheses since it looks better to highlight these
      items as well.  The highlighting is done with the face
      ‘mh-folder-cur-msg-number’.  This regular expression should be
      correct as it is needed by non-fontification functions.  See also
      ‘mh-note-cur’.
 ‘mh-scan-date-regexp’
      This regular expression matches a valid date.  It must *not* be
      anchored to the beginning or the end of the line.  Note that the
      default setting of ‘mh-folder-font-lock-keywords’ expects this
      expression to contain only one parenthesized expression which
      matches the date field as in the default of
      ‘"\\([0-9][0-9]/[0-9][0-9]\\)"’.  If this regular expression is not
      correct, the date will not be highlighted with the face
      ‘mh-folder-date’.
 ‘mh-scan-deleted-msg-regexp’
      This regular expression matches deleted messages.  It must match
      from the beginning of the line.  Note that the default setting of
      ‘mh-folder-font-lock-keywords’ expects this expression to contain
      at least one parenthesized expression which matches the message
      number as in the default of ‘"^\\( *[0-9]+\\)D"’.  This expression
      includes the leading space within the parentheses since it looks
      better to highlight it as well.  The highlighting is done with the
      face ‘mh-folder-deleted’.  This regular expression should be
      correct as it is needed by non-fontification functions.  See also
      ‘mh-note-deleted’.
 ‘mh-scan-good-msg-regexp’
      This regular expression matches “good” messages.  It must match
      from the beginning of the line.  Note that the default setting of
      ‘mh-folder-font-lock-keywords’ expects this expression to contain
      at least one parenthesized expression which matches the message
      number as in the default of ‘"^\\( *[0-9]+\\)[^D^0-9]"’.  This
      expression includes the leading space within the parentheses since
      it looks better to highlight it as well.  The highlighting is done
      with the face ‘mh-folder-msg-number’.  This regular expression
      should be correct as it is needed by non-fontification functions.
 ‘mh-scan-msg-format-regexp’
      This regular expression finds the message number width in a scan
      format.  Note that the message number must be placed in a
      parenthesized expression as in the default of
      ‘"%\\([0-9]*\\)(msg)"’.  This variable is only consulted if
      ‘mh-scan-format-file’ is set to ‘Use MH-E scan Format’.
 ‘mh-scan-msg-format-string’
      This is a format string for the width of the message number in a
      scan format.  Use ‘0%d’ for zero-filled message numbers.  This
      variable is only consulted if ‘mh-scan-format-file’ is set to ‘Use
      MH-E scan Format’ (default: ‘"%d"’).
 ‘mh-scan-msg-number-regexp’
      This regular expression extracts the message number.  It must match
      from the beginning of the line.  Note that the message number must
      be placed in a parenthesized expression as in the default of
      ‘"^ *\\([0-9]+\\)"’.
 ‘mh-scan-msg-overflow-regexp’
      This regular expression matches overflowed message numbers
      (default: ‘"^[?0-9][0-9]"’).
 ‘mh-scan-msg-search-regexp’
      This regular expression matches a particular message.  It is a
      format string; use ‘%d’ to represent the location of the message
      number within the expression as in the default of
      ‘"^[^0-9]*%d[^0-9]"’.
 ‘mh-scan-rcpt-regexp’
      This regular expression specifies the recipient in messages you
      sent.  Note that the default setting of
      ‘mh-folder-font-lock-keywords’ expects this expression to contain
      two parenthesized expressions.  The first is expected to match the
      ‘To:’ that the default scan format file generates.  The second is
      expected to match the recipient’s name as in the default of
      ‘"\\(To:\\)\\(..............\\)"’.  If this regular expression is
      not correct, the ‘To:’ string will not be highlighted with the face
      ‘mh-folder-to’ and the recipient will not be highlighted with the
      face ‘mh-folder-address’.
 ‘mh-scan-refiled-msg-regexp’
      This regular expression matches refiled messages.  It must match
      from the beginning of the line.  Note that the default setting of
      ‘mh-folder-font-lock-keywords’ expects this expression to contain
      at least one parenthesized expression which matches the message
      number as in the default of ‘"^\\( *[0-9]+\\)\\^"’.  This
      expression includes the leading space within the parentheses since
      it looks better to highlight it as well.  The highlighting is done
      with the face ‘mh-folder-refiled’.  This regular expression should
      be correct as it is needed by non-fontification functions.  See
      also ‘mh-note-refiled’.
 ‘mh-scan-sent-to-me-sender-regexp’
      This regular expression matches messages sent to us.  Note that the
      default setting of ‘mh-folder-font-lock-keywords’ expects this
      expression to contain at least two parenthesized expressions.  The
      first should match the fontification hint (see
      ‘mh-scan-format-nmh’) and the second should match the user name as
      in the default of
      ‘"^ *[0-9]+.\\([bct]\\).....[ ]*\\(..................\\)"’.  If
      this regular expression is not correct, the notation hints will not
      be highlighted with the face ‘mh-folder-sent-to-me-hint’ and the
      sender will not be highlighted with the face
      ‘mh-folder-sent-to-me-sender’.
 ‘mh-scan-subject-regexp’
      This regular expression matches the subject.  It must match from
      the beginning of the line.  Note that the default setting of
      ‘mh-folder-font-lock-keywords’ expects this expression to contain
      at least three parenthesized expressions.  The first is expected to
      match the ‘Re:’ string, if any, and is highlighted with the face
      ‘mh-folder-followup’.  The second matches an optional bracketed
      number after ‘Re:’, such as in ‘Re[2]:’ (and is thus a
      sub-expression of the first expression).  The third is expected to
      match the subject line itself which is highlighted with the face
      ‘mh-folder-subject’.  For example, the default is
      ‘"^ *[0-9]+........[ ]*...................’
      ‘\\([Rr][Ee]\\(\\[[0-9]+\\]\\)?:\\s-*\\)*\\([^<\n]*\\)"’.  This
      regular expression should be correct as it is needed by
      non-fontification functions.  Note that this example is broken up
      on two lines for readability, but is actually a single string.
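
    Some of these defaults are easier to understand when applied by
 hand.  The sketch below copies three of them and shows how a message
 number can be extracted, how the search format is filled in, and how a
 width can be substituted into a scan format.  The ‘my-’ functions and
 the sample line are assumptions for illustration; this is not MH-E’s
 own implementation.

```elisp
;; Defaults copied from the table above; the my- names are illustrative.
(defconst my-msg-number-regexp "^ *\\([0-9]+\\)")
(defconst my-msg-search-format "^[^0-9]*%d[^0-9]")
(defconst my-msg-format-regexp "%\\([0-9]*\\)(msg)")

(defun my-scan-line-number (line)
  "Return the message number at the start of scan LINE, or nil."
  (when (string-match my-msg-number-regexp line)
    (string-to-number (match-string 1 line))))

(defun my-scan-line-is-msg-p (line number)
  "Return t if scan LINE belongs to message NUMBER."
  (and (string-match-p (format my-msg-search-format number) line) t))

(defun my-set-format-width (scan-format width width-format)
  "Return SCAN-FORMAT with the message number width replaced.
WIDTH is rendered with WIDTH-FORMAT, such as \"%d\" or \"0%d\"."
  (if (string-match my-msg-format-regexp scan-format)
      ;; Replace only group 1, the digits between % and (msg).
      (replace-match (format width-format width) t t scan-format 1)
    scan-format))

(my-scan-line-number " 123+t03/14 Alice Example      Lunch")       ; → 123
(my-scan-line-is-msg-p " 123+t03/14 Alice Example      Lunch" 123) ; → t
(my-scan-line-is-msg-p " 123+t03/14 Alice Example      Lunch" 23)  ; → nil
(my-set-format-width "%4(msg)%<(cur)+%| %>" 6 "%d")  ; → "%6(msg)%<(cur)+%| %>"
(my-set-format-width "%4(msg)%<(cur)+%| %>" 6 "0%d") ; → "%06(msg)%<(cur)+%| %>"
```

    Note how the search format refuses to match message 23 inside the
 number 123, which is why the expression brackets the number with
 non-digit characters.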
 
    Finally, there are a slew of variables that control how MH-E
 annotates the scan lines.
 
 ‘mh-cmd-note’
      Column for notations (default: 4).  This variable should be set
      with the function ‘mh-set-cmd-note’.  This variable may be updated
      dynamically if ‘mh-adaptive-cmd-note-flag’ is on.  The following
      variables contain the notational characters.  Note that columns in
      Emacs start with 0.
 ‘mh-note-copied’
      Messages that have been copied are marked by this character
      (default: ‘?C’).
 ‘mh-note-cur’
      The current message (in MH, not in MH-E) is marked by this
      character (default: ‘?+’).  See also
      ‘mh-scan-cur-msg-number-regexp’.
 ‘mh-note-deleted’
      Messages that have been deleted are marked by this character
      (default: ‘?D’).  See also ‘mh-scan-deleted-msg-regexp’.
 ‘mh-note-dist’
      Messages that have been redistributed are marked by this character
      (default: ‘?R’).
 ‘mh-note-forw’
      Messages that have been forwarded are marked by this character
      (default: ‘?F’).
 ‘mh-note-printed’
      Messages that have been printed are marked by this character
      (default: ‘?P’).
 ‘mh-note-refiled’
      Messages that have been refiled are marked by this character
      (default: ‘?^’).  See also ‘mh-scan-refiled-msg-regexp’.
 ‘mh-note-repl’
      Messages that have been replied to are marked by this character
      (default: ‘?-’).
 ‘mh-note-seq’
      Messages in a user-defined sequence are marked by this character
      (default: ‘?%’).  Messages in the ‘search’ sequence are marked by
      this character as well.
 
    For example, let’s say I have the following in ‘scan.format’ which
 displays the sender, the subject, and the message number.  This format
 places a ‘+’ after the message number for the current message according
 to MH; it also uses that column for notations.
 
      %20(decode(friendly{from})) %50(decode{subject})  %4(msg)%<(cur)+%| %>
 
    The first thing you have to do is tell MH-E to use this file.
 Customize ‘mh-scan-format-file’ and set its value to ‘Use Default scan
 Format’.  If you haven’t already turned off
 ‘mh-adaptive-cmd-note-flag’, you’ll need to do that first.
 
    Next, tell MH-E what a valid scan line looks like so that you can at
 least display the output of scan in your MH-Folder buffer.
 
      (setq mh-scan-valid-regexp "[0-9]+[+D^ ]$")
 
    Now, in order to get rid of the ‘Cursor not pointing to message’
 message, you need to tell MH-E how to access the message number.  You
 should also see why MH-E requires that you include a message number in
 the first place.
 
      (setq mh-scan-msg-number-regexp "^.* \\([0-9]+\\)[+D^ ]$")
      (setq mh-scan-msg-search-regexp " %d[+D^ ]$")
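
    Before putting these expressions in your init file, you can check
 them against a sample line.  The sketch below builds a made-up line
 shaped like the output of the ‘scan.format’ above and extracts the
 message number from its end.  The ‘my-custom-’ names and the sample
 values are assumptions for illustration.

```elisp
;; Regexps from the walk-through above (message number at end of line).
(defconst my-custom-valid-regexp "[0-9]+[+D^ ]$")
(defconst my-custom-number-regexp "^.* \\([0-9]+\\)[+D^ ]$")
(defconst my-custom-search-format " %d[+D^ ]$")

(defun my-custom-line (from subject number note)
  "Build a hypothetical scan line: FROM, SUBJECT, NUMBER, then NOTE."
  (format "%-20s %-50s  %4d%s" from subject number note))

(defun my-custom-line-number (line)
  "Return the message number near the end of LINE, or nil."
  (when (and (string-match-p my-custom-valid-regexp line)
             (string-match my-custom-number-regexp line))
    (string-to-number (match-string 1 line))))

(defun my-custom-line-is-msg-p (line number)
  "Return t if LINE ends with message NUMBER and a notation."
  (and (string-match-p (format my-custom-search-format number) line) t))

(defconst my-custom-sample
  (my-custom-line "Alice Example" "Lunch at 12" 123 "+"))

(my-custom-line-number my-custom-sample)                ; → 123
(my-custom-line-number "scan: no messages in inbox")    ; → nil
(my-custom-line-is-msg-p my-custom-sample 123)          ; → t
(my-custom-line-is-msg-p my-custom-sample 12)           ; → nil
```

    The digits in the subject are ignored because both expressions are
 anchored to the end of the line.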
 
    In order to get the next and previous commands working, add this.
 
      (setq mh-scan-good-msg-regexp "^.* \\([0-9]+\\)[+D^ ]$")
 
    Note that the current message isn’t marked with a ‘+’ when moving
 between the next and previous messages.  Here is the code required to
 get this working.
 
      (mh-set-cmd-note 76)
      (setq mh-scan-cur-msg-number-regexp "^.* \\([0-9]+\\)\\+$")
 
    Finally, add the following to delete and refile messages.
 
      (setq mh-scan-deleted-msg-regexp "^.* \\([0-9]+\\)D$")
      (setq mh-scan-refiled-msg-regexp "^.* \\([0-9]+\\)\\^$")
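
    The three notation expressions can be checked the same way.  This
 sketch classifies made-up lines by the character that follows the
 message number; the ‘my-custom-’ names and sample values are
 assumptions for illustration.

```elisp
;; Regexps from the walk-through above, one per notation character.
(defconst my-custom-cur-regexp "^.* \\([0-9]+\\)\\+$")
(defconst my-custom-deleted-regexp "^.* \\([0-9]+\\)D$")
(defconst my-custom-refiled-regexp "^.* \\([0-9]+\\)\\^$")

(defun my-custom-line-state (line)
  "Return a symbol describing the notation at the end of scan LINE."
  (cond ((string-match-p my-custom-cur-regexp line) 'current)
        ((string-match-p my-custom-deleted-regexp line) 'deleted)
        ((string-match-p my-custom-refiled-regexp line) 'refiled)
        (t 'other)))

(defun my-custom-sample-line (note)
  "Build a hypothetical scan line ending in the notation string NOTE."
  (format "%-20s %-50s  %4d%s" "Alice Example" "Lunch" 123 note))

(my-custom-line-state (my-custom-sample-line "+")) ; → current
(my-custom-line-state (my-custom-sample-line "D")) ; → deleted
(my-custom-line-state (my-custom-sample-line "^")) ; → refiled
(my-custom-line-state (my-custom-sample-line " ")) ; → other
```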
 
    This is just a bare minimum; it’s best to adjust all of the regular
 expressions to ensure that MH-E and highlighting perform well.
 
    ---------- Footnotes ----------
 
    (1) See the section Find and Specify with scan pick Ranges Sequences
 (http://rand-mh.sourceforge.net/book/mh/faswsprs.html) in the MH book.
 
    (2) See the section Reading Mail: inc show next prev
 (http://rand-mh.sourceforge.net/book/mh/reapre.html) in the MH book.