lilypond-essay: Music representation

 
 Music representation
 --------------------
 
 Ideally, the input format for any high-level formatting system is an
 abstract description of the content.  In this case, that would be the
 music itself.  This poses a formidable problem: how can we define what
 music really is?  Instead of trying to find an answer, we have reversed
 the question.  We write a program capable of producing sheet music, and
 adjust the format to be as lean as possible.  When the format can no
 longer be trimmed down, by definition we are left with the content itself.
 Our program serves as a formal definition of a music document.
 
    The syntax is also the user interface for LilyPond, so it is kept
 easy to type.  For example, entering

      {
        c'4 d'8
      }

 creates a quarter note on middle C (C1) and an eighth note on the D
 above middle C (D1).
 
      [image src="" alt="[image of music]" text="image of music"]
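
 As a rough illustration of how little information such a token has to
 carry, the following Python sketch decodes a simple note token into a
 pitch number and a duration.  It is not LilyPond code: the function name
 `decode_note`, the MIDI-style pitch numbering, and the assumption that an
 unmarked `c` lies one octave below middle C are choices made for this
 illustration, and accidentals, dots, and rests are not handled.

```python
import re
from fractions import Fraction

# Semitone offsets of the natural note names within one octave.
SEMITONES = {"c": 0, "d": 2, "e": 4, "f": 5, "g": 7, "a": 9, "b": 11}

# A note name, optional octave marks (' raises, , lowers), and a duration number.
NOTE_RE = re.compile(r"^([a-g])('*|,*)(\d+)$")


def decode_note(token):
    """Return (pitch, duration) for a simple token such as "c'4".

    pitch is a MIDI-style number (assumption: middle C = 60);
    duration is a fraction of a whole note.
    """
    match = NOTE_RE.match(token)
    if match is None:
        raise ValueError(f"not a simple note token: {token!r}")
    name, octave_marks, denominator = match.groups()
    # Assumption: an unmarked "c" is the C below middle C (48).
    shift = octave_marks.count("'") - octave_marks.count(",")
    pitch = 48 + SEMITONES[name] + 12 * shift
    return pitch, Fraction(1, int(denominator))
```

 Under these assumptions, `decode_note("c'4")` gives pitch 60 with a
 duration of 1/4, and `decode_note("d'8")` gives pitch 62 with a duration
 of 1/8.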
 
    On a microscopic scale, such syntax is easy to use.  On a larger
 scale, syntax also needs structure.  How else can you enter complex
 pieces like symphonies and operas?  The structure is formed by the
 concept of music expressions: by combining small fragments of music into
 larger ones, more complex music can be expressed.  For example, a single
 note is already an expression:

      f4

      [image src="" alt="[image of music]" text="image of music"]
 
 Simultaneous notes can be constructed by enclosing them with ‘<<’ and
 ‘>>’:
 
      <<c4 d4 e4>>
 
      [image src="" alt="[image of music]" text="image of music"]
 
 Expressions are put in sequence by enclosing them in curly braces
 ‘{ ... }’:
 
      { f4 <<c4 d4 e4>> }
 
      [image src="" alt="[image of music]" text="image of music"]
 
 The above is also an expression, and so it may be combined again with
 another simultaneous expression (a half note) using ‘<<’, ‘\\’, and
 ‘>>’:
 
      << g2 \\ { f4 <<c4 d4 e4>> } >>
 
      [image src="" alt="[image of music]" text="image of music"]
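
 One way to see why the half note fits against the two quarter-note
 events is to compute the length of each expression recursively.  The
 Python sketch below is only an illustration, not LilyPond's internal
 representation: the tuple layout and the function name `length` are
 hypothetical.  It assumes that the lengths of sequential parts add up,
 and that a simultaneous expression lasts as long as its longest part.

```python
from fractions import Fraction


def length(expr):
    """Return the length of a music expression as a fraction of a whole note."""
    kind = expr[0]
    if kind == "note":
        # ("note", name, n) lasts 1/n of a whole note, e.g. n = 4 is a quarter.
        return Fraction(1, expr[2])
    if kind == "sequential":
        # Parts follow one another, so their lengths add up.
        return sum((length(child) for child in expr[1]), Fraction(0))
    if kind == "simultaneous":
        # Parts sound together, so the longest part decides the length.
        return max((length(child) for child in expr[1]), default=Fraction(0))
    raise ValueError(f"unknown expression kind: {kind!r}")


# << g2 \\ { f4 <<c4 d4 e4>> } >> written as nested tuples.
chord = ("simultaneous", [("note", "c", 4), ("note", "d", 4), ("note", "e", 4)])
lower = ("sequential", [("note", "f", 4), chord])
whole = ("simultaneous", [("note", "g", 2), lower])
```

 Here `length(lower)` and `length(whole)` are both one half of a whole
 note: the quarter note and the quarter-note chord in sequence last
 exactly as long as the half note they are combined with.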
 
    Such recursive structures can be specified neatly and formally in a
 context-free grammar.  The parsing code is also generated from this
 grammar.  In other words, the syntax of LilyPond is clearly and
 unambiguously defined.
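
 To make the idea of a recursive grammar concrete, here is a small
 hand-written recursive-descent parser in Python for only the fragment of
 syntax shown above.  It is a sketch under stated assumptions, not
 LilyPond's parser, which is generated from the grammar rather than
 written by hand.  The names `tokenize`, `parse`, `parse_expr`, and
 `parse_children` are hypothetical, every token that is not a bracket is
 treated as a note, and the ‘\\’ separator is simply skipped rather than
 interpreted as starting a new voice.  The grammar it assumes is:

      expr         := note | sequential | simultaneous
      sequential   := "{" expr* "}"
      simultaneous := "<<" (expr | "\\")* ">>"

```python
import re

# Brackets and the voice separator first, then any run of other characters.
TOKEN_RE = re.compile(r"<<|>>|\{|\}|\\\\|[^\s<>{}\\]+")


def tokenize(text):
    """Split input text into bracket, separator, and note tokens."""
    return TOKEN_RE.findall(text)


def parse(text):
    """Parse one music expression and return it as a nested tuple."""
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


def parse_expr(tokens, pos):
    """Parse the expression starting at tokens[pos]; return (tree, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == "{":
        children, pos = parse_children(tokens, pos + 1, "}")
        return ("sequential", children), pos
    if token == "<<":
        children, pos = parse_children(tokens, pos + 1, ">>")
        return ("simultaneous", children), pos
    if token in ("}", ">>", "\\\\"):
        raise SyntaxError(f"unexpected token {token!r}")
    # Assumption: anything else is a note token such as "c4".
    return ("note", token), pos + 1


def parse_children(tokens, pos, closer):
    """Parse expressions until `closer`; return (children, next_pos)."""
    children = []
    while pos < len(tokens) and tokens[pos] != closer:
        if closer == ">>" and tokens[pos] == "\\\\":
            pos += 1  # skip the separator; voices are not modelled here
            continue
        child, pos = parse_expr(tokens, pos)  # recursion mirrors the grammar
        children.append(child)
    if pos >= len(tokens):
        raise SyntaxError(f"missing {closer!r}")
    return children, pos + 1
```

 Calling `parse(r"<< g2 \\ { f4 <<c4 d4 e4>> } >>")` yields a
 simultaneous node whose children are the note `g2` and a sequential node
 holding `f4` followed by the inner simultaneous node.  Each grammar rule
 corresponds to one branch of `parse_expr`, which is why nesting of any
 depth needs no extra code.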
 
    User interfaces and syntax are what people see and deal with most.
 They are partly a matter of taste, and also the subject of much
 discussion.  Although discussions on taste do have their merit, they are
 not very productive.  In the larger picture of LilyPond, the importance
 of input syntax is small: inventing neat syntax is easy, while writing
 decent formatting code is much harder.  This is also illustrated by the
 line counts for the respective components: parsing and representation
 take up less than 10% of the source code.
 
    When designing the structures used in LilyPond, we made some
 decisions that differ from those apparent in other software.  Consider
 the hierarchical nature of music notation:
 
      [image src="" alt="[image of music]" text="image of music"]
 
    In this case, there are pitches grouped into chords that belong to
 measures, which belong to staves.  This resembles a tidy structure of
 nested boxes:
 
      [image of nested boxes]

    Unfortunately, the structure is tidy because it is based on some
 excessively restrictive assumptions.  This becomes apparent if we
 consider a more complicated musical example:
 
      [image src="" alt="[image of music]" text="image of music"]
 
    In this example, staves start and stop at will, voices jump around
 between staves, and the staves have different time signatures.  Many
 software packages would struggle to reproduce this example because
 they are built on the nested box structure.  With LilyPond, on the other
 hand, we have tried to keep the input format and the structure as
 flexible as possible.