octave: Creating Sparse Matrices in Oct-Files

 
 A.1.6.2 Creating Sparse Matrices in Oct-Files
 .............................................
 
 There are two useful strategies for creating a sparse matrix.  The first
 is to create three vectors representing the row index, column index, and
 data values, and from these create the matrix.  The second alternative
 is to create a sparse matrix with the appropriate amount of space, and
 then fill in the values.  Both techniques have their advantages and
 disadvantages.
 
    Below is an example of creating a small sparse matrix using the first
 technique
 
      int nz, nr, nc;
      nz = 4, nr = 3, nc = 4;
 
      ColumnVector ridx (nz);
      ColumnVector cidx (nz);
      ColumnVector data (nz);
 
      ridx(0) = 1; cidx(0) = 1; data(0) = 1;
      ridx(1) = 2; cidx(1) = 2; data(1) = 2;
      ridx(2) = 2; cidx(2) = 4; data(2) = 3;
      ridx(3) = 3; cidx(3) = 4; data(3) = 4;
      SparseMatrix sm (data, ridx, cidx, nr, nc);
 
 which creates the matrix given in section SeeStorage of Sparse
 Matrices.  Note that the compressed matrix format is not used at the
 time of the creation of the matrix itself, but is used internally.
 
    As discussed in the chapter on Sparse Matrices, the values of the
 sparse matrix are stored in increasing column-major ordering.  Although
 the data passed by the user need not respect this requirement,
 pre-sorting the data will significantly speed up creation of the sparse
 matrix.
 
    The disadvantage of this technique for creating a sparse matrix is
 that there is a brief time when two copies of the data exist.  For
 extremely memory constrained problems this may not be the best technique
 for creating a sparse matrix.
 
    The alternative is to first create a sparse matrix with the desired
 number of nonzero elements and then later fill those elements in.
 Sample code:
 
      int nz, nr, nc;
      nz = 4, nr = 3, nc = 4;
      SparseMatrix sm (nr, nc, nz);
      sm(0,0) = 1; sm(0,1) = 2; sm(1,3) = 3; sm(2,3) = 4;
 
    This creates the same matrix as previously.  Again, although not
 strictly necessary, it is significantly faster if the sparse matrix is
 created and the elements are added in column-major ordering.  The reason
 for this is that when elements are inserted at the end of the current
 list of known elements then no element in the matrix needs to be moved
 to allow the new element to be inserted; Only the column indices need to
 be updated.
 
    There are a few further points to note about this method of creating
 a sparse matrix.  First, it is possible to create a sparse matrix with
 fewer elements than are actually inserted in the matrix.  Therefore,
 
      int nr, nc;
      nr = 3, nc = 4;
      SparseMatrix sm (nr, nc, 0);
      sm(0,0) = 1; sm(0,1) = 2; sm(1,3) = 3; sm(2,3) = 4;
 
 is perfectly valid.  However, it is a very bad idea because as each new
 element is added to the sparse matrix the matrix needs to request more
 space and reallocate memory.  This is an expensive operation that will
 significantly slow this means of creating a sparse matrix.  It is
 possible to create a sparse matrix with excess storage, so having NZ
 greater than 4 in this example is also valid.  The disadvantage is that
 the matrix occupies more memory than strictly needed.
 
    Of course, it is not always possible to know the number of nonzero
 elements prior to filling a matrix.  For this reason the additional
 unused storage of a sparse matrix can be removed after its creation with
 the ‘maybe_compress’ function.  In addition to deallocating unused
 storage, ‘maybe_compress’ can also remove zero elements from the matrix.
 The removal of zero elements from the matrix is controlled by setting
 the argument of the ‘maybe_compress’ function to be ‘true’.  However,
 the cost of removing the zeros is high because it implies re-sorting the
 elements.  If possible, it is better for the user to avoid adding the
 unnecessary zeros in the first place.  An example of the use of
 ‘maybe_compress’ is
 
      int nz, nr, nc;
      nz = 6, nr = 3, nc = 4;
 
      SparseMatrix sm1 (nr, nc, nz);
      sm1(0,0) = 1; sm1(0,1) = 2; sm1(1,3) = 3; sm1(2,3) = 4;
      sm1.maybe_compress ();   // No zero elements were added
 
      SparseMatrix sm2 (nr, nc, nz);
      sm2(0,0) = 1; sm2(0,1) = 2; sm(0,2) = 0; sm(1,2) = 0;
      sm1(1,3) = 3; sm1(2,3) = 4;
      sm2.maybe_compress (true);  // Zero elements were added
 
    The use of the ‘maybe_compress’ function should be avoided if
 possible as it will slow the creation of the matrix.
 
    A third means of creating a sparse matrix is to work directly with
 the data in compressed row format.  An example of this advanced
 technique might be
 
      octave_value arg;
      ...
      int nz, nr, nc;
      nz = 6, nr = 3, nc = 4;   // Assume we know the max # nz
      SparseMatrix sm (nr, nc, nz);
      Matrix m = arg.matrix_value ();
 
      int ii = 0;
      sm.cidx (0) = 0;
      for (int j = 1; j < nc; j++)
        {
          for (int i = 0; i < nr; i++)
            {
              double tmp = m(i,j);
              if (tmp != 0.)
                {
                  sm.data(ii) = tmp;
                  sm.ridx(ii) = i;
                  ii++;
                }
            }
          sm.cidx(j+1) = ii;
       }
      sm.maybe_compress ();  // If don't know a priori the final # of nz.
 
 which is probably the most efficient means of creating a sparse matrix.
 
    Finally, it may sometimes arise that the amount of storage initially
 created is insufficient to completely store the sparse matrix.
 Therefore, the method ‘change_capacity’ exists to reallocate the sparse
 memory.  The above example would then be modified as
 
      octave_value arg;
      ...
      int nz, nr, nc;
      nz = 6, nr = 3, nc = 4;   // Guess the number of nz elements
      SparseMatrix sm (nr, nc, nz);
      Matrix m = arg.matrix_value ();
 
      int ii = 0;
      sm.cidx (0) = 0;
      for (int j = 1; j < nc; j++)
        {
          for (int i = 0; i < nr; i++)
            {
              double tmp = m(i,j);
              if (tmp != 0.)
                {
                  if (ii == nz)
                    {
                      nz += 2;   // Add 2 more elements
                      sm.change_capacity (nz);
                    }
                  sm.data(ii) = tmp;
                  sm.ridx(ii) = i;
                  ii++;
                }
            }
          sm.cidx(j+1) = ii;
       }
      sm.maybe_compress ();  // If don't know a priori the final # of nz.
 
    Note that both increasing and decreasing the number of nonzero
 elements in a sparse matrix is expensive as it involves memory
 reallocation.  Also because parts of the matrix, though not its
 entirety, exist as old and new copies at the same time, additional
 memory is needed.  Therefore, if possible avoid changing capacity.