 6.8 FFTW MPI Wisdom
 ===================
 
 FFTW's "wisdom" facility (SeeWords of Wisdom-Saving Plans) can be
 used to save MPI plans as well as to save uniprocessor plans.  However,
 for MPI there are several unavoidable complications.
 
    First, the MPI standard does not guarantee that every process can
 perform file I/O (at least, not using C stdio routines)--in general, we
 may only assume that process 0 is capable of I/O.(1) So, if we want to
 export the wisdom from a single process to a file, we must first export
 the wisdom to a string, then send it to process 0, then write it to a
 file.
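
    As an illustration of this string-based route, the following is a
 minimal sketch (not part of the FFTW API): it assumes the wisdom of
 interest happens to reside on a hypothetical rank 'src', and that the
 usual '<fftw3-mpi.h>', '<stdlib.h>', and '<string.h>' headers are
 included.

       {
           int rank, src = 1;   /* hypothetical rank holding the wisdom */
           MPI_Comm_rank(MPI_COMM_WORLD, &rank);
           if (rank == src) {
               /* serialize this process's wisdom and ship it to rank 0 */
               char *w = fftw_export_wisdom_to_string();
               int len = strlen(w) + 1;
               MPI_Send(&len, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
               MPI_Send(w, len, MPI_CHAR, 0, 1, MPI_COMM_WORLD);
               free(w);
           }
           else if (rank == 0) {
               int len;
               char *w;
               MPI_Recv(&len, 1, MPI_INT, src, 0, MPI_COMM_WORLD,
                        MPI_STATUS_IGNORE);
               w = malloc(len);
               MPI_Recv(w, len, MPI_CHAR, src, 1, MPI_COMM_WORLD,
                        MPI_STATUS_IGNORE);
               fftw_import_wisdom_from_string(w);  /* merge into rank 0 */
               fftw_export_wisdom_to_filename("mywisdom");
               free(w);
           }
       }

    (The 'fftw_mpi_gather_wisdom' function described below performs an
 analogous collection from all processes, so ordinarily there is no need
 to write this by hand.)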
 
    Second, in principle we may want to have separate wisdom for every
 process, since in general the processes may run on different hardware
 even for a single MPI program.  However, in practice FFTW's MPI code is
 designed for the case of homogeneous hardware (see Load balancing),
 and in this case it is convenient to use the same wisdom for every
 process.  Thus, we need a mechanism to synchronize the wisdom.
 
    To address both of these problems, FFTW provides the following two
 functions:
 
      void fftw_mpi_broadcast_wisdom(MPI_Comm comm);
      void fftw_mpi_gather_wisdom(MPI_Comm comm);
 
    Given a communicator 'comm', 'fftw_mpi_broadcast_wisdom' will
 broadcast the wisdom from process 0 to all other processes.  Conversely,
 'fftw_mpi_gather_wisdom' will collect wisdom from all processes onto
 process 0.  (If the plans created for the same problem by different
 processes are not the same, 'fftw_mpi_gather_wisdom' will arbitrarily
 choose one of the plans.)  Both of these functions may result in
 suboptimal plans for different processes if the processes are running on
 non-identical hardware.  Both of these functions are _collective_ calls,
 which means that they must be executed by all processes in the
 communicator.
 
    So, for example, a typical code snippet to import wisdom from a file
 and use it on all processes would be:
 
      {
          int rank;
 
          fftw_mpi_init();
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          if (rank == 0) fftw_import_wisdom_from_filename("mywisdom");
          fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);
      }
 
    (Note that we must call 'fftw_mpi_init' before importing any wisdom
 that might contain MPI plans.)  Similarly, a typical code snippet to
 export wisdom from all processes to a file is:
 
      {
          int rank;
 
          fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          if (rank == 0) fftw_export_wisdom_to_filename("mywisdom");
      }
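
    Putting the two snippets together, a complete program might be
 organized along the following lines.  This is only a sketch: the
 transform size ('N0' by 'N1') and the choice of an in-place 2d complex
 DFT are illustrative assumptions, not part of the wisdom interface.

       #include <fftw3-mpi.h>

       int main(int argc, char **argv)
       {
           const ptrdiff_t N0 = 256, N1 = 256;   /* hypothetical sizes */
           ptrdiff_t alloc_local, local_n0, local_0_start;
           fftw_complex *data;
           fftw_plan plan;
           int rank;

           MPI_Init(&argc, &argv);
           fftw_mpi_init();
           MPI_Comm_rank(MPI_COMM_WORLD, &rank);

           /* import any saved wisdom and share it with all processes */
           if (rank == 0) fftw_import_wisdom_from_filename("mywisdom");
           fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);

           /* planning done after the broadcast can reuse the wisdom */
           alloc_local = fftw_mpi_local_size_2d(N0, N1, MPI_COMM_WORLD,
                                                &local_n0, &local_0_start);
           data = fftw_alloc_complex(alloc_local);
           plan = fftw_mpi_plan_dft_2d(N0, N1, data, data, MPI_COMM_WORLD,
                                       FFTW_FORWARD, FFTW_MEASURE);

           /* ... initialize data and execute the plan as usual ... */

           /* collect any newly created wisdom and save it for next time */
           fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
           if (rank == 0) fftw_export_wisdom_to_filename("mywisdom");

           fftw_destroy_plan(plan);
           fftw_free(data);
           MPI_Finalize();
           return 0;
       }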
 
    ---------- Footnotes ----------
 
    (1) In fact, even this assumption is not technically guaranteed by
 the standard, although it seems to be universal in actual MPI
 implementations and is widely assumed by MPI-using software.
 Technically, you need to query the 'MPI_IO' attribute of
 'MPI_COMM_WORLD' with 'MPI_Attr_get'.  If this attribute is
 'MPI_PROC_NULL', no I/O is possible.  If it is 'MPI_ANY_SOURCE', any
 process can perform I/O. Otherwise, it is the rank of a process that can
 perform I/O ...  but since it is not guaranteed to yield the _same_ rank
 on all processes, you have to do an 'MPI_Allreduce' of some kind if you
 want all processes to agree about which is going to do I/O. And even
 then, the standard only guarantees that this process can perform output,
 but not input.  See e.g.  'Parallel Programming with MPI' by P. S.
 Pacheco, section 8.1.3.  Needless to say, in our experience virtually no
 MPI programmers worry about this.
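
    For the truly cautious, a minimal sketch of such a query (using
 'MPI_Attr_get' as above, assuming '<limits.h>' for 'INT_MAX', and
 arbitrarily breaking ties with a minimum reduction) might look like:

       {
           int *attr, flag, my_io, io_rank;
           MPI_Attr_get(MPI_COMM_WORLD, MPI_IO, &attr, &flag);
           if (!flag || *attr == MPI_PROC_NULL)
               my_io = INT_MAX;   /* this process sees no I/O-capable rank */
           else if (*attr == MPI_ANY_SOURCE)
               my_io = 0;         /* anyone will do; nominate rank 0 */
           else
               my_io = *attr;     /* a specific rank that can perform I/O */
           /* agree on one rank; INT_MAX means no process can perform I/O */
           MPI_Allreduce(&my_io, &io_rank, 1, MPI_INT, MPI_MIN,
                         MPI_COMM_WORLD);
       }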