6.8 FFTW MPI Wisdom
===================
FFTW's "wisdom" facility (see Words of Wisdom-Saving Plans) can be
used to save MPI plans as well as uniprocessor plans. However, for MPI
there are several unavoidable complications.
First, the MPI standard does not guarantee that every process can
perform file I/O (at least, not using C stdio routines)--in general, we
may only assume that process 0 is capable of I/O.(1) So, if we want to
export the wisdom from a single process to a file, we must first export
the wisdom to a string, then send it to process 0, then write it to a
file.
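The string round-trip described above can be sketched directly, using
only documented FFTW calls ('fftw_export_wisdom_to_string',
'fftw_import_wisdom_from_string') plus point-to-point MPI. This is
roughly what FFTW's own gather routine automates for you; the function
name 'ship_wisdom_to_rank0' and the choice of sending from rank 1 are
illustrative assumptions, not part of FFTW's API:

```c
/* Sketch: manually shipping wisdom from rank 1 to rank 0, which then
   writes it to a file. FFTW's fftw_mpi_gather_wisdom automates this
   for all ranks; shown here only to illustrate the string round-trip. */
#include <stdlib.h>
#include <string.h>
#include <mpi.h>
#include <fftw3.h>

void ship_wisdom_to_rank0(MPI_Comm comm)
{
    int rank;
    MPI_Comm_rank(comm, &rank);
    if (rank == 1) {
        /* export local wisdom to a malloc'ed string; caller must free it */
        char *w = fftw_export_wisdom_to_string();
        int len = (int) strlen(w) + 1;
        MPI_Send(&len, 1, MPI_INT, 0, 0, comm);
        MPI_Send(w, len, MPI_CHAR, 0, 1, comm);
        free(w);
    }
    else if (rank == 0) {
        int len;
        MPI_Recv(&len, 1, MPI_INT, 1, 0, comm, MPI_STATUS_IGNORE);
        char *w = malloc(len);
        MPI_Recv(w, len, MPI_CHAR, 1, 1, comm, MPI_STATUS_IGNORE);
        fftw_import_wisdom_from_string(w); /* merge into rank 0's wisdom */
        free(w);
        /* only rank 0 touches the filesystem */
        fftw_export_wisdom_to_filename("mywisdom");
    }
}
```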
Second, in principle we may want to have separate wisdom for every
process, since in general the processes may run on different hardware
even for a single MPI program. However, in practice FFTW's MPI code is
designed for the case of homogeneous hardware (see Load balancing),
and in this case it is convenient to use the same wisdom for every
process. Thus, we need a mechanism to synchronize the wisdom.
To address both of these problems, FFTW provides the following two
functions:
     void fftw_mpi_broadcast_wisdom(MPI_Comm comm);
     void fftw_mpi_gather_wisdom(MPI_Comm comm);
Given a communicator 'comm', 'fftw_mpi_broadcast_wisdom' will
broadcast the wisdom from process 0 to all other processes. Conversely,
'fftw_mpi_gather_wisdom' will collect wisdom from all processes onto
process 0. (If the plans created for the same problem by different
processes are not the same, 'fftw_mpi_gather_wisdom' will arbitrarily
choose one of the plans.) Both of these functions may result in
suboptimal plans for different processes if the processes are running on
non-identical hardware. Both of these functions are _collective_ calls,
which means that they must be executed by all processes in the
communicator.
So, for example, a typical code snippet to import wisdom from a file
and use it on all processes would be:
     {
         int rank;
         fftw_mpi_init();
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
         if (rank == 0) fftw_import_wisdom_from_filename("mywisdom");
         fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);
     }
(Note that we must call 'fftw_mpi_init' before importing any wisdom
that might contain MPI plans.) Similarly, a typical code snippet to
export wisdom from all processes to a file is:
     {
         int rank;
         fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
         if (rank == 0) fftw_export_wisdom_to_filename("mywisdom");
     }
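Putting the two snippets together, the full wisdom lifecycle of an MPI
program might look like the following sketch. The error messages and
overall structure are assumptions for illustration; the FFTW calls
themselves (including the nonzero-on-success return values of the
import/export functions) are as documented:

```c
/* Sketch: import wisdom at startup, broadcast it, and gather/export
   it before exiting, so plans improve across runs. */
#include <stdio.h>
#include <mpi.h>
#include <fftw3.h>
#include <fftw3-mpi.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    fftw_mpi_init();   /* must precede importing wisdom with MPI plans */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* import: only rank 0 reads the file, then broadcast to all */
    if (rank == 0 && !fftw_import_wisdom_from_filename("mywisdom"))
        fprintf(stderr, "no wisdom file; planning from scratch\n");
    fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);

    /* ... create plans, execute transforms, destroy plans ... */

    /* export: collect wisdom onto rank 0, then only rank 0 writes */
    fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
    if (rank == 0 && !fftw_export_wisdom_to_filename("mywisdom"))
        fprintf(stderr, "could not write wisdom file\n");

    fftw_mpi_cleanup();
    MPI_Finalize();
    return 0;
}
```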
---------- Footnotes ----------
(1) In fact, even this assumption is not technically guaranteed by
the standard, although it seems to be universal in actual MPI
implementations and is widely assumed by MPI-using software.
Technically, you need to query the 'MPI_IO' attribute of
'MPI_COMM_WORLD' with 'MPI_Attr_get'. If this attribute is
'MPI_PROC_NULL', no I/O is possible. If it is 'MPI_ANY_SOURCE', any
process can perform I/O. Otherwise, it is the rank of a process that can
perform I/O ... but since it is not guaranteed to yield the _same_ rank
on all processes, you have to do an 'MPI_Allreduce' of some kind if you
want all processes to agree about which is going to do I/O. And even
then, the standard only guarantees that this process can perform output,
but not input. See e.g. 'Parallel Programming with MPI' by P. S.
Pacheco, section 8.1.3. Needless to say, in our experience virtually no
MPI programmers worry about this.
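For the curious, the pedantically correct check described in this
footnote can be sketched as follows, using 'MPI_Comm_get_attr' (the
MPI-2 replacement for the deprecated 'MPI_Attr_get'). The helper name
'agree_on_io_rank' and the min-reduction tie-breaking rule are
illustrative choices, not anything prescribed by the MPI standard:

```c
/* Sketch: query the MPI_IO attribute and have all processes agree on
   a single rank that can perform I/O, as the footnote describes. */
#include <mpi.h>

/* Returns the agreed-upon I/O-capable rank, or -1 if none exists. */
int agree_on_io_rank(MPI_Comm comm)
{
    int rank, size, flag, candidate, agreed;
    int *io_rank;   /* attribute value is a pointer to int */
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_IO, &io_rank, &flag);

    if (!flag || *io_rank == MPI_PROC_NULL)
        candidate = size;        /* sentinel: this process cannot do I/O */
    else if (*io_rank == MPI_ANY_SOURCE)
        candidate = rank;        /* any process can; nominate myself */
    else
        candidate = *io_rank;    /* a specific rank can do I/O */

    /* ranks need not agree on the answer, so pick the smallest nominee */
    MPI_Allreduce(&candidate, &agreed, 1, MPI_INT, MPI_MIN, comm);
    return agreed < size ? agreed : -1;
}
```

Even after this, as noted above, the standard only guarantees output
capability for the resulting rank, not input.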