fftw3: Other Multi-dimensional Real-data MPI Transforms

 
 6.6 Other multi-dimensional Real-Data MPI Transforms
 ====================================================
 
 FFTW's MPI interface also supports multi-dimensional 'r2r' transforms of
 all kinds supported by the serial interface (e.g.  discrete cosine and
 sine transforms, discrete Hartley transforms, etc.).  Only
 multi-dimensional 'r2r' transforms, not one-dimensional transforms, are
 currently parallelized.
 
    These are used much like the multidimensional complex DFTs discussed
 above, except that the data is real rather than complex, and one needs
 to pass an r2r transform kind ('fftw_r2r_kind') for each dimension as in
 the serial FFTW (SeeMore DFTs of Real Data).
 
    For example, one might perform a two-dimensional L x M that is an
 REDFT10 (DCT-II) in the first dimension and an RODFT10 (DST-II) in the
 second dimension with code like:
 
          const ptrdiff_t L = ..., M = ...;
          fftw_plan plan;
          double *data;
          ptrdiff_t alloc_local, local_n0, local_0_start, i, j;
 
          /* get local data size and allocate */
          alloc_local = fftw_mpi_local_size_2d(L, M, MPI_COMM_WORLD,
                                               &local_n0, &local_0_start);
          data = fftw_alloc_real(alloc_local);
 
          /* create plan for in-place REDFT10 x RODFT10 */
          plan = fftw_mpi_plan_r2r_2d(L, M, data, data, MPI_COMM_WORLD,
                                      FFTW_REDFT10, FFTW_RODFT10, FFTW_MEASURE);
 
          /* initialize data to some function my_function(x,y) */
          for (i = 0; i < local_n0; ++i) for (j = 0; j < M; ++j)
             data[i*M + j] = my_function(local_0_start + i, j);
 
          /* compute transforms, in-place, as many times as desired */
          fftw_execute(plan);
 
          fftw_destroy_plan(plan);
 
    Notice that we use the same 'local_size' functions as we did for
 complex data, only now we interpret the sizes in terms of real rather
 than complex values, and correspondingly use 'fftw_alloc_real'.