fftw3: Multi-threaded FFTW

 
 5 Multi-threaded FFTW
 *********************
 
 In this chapter we document the parallel FFTW routines for shared-memory
 parallel hardware.  These routines, which support parallel one- and
 multi-dimensional transforms of both real and complex data, are the
 easiest way to take advantage of multiple processors with FFTW. They
 work just like the corresponding uniprocessor transform routines, except
 that you have an extra initialization routine to call, and there is a
 routine to set the number of threads to employ.  Any program that uses
 the uniprocessor FFTW can therefore be trivially modified to use the
 multi-threaded FFTW.
 
    A shared-memory machine is one in which all CPUs can directly access
 the same main memory, and such machines are now common due to the
 ubiquity of multi-core CPUs.  FFTW's multi-threading support allows you
 to utilize these additional CPUs transparently from a single program.
 However, this does not necessarily translate into performance gains:
 when multiple threads/CPUs are employed, the overhead required for
 synchronization may outweigh the benefit of computational parallelism.
 Therefore, you can only benefit from threads if your problem is
 sufficiently large.
 

Menu