fftw3: Introduction

 
 1 Introduction
 **************
 
 This manual documents version 3.3.6-pl2 of FFTW, the _Fastest Fourier
 Transform in the West_.  FFTW is a comprehensive collection of fast C
 routines for computing the discrete Fourier transform (DFT) and various
 special cases thereof.
    * FFTW computes the DFT of complex data, real data, even- or
      odd-symmetric real data (these symmetric transforms are usually
      known as the discrete cosine or sine transform, respectively), and
      the discrete Hartley transform (DHT) of real data.
 
    * The input data can have arbitrary length.  FFTW employs O(n log n)
      algorithms for all lengths, including prime numbers.
 
    * FFTW supports arbitrary multi-dimensional data.
 
    * FFTW supports the SSE, SSE2, AVX, AVX2, AVX512, KCVI, Altivec, VSX,
      and NEON vector instruction sets.
 
    * FFTW includes parallel (multi-threaded) transforms for
      shared-memory systems.
    * Starting with version 3.3, FFTW includes distributed-memory
      parallel transforms using MPI.
 
    We assume herein that you are familiar with the properties and uses
 of the DFT that are relevant to your application.  Otherwise, see e.g.
 'The Fast Fourier Transform and Its Applications' by E. O. Brigham
 (Prentice-Hall, Englewood Cliffs, NJ, 1988).  Our web page
 (http://www.fftw.org) also has links to FFT-related information online.
 
    In order to use FFTW effectively, you need to learn one basic concept
 of FFTW's internal structure: FFTW does not use a fixed algorithm for
 computing the transform, but instead it adapts the DFT algorithm to
 details of the underlying hardware in order to maximize performance.
 Hence, the computation of the transform is split into two phases.
 First, FFTW's "planner" "learns" the fastest way to compute the
 transform on your machine.  The planner produces a data structure called
 a "plan" that contains this information.  Subsequently, the plan is
 "executed" to transform the array of input data as dictated by the plan.
 The plan can be reused as many times as needed.  In typical
 high-performance applications, many transforms of the same size are
 computed and, consequently, a relatively expensive initialization of
 this sort is acceptable.  On the other hand, if you need a single
 transform of a given size, the one-time cost of the planner becomes
 significant.  For this case, FFTW provides fast planners based on
 heuristics or on previously computed plans.
 
    FFTW supports transforms of data with arbitrary length, rank,
 multiplicity, and a general memory layout.  In simple cases, however,
 this generality may be unnecessary and confusing.  Consequently, we
 organized the interface to FFTW into three levels of increasing
 generality.
    * The "basic interface" computes a single transform of contiguous
      data.
    * The "advanced interface" computes transforms of multiple or strided
      arrays.
    * The "guru interface" supports the most general data layouts,
      multiplicities, and strides.
    We expect that most users will be best served by the basic interface,
 whereas the guru interface requires careful attention to the
 documentation to avoid problems.
 
    Besides the automatic performance adaptation performed by the
 planner, it is also possible for advanced users to customize FFTW
 manually.  For example, if code space is a concern, we provide a tool
 that links only the subset of FFTW needed by your application.
 Conversely, you may need to extend FFTW because the standard
 distribution is not sufficient for your needs.  For example, the
 standard FFTW distribution works most efficiently for arrays whose size
 can be factored into small primes (2, 3, 5, and 7), and otherwise it
 uses a slower general-purpose routine.  If you need efficient transforms
 of other sizes, you can use FFTW's code generator, which produces fast C
 programs ("codelets") for any particular array size you may care about.
 For example, if you need transforms of size 513 = 19 x 3^3, you can
 customize FFTW to support the factor 19 efficiently.
 
    For more information regarding FFTW, see the paper, "The Design and
 Implementation of FFTW3," by M. Frigo and S. G. Johnson, which was an
 invited paper in 'Proc. IEEE' 93 (2), p.  216 (2005).  The code
 generator is described in the paper "A fast Fourier transform compiler",
 by M. Frigo, in the 'Proceedings of the 1999 ACM SIGPLAN Conference on
 Programming Language Design and Implementation (PLDI), Atlanta, Georgia,
 May 1999'.  These papers, along with the latest version of FFTW, the
 FAQ, benchmarks, and other links, are available at the FFTW home page
 (http://www.fftw.org).
 
    The current version of FFTW incorporates many good ideas from the
 past thirty years of FFT literature.  In one way or another, FFTW uses
 the Cooley-Tukey algorithm, the prime factor algorithm, Rader's
 algorithm for prime sizes, and a split-radix algorithm (with a
 "conjugate-pair" variation pointed out to us by Dan Bernstein).  FFTW's
 code generator also produces new algorithms that we do not completely
 understand.  The reader is referred to the cited papers for the
 appropriate references.
 
    The rest of this manual is organized as follows.  We first discuss
 the sequential (single-processor) implementation.  We start by
 describing the basic interface/features of FFTW in SeeTutorial.
DONTPRINTYET  Next, SeeOther Important Topics discusses data alignment (*noteDONTPRINTYET  Next, SeeOther Important Topics discusses data alignment (See
 SIMD alignment and fftw_malloc), the storage scheme of
 multi-dimensional arrays (SeeMulti-dimensional Array Format), and
 FFTW's mechanism for storing plans on disk (SeeWords of Wisdom-Saving
 Plans).  Next, SeeFFTW Reference provides comprehensive
 documentation of all FFTW's features.  Parallel transforms are discussed
DONTPRINTYET  in their own chapters: SeeMulti-threaded FFTW and *noteDONTPRINTYET  in their own chapters: SeeMulti-threaded FFTW and See
 Distributed-memory FFTW with MPI.  Fortran programmers can also use
DONTPRINTYET  FFTW, as described in SeeCalling FFTW from Legacy Fortran and *noteDONTPRINTYET DONTPRINTYET  FFTW, as described in SeeCalling FFTW from Legacy Fortran and See
 Calling FFTW from Modern Fortran.  *noteInstallation and
 Customization:: explains how to install FFTW in your computer system and
 how to adapt FFTW to your needs.  License and copyright information is
 given in SeeLicense and Copyright.  Finally, we thank all the
 people who helped us in SeeAcknowledgments.