8.1 Introduction and Motivation

The 3D FFT library provides an interface to do paralle FFT computation in a scalable fashion.

The parallelization is achieved by splitting the 3D transform into multiple phases. There are two possibilities for doing the splitting: One is dividing the data space (over which the fft is to be performed) into a set of slabs (figure 1). Each slab is essentially a collection of planes). First, 2D FFTs are done over the planes in the slab. Then a distributed 'transform' will send the data to destination so that fft in the third direction is performed. This approach takes two computation phases and one 'transform' phase. The second way for splitting is dividing the data into collections of pencils. First, 1D FFTs are computed over the pencils; then a 'transform' is performed and 1D FFTs are done over second dimention; again a 'transform' is performed and FFTs are computed over the last dimension. So this approach takes three computation phases and two 'transform' phases. In first approach, the parallelism is limited by the number of planes.While in second approach, it's limited by the number of pencils. So the second approach provides finer grained parallelism and it's possible to perform better when the number of processing units is larger than the number of planes.

June 29, 2008
Charm Homepage