Step 4

Step 4 is contributed by Pierre Kestener. This step shows how to use AccFFT to solve a Poisson problem in parallel for both CPU and GPU. Pierre has also contributed an interface for writing/reading data using PNETCDF library.

Introduction

From molecular dynamics and quantum chemistry, to plasma physics and computational astrophysics, Poisson solvers are used in many applications in computational science and engineering. For a problem with periodic boundaries, Fourier methods are favorable due to their spectral accuracy and low computational cost to get such accuracy. Many of the interesting numerical problems involve billions of unknowns which certainly does not fit into shared memory. Therefore, one needs to compute distributed Fourier transforms. This is where AccFFT comes into the picture. This step shows you how to use AccFFT to solve the Poisson problem in parallel on both CPU and GPU. Moreover, you will see how Pierre’s utility function can be used to write the distributed data using PNETCDF library.

Poisson equation is defined as follows:

where $\rho$ can be thought of as the force distributed along a surface, and $\phi$ to be the resulting deformation. Now usually the force field is given and one needs to compute $\phi$, which is given by:

For a simple geometry applying the inverse of the Laplace operator is straightforward with spectral analysis. If we take the Fourier transform of both sides we will have:

Here the hats refer to the variable in frequency domain, $\omega$ is the wave number, and $\mathcal{F}^{-1}$ is the inverse Fourier transform. In 3D the same thing applies and $\phi$ can be computed as follows:

Note that $\omega$ cannot be zero. This is because one cannot determine the constant value in $\phi$ without using additional data such as boundary conditions. This is evident since any $\phi=\phi + constant$ satisfies the governing equation. For this reason we have to set $\frac{1}{\omega^2}=0$ in our numerical solver, and then calibrate the constant by using additional information.

So to recap:

• Compute FFT of $\rho$
• Apply Poisson Fourier filter of $\frac{-1}{\omega_x^2 +\omega_y^2+\omega_z^2}$
• Compute IFFT of the result

Code Explanation

The code is self explanatory and specially if you have read the previous steps you can easily understand what is going on by skimming thorough it. Here I want to comment on a new utility of function of AccFFT which you can use to write distributed data in a NetCDF file format for visualization. For example in the code you will find the following call to write the numerical result for $\phi$:

You need to pass the output file name, the istart offset and the size of the data that resides in each process (computed for you by calling accfft_local_size_dft_* function), the global sizes, and of course a pointer to the data itself. The output will be a NetCDF file that can be read by a myriad of visualization softwares including Paraview.

Results

There are three test cases in the code for the force function $\rho$:

• Test case 0: $\rho(x,y,z) = sin(2\pi x/L_x)sin(2\pi y/L_y)sin(2\pi z/L_z)$
• Test case 1: $\rho(x,y,z) = (4\alpha^2 (x^2+y^2+z^2)-6\alpha)e^{-\alpha(x^2+y^2+z^2)}$
• Test case 2: $% $

Test case 0, is a harmonic force, test case 1 is an exponential force which becomes sharper as $\alpha$ increases, and test case 2 is a discontinuous spherical force. You can specify the test case by passing “-t #” to the executable. The complete options are as follows:

• x,y,z: global domain sizes.
• t: testcase
• m: method
• a,b,c,r: uniform ball parameter (center location + radius)
• l: $\alpha$

Now the method option, m, is by default the method explained above. However, you can use a different approach (still using Fourier method). Instead of taking a Fourier transform from both sides of the Laplace operator, one can use Finite Differences to compute derivatives and then compute the FFT. Then from the orthogonality of the Fourier basis you can compute the Fourier coefficients of $\phi$. This method has a lower accuracy but it is nice to see how it works. For details please see section 19.1 and 19.4 of the numerical bible.

Alright, so on Maverick I get the following results:

And with the Finite Difference method:

As you can see the full spectral method gets a better accuracy in this case. I suggest you try other test cases and for example vary the radius of the discontinuous spherical force in test case 2, or vary $\alpha$ in test case 1 and compare the results.

If you have installed AccFFT with PNETCDF you should see three netCDF files in the step4 directory. These include the force term written to rho.nc, the exact phi written to phi_exact.nc, and the numerically computed phi written to phi.nc. If you have Paraview installed on your machine, you can load these directly by using “NetCDF files generic and CF conventions” plug-in.

Comparison with FEM and FMM

In case you were interested in how this problem can be solved with Finited Element Method or Fast Multipole Method, be sure to check out this paper. We have compared parallel performance and accuracy of these methods with the spectral one. Test case 0 and a variant of test case 1 are used with different exponential powers. The conclusion of the study is that for a target accuracy, the spectral method is faster with a large margin compared to FEM/FMM. This is true, unless the force term is extremely sharp. In that case, FEM/FMM may be faster because they can use adaptive discretization, while the classical FFT can only use uniform mesh.

Complete CPU Code

Here is the Poisson solver of the CPU code. The full code is available here.