AccFFT
operators_gpu.cpp File Reference
#include <mpi.h>
#include <omp.h>
#include <iostream>
#include <cmath>
#include <math.h>
#include <string.h>
#include <cuda_runtime_api.h>
#include <accfft_gpu.h>
#include <../src/operators_gpu.txx>

Functions

void accfft_grad_gpu (double *A_x, double *A_y, double *A_z, double *A, accfft_plan_gpu *plan, std::bitset< 3 > *pXYZ, double *timer)
 
void accfft_laplace_gpu (double *LA, double *A, accfft_plan_gpu *plan, double *timer)
 
void accfft_divergence_gpu (double *divA, double *A_x, double *A_y, double *A_z, accfft_plan_gpu *plan, double *timer)
 
void accfft_biharmonic_gpu (double *BA, double *A, accfft_plan_gpu *plan, double *timer)
 

Detailed Description

GPU functions of AccFFT operators

Function Documentation

void accfft_biharmonic_gpu (double *BA, double *A, accfft_plan_gpu *plan, double *timer)

Computes the double-precision biharmonic of the real input data A and writes the output into BA. All arrays must reside on the device (i.e. GPU) and must have been allocated with the proper size using cudaMalloc.

Parameters
BA: $\Delta^2 A$
plan: FFT plan created by accfft_plan_dft_3d_r2c_gpu. Must be an out-of-place plan; otherwise the function returns without computing the biharmonic.
timer: See Timing AccFFT for more details.
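
For orientation, the biharmonic is the Laplacian applied twice. The Fourier-space form below is a sketch that assumes the operator is evaluated spectrally on a periodic domain with wave-number vector $\kappa$; neither the symbol $\kappa$ nor that assumption appears on this page.

```latex
\Delta^2 A = \Delta(\Delta A),
\qquad
\widehat{\Delta^2 A}(\kappa) = |\kappa|^4 \, \hat{A}(\kappa),
\qquad
|\kappa|^2 = \kappa_x^2 + \kappa_y^2 + \kappa_z^2
```
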

void accfft_divergence_gpu (double *divA, double *A_x, double *A_y, double *A_z, accfft_plan_gpu *plan, double *timer)

Computes the double-precision divergence of the input vector data A_x, A_y, and A_z and writes the output to divA. All arrays must reside on the device (i.e. GPU) and must have been allocated with the proper size using cudaMalloc.

Parameters
divA: $\nabla\cdot(A_x i + A_y j + A_z k)$
A_x: The x component of the input vector field
A_y: The y component of the input vector field
A_z: The z component of the input vector field
plan: FFT plan created by accfft_plan_dft_3d_r2c_gpu. Must be an out-of-place plan; otherwise the function returns without computing the divergence.
timer: See Timing AccFFT for more details.
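
Written out, the quantity stored in divA is the sum of the three partial derivatives. The second relation is a sketch of the same operator in Fourier space, assuming a spectral evaluation on a periodic domain; the wave-number symbol $\kappa$ is introduced here only for illustration and is not part of this page.

```latex
\nabla\cdot(A_x i + A_y j + A_z k)
  = \frac{\partial A_x}{\partial x}
  + \frac{\partial A_y}{\partial y}
  + \frac{\partial A_z}{\partial z},
\qquad
\widehat{\nabla\cdot A}(\kappa)
  = \sqrt{-1}\,\bigl(\kappa_x \hat{A}_x + \kappa_y \hat{A}_y + \kappa_z \hat{A}_z\bigr)
```
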

void accfft_grad_gpu (double *A_x, double *A_y, double *A_z, double *A, accfft_plan_gpu *plan, std::bitset< 3 > *pXYZ, double *timer)

Computes the double-precision gradient of the real input data A and writes its x, y, and z components into A_x, A_y, and A_z, respectively. All arrays must reside on the device (i.e. GPU) and must have been allocated with the proper size using cudaMalloc.

Parameters
A_x: The x component of $\nabla A$
A_y: The y component of $\nabla A$
A_z: The z component of $\nabla A$
plan: FFT plan created by accfft_plan_dft_3d_r2c_gpu. Must be an out-of-place plan; otherwise the function returns without computing the gradient.
pXYZ: Pointer to a std::bitset<3> that selects which gradient components are computed. If XYZ={111}, all three components are computed; if XYZ={100}, only the x component is computed. Requesting only the needed components saves time when just one or two of them are required.
timer: See Timing AccFFT for more details.
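
The component mask described for pXYZ can be built with the standard std::bitset interface. The sketch below uses a hypothetical helper, make_grad_mask, which is not part of AccFFT. It also assumes that bit 0 selects x, bit 1 selects y, and bit 2 selects z; this page does not state the bit order, so that mapping should be checked against the AccFFT sources.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical helper (not part of AccFFT): builds the mask that would be
// passed to accfft_grad_gpu through the pXYZ argument.
// ASSUMPTION: bit 0 -> x, bit 1 -> y, bit 2 -> z. The bit order is not
// documented on this page; verify it before relying on this mapping.
std::bitset<3> make_grad_mask(bool want_x, bool want_y, bool want_z) {
    std::bitset<3> mask;   // all bits start cleared
    mask[0] = want_x;
    mask[1] = want_y;
    mask[2] = want_z;
    return mask;
}

// With a valid plan and device buffers, a call requesting only the x
// component would then look like this (signature taken from this page):
//   std::bitset<3> only_x = make_grad_mask(true, false, false);
//   accfft_grad_gpu(A_x, A_y, A_z, A, plan, &only_x, timer);
```

Setting the bits by index rather than through a string such as "100" avoids any ambiguity about which end of the string maps to which component.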

void accfft_laplace_gpu (double *LA, double *A, accfft_plan_gpu *plan, double *timer)

Computes the double-precision Laplacian of the real input data A and writes the output into LA. All arrays must reside on the device (i.e. GPU) and must have been allocated with the proper size using cudaMalloc.

Parameters
LA: $\Delta A$
plan: FFT plan created by accfft_plan_dft_3d_r2c_gpu. Must be an out-of-place plan; otherwise the function returns without computing the Laplacian.
timer: See Timing AccFFT for more details.
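
To make the idea of an FFT-based Laplacian concrete, here is a minimal one-dimensional CPU sketch. It is not AccFFT code and does not use the GPU. It assumes a periodic signal sampled uniformly on [0, 2*pi) and uses a naive O(n^2) DFT so that it needs no external library. The function name spectral_laplacian_1d is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Illustrative sketch only: Laplacian of a periodic 1D signal, computed by
// scaling each Fourier mode by -(k*k). ASSUMPTION: samples lie on a uniform
// grid over [0, 2*pi), so the wave numbers are integers.
std::vector<double> spectral_laplacian_1d(const std::vector<double>& a) {
    const std::size_t n = a.size();
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<std::complex<double>> a_hat(n);

    // Forward DFT: a_hat[m] = sum_j a[j] * exp(-i * 2*pi * m * j / n).
    for (std::size_t m = 0; m < n; ++m) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            const double phase = -two_pi * double(m) * double(j) / double(n);
            sum += a[j] * std::complex<double>(std::cos(phase), std::sin(phase));
        }
        a_hat[m] = sum;
    }

    // Apply the Laplacian symbol. Indices above n/2 hold negative wave numbers.
    for (std::size_t m = 0; m < n; ++m) {
        const double k = (m <= n / 2) ? double(m) : double(m) - double(n);
        a_hat[m] *= -(k * k);
    }

    // Inverse DFT with 1/n normalisation; keep the real part, since the
    // imaginary part is only round-off for real input.
    std::vector<double> la(n);
    for (std::size_t j = 0; j < n; ++j) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t m = 0; m < n; ++m) {
            const double phase = two_pi * double(m) * double(j) / double(n);
            sum += a_hat[m] * std::complex<double>(std::cos(phase), std::sin(phase));
        }
        la[j] = sum.real() / double(n);
    }
    return la;
}
```

For the input sin(2x), the result should match the analytic Laplacian -4 sin(2x) up to round-off, which is a convenient sanity check for any spectral implementation.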