{{draft}} OpenACC makes it relatively easy to offload vectorized code to accelerators such as GPUs, for example. Unlike [[CUDA]] and OpenCL where kernels need to be coded explicitly, OpenACC minimizes the amount of modifications to do on a serial or [[OpenMP]] code. The compiler converts the OpenACC code into a binary executable that can make use of accelerators. The performance of OpenACC codes can be similar to the one of a [[CUDA]] code, except that OpenACC requires less code development. = OpenACC directives = Similar to [[OpenMP]], OpenACC can convert a for loop into parallel code that would run on an accelerator. This can be achieved with compiler directives #pragma acc ... before structured blocks of code like, for example, a for loop. All supported pragma directives are described in the [https://www.openacc.org/specification OpenACC specification]. = Code examples = OpenACC can be used in [[Fortran]], [[C]] and [[C++]], which we illustrate here using a simple program that computes a decimal approximation to π based on a definite integral which is equal to arctan(1), i.e. π/4. {{File |name=pi.c |lang="C" |contents= #include const int vl = 512; const long long N = 2000000000; int main(int argc,char** argv) { double pi = 0.0f; long long i; #pragma acc parallel vector_length(vl) #pragma acc loop reduction(+:pi) for (i = 0; i < N; i++) { double t = (double)((i + 0.5) / N); pi += 4.0 / (1.0 + t * t); } printf("pi = %11.10f\n", pi / N); return 0; } }} {{File |name=pi.cxx |lang="C++" |contents= #include #include const int vl = 512; const long long N = 2000000000; int main(int argc,char** argv) { double pi = 0.0f; long long i; #pragma acc parallel vector_length(vl) #pragma acc loop reduction(+:pi) for (i = 0; i < N; i++) { double t = double((i + 0.5) / N); pi += 4.0/(1.0 + t * t); } std::cout << std::fixed; std::cout << std::setprecision(10); std::cout << "pi = " << pi/double(N) << std::endl; return 0; } }} = Compilers = == PGI == * Module pgi, any version from 13.10 ** Newer versions support newest GPU capabilities. Compilation example: # TODO == GCC == * Module gcc, any version from 9.3.0 ** Newer versions support newest GPU capabilities. Compilation example: gcc -fopenacc -march=native -O3 pi.c -o pi = Tutorial = See our [[OpenACC_Tutorial]]. = References = * [https://www.openacc.org/sites/default/files/inline-images/Specification/OpenACC-3.1-final.pdf OpenACC official documentation - Specification 3.1 (PDF)] * [https://www.nvidia.com/docs/IO/116711/OpenACC-API.pdf NVIDIA OpenACC API - Quick Reference Guide (PDF)]