Version 6 (modified by knoop, 8 years ago)
Executing PALM on GPUs
PALM was first ported to GPUs in the course of its participation in the SPEC-ACCEL benchmark. This initial GPU-capable PALM version is r1257. A full GPU port was carried out in 2016, and the source code is available here. A detailed discussion of the porting process and its outcome can be found in Knoop et al. (2016).
To compile PALM with, for example, the PGI 16.5 compiler for an NVIDIA Tesla GPU with CUDA 7.5 and OpenMPI 1.10.3, use the following settings:
- Preprocessor options: -Mpreprocess -DMPI_REAL=MPI_DOUBLE_PRECISION -DMPI_2REAL=MPI_2DOUBLE_PRECISION -D__parallel -D__nopointer -D__lc -D__cuda_fft
- Compiler flags: -fastsse -acc -ta=tesla -Minfo=acc -Mcuda
- Linker flags: -fastsse -acc -ta=tesla -Minfo=acc -Mcuda -lcufft
Don't forget to compile and link with the mpif90 wrapper of your MPI installation so that the MPI libraries are linked correctly.
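As a rough illustration of where each group of settings above ends up, the following sketch assembles one compile command and one link command for the mpif90 wrapper. The file names example.f90, example.o and palm_gpu are hypothetical placeholders, and the script only prints the commands instead of running them. It is not PALM's actual build procedure, just a way to see the flag groups in context.

```shell
#!/bin/sh
# Flag groups copied from the list above.
CPPFLAGS="-Mpreprocess -DMPI_REAL=MPI_DOUBLE_PRECISION -DMPI_2REAL=MPI_2DOUBLE_PRECISION -D__parallel -D__nopointer -D__lc -D__cuda_fft"
FCFLAGS="-fastsse -acc -ta=tesla -Minfo=acc -Mcuda"
LDFLAGS="-fastsse -acc -ta=tesla -Minfo=acc -Mcuda -lcufft"

# Hypothetical file names, used for illustration only.
SRC="example.f90"
OBJ="example.o"
EXE="palm_gpu"

# Compile step: preprocessor options and compiler flags go together.
COMPILE_CMD="mpif90 $CPPFLAGS $FCFLAGS -c $SRC"

# Link step: object files first, then the linker flags (including -lcufft).
LINK_CMD="mpif90 -o $EXE $OBJ $LDFLAGS"

# Print the commands rather than executing them.
echo "$COMPILE_CMD"
echo "$LINK_CMD"
```

To check which compiler the wrapper actually calls, OpenMPI's mpif90 accepts the --showme option, which prints the underlying command line without running it.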