.mrun.config.imuk_gpu

Timestamp:

Mar 8, 2013 11:54:10 PM

Author:

raasch

Message:

New:
---

GPU porting of pres, swap_timelevel. Adjustments of openACC directives.
Further porting of poisfft, which now runs completely on GPU without any
host/device data transfer for serial and parallel runs (but parallel runs
require data transfer before and after the MPI transpositions).
GPU-porting of tridiagonal solver:
tridiagonal routines split into external subroutines (instead of using CONTAINS),
no distinction between parallel/non-parallel in poisfft and tridia any more,
tridia routines moved to end of file because of a probable bug in the PGI compiler
(otherwise "invalid device function" is reported at runtime).
(cuda_fft_interfaces, fft_xy, flow_statistics, init_3d_model, palm, poisfft, pres, prognostic_equations, swap_timelevel, time_integration, transpose)
output of accelerator board information. (header)

optimization of tridia routines: constant elements and coefficients of tri are
stored in separate arrays ddzuw and tric, last dimension of tri reduced from 5 to 2,
(init_grid, init_3d_model, modules, palm, poisfft)

poisfft_init is now called internally from poisfft,
(Makefile, Makefile_check, init_pegrid, poisfft, poisfft_hybrid)

CPU-time per grid point and timestep is output to CPU_MEASURES file
(cpu_statistics, modules, time_integration)

Changed:

resorting from/to array work changed, work now has 4 dimensions instead of 1 (transpose)
array diss allocated only if required (init_3d_model)

pressure boundary condition "Neumann+inhomo" removed from the code
(check_parameters, header, poisfft, poisfft_hybrid, pres)

Errors:

bugfix: dependency added for cuda_fft_interfaces (Makefile)
bugfix: CUDA fft plans adjusted for domain decomposition (before they always
used total domain) (fft_xy)

File:

1 edited

palm/trunk/SCRIPTS/.mrun.config.imuk_gpu (modified) (3 diffs, 1 prop)

palm/trunk/SCRIPTS/.mrun.config.imuk_gpu

Property svn:keywords set to Id

-                      r1016
+                      r1111
+#$Id$
 #column 1          column 2                                   column 3
 #name of variable  value of variable (~ must not be used)     scope
 …
 %add_source_path   $base_directory/USER_CODE/$fname
 %depository_path   $base_directory/MAKE_DEPOSITORY
-#%use_makefile      true
+#
-# Enter your own host below by adding another line containing in the second
-# column your hostname (as provided by the unix command "hostname") and in the
-# third column the host identifier. Depending on your operating system, the
-# first characters of the host identifier should be "lc" (Linux cluster), "ibm"
-# (IBM-AIX), or "nec" (NEC-SX), respectively.
+#
 %host_identifier   inferno      lcmuk
+#
+# version 27/09/2012
+#
+# pure MPI version
 %remote_username   <replace by your IMUK username>               lcmuk parallel pgi
 %tmp_user_catalog  /localdata                                    lcmuk parallel pgi
 …
 %lopts             -Mcray=pointer:-fastsse:-r8                   lcmuk parallel pgi
+#
+# pure MPI version with debug options
+%remote_username   <replace by your IMUK username>               lcmuk parallel pgidbg
+%tmp_user_catalog  /localdata                                    lcmuk parallel pgidbg
+%compiler_name     mpif90                                        lcmuk parallel pgidbg
+%compiler_name_ser pgf90                                         lcmuk parallel pgidbg
+%cpp_options       -Mpreprocess:-DMPI_REAL=MPI_DOUBLE_PRECISION:-DMPI_2REAL=MPI_2DOUBLE_PRECISION:-D__nopointer   lcmuk parallel pgidbg
+%mopts             -j:4                                          lcmuk parallel pgidbg
+%fopts             -Mcray=pointer:-O0:-C:-g:-Mbounds:-Mchkstk:-traceback:-r8   lcmuk parallel pgidbg
+%lopts             -Mcray=pointer:-O0:-C:-g:-Mbounds:-Mchkstk:-traceback:-r8   lcmuk parallel pgidbg
+#
+# pure GPU version
+%remote_username   <replace by your IMUK username>                       lcmuk pgigpu
+%tmp_user_catalog  /localdata                                            lcmuk pgigpu
+%compiler_name     pgf90                                                 lcmuk pgigpu
+%compiler_name_ser pgf90                                                 lcmuk pgigpu
+%cpp_options       -Mpreprocess:-D__nopointer:-D__openacc:-D__cuda_fft   lcmuk pgigpu
+%mopts             -j:4                                                  lcmuk pgigpu
+%fopts             -acc:-ta=nvidia,4.1:-Minfo=acc:-Mcray=pointer:-fastsse:-r8:-Mcuda    lcmuk pgigpu
+%lopts             -acc:-ta=nvidia,4.1:-Minfo=acc:-Mcray=pointer:-fastsse:-r8:-Mcuda:-L/localdata/opt/pgi/linux86-64/2012/cuda/4.1/lib64:-lcufft    lcmuk pgigpu
+#
+# MPI+GPU
 %remote_username   <replace by your IMUK username>               lcmuk parallel pgigpu
 %tmp_user_catalog  /localdata                                    lcmuk parallel pgigpu
 %compiler_name     mpif90                                        lcmuk parallel pgigpu
 %compiler_name_ser pgf90                                         lcmuk parallel pgigpu
-%cpp_options       -Mpreprocess:-DMPI_REAL=MPI_DOUBLE_PRECISION:-DMPI_2REAL=MPI_2DOUBLE_PRECISION:-D__nopointer:-D__openacc   lcmuk parallel pgigpu
+%cpp_options       -Mpreprocess:-DMPI_REAL=MPI_DOUBLE_PRECISION:-DMPI_2REAL=MPI_2DOUBLE_PRECISION:-D__nopointer:-D__openacc:-D__cuda_fft   lcmuk parallel pgigpu
 %mopts             -j:4                                          lcmuk parallel pgigpu
-%fopts             -acc:-ta=nvidia,4.1:-Minfo=acc:-Mcray=pointer:-fastsse:-r8        lcmuk parallel pgigpu
-%lopts             -acc:-ta=nvidia,4.1:-Minfo=acc:-Mcray=pointer:-fastsse:-r8        lcmuk parallel pgigpu
+%fopts             -acc:-ta=nvidia,4.1:-Minfo=acc:-Mcray=pointer:-fastsse:-r8:-Mcuda    lcmuk parallel pgigpu
+%lopts             -acc:-ta=nvidia,4.1:-Minfo=acc:-Mcray=pointer:-fastsse:-r8:-Mcuda:-L/localdata/opt/pgi/linux86-64/2012/cuda/4.1/lib64:-lcufft   lcmuk parallel pgigpu
+#
 %write_binary                true                             restart
+#
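
The three-column layout described in the file header (variable name, value, scope) can be illustrated with a small reader. This is only a sketch in Python, not the way mrun itself parses the file: the function names parse_config_line and expand_options are hypothetical, columns are assumed to be separated by two or more blanks, and the colons in a value are assumed to stand in for blanks between individual options.

```python
import re


def parse_config_line(line):
    """Split one configuration line into (name, value, scope).

    Returns None for lines that do not start with '%' (comments, blanks).
    Assumption: columns are separated by at least two blanks, so a value
    such as '<replace by your IMUK username>' stays in one piece.
    """
    stripped = line.strip()
    if not stripped.startswith("%"):
        return None
    columns = re.split(r"\s{2,}", stripped)
    name = columns[0][1:]                      # drop the leading '%'
    value = columns[1] if len(columns) > 1 else ""
    scope = columns[2].split() if len(columns) > 2 else []
    return name, value, scope


def expand_options(value):
    """Split a colon-separated value into single options (assumed convention)."""
    return value.split(":")


line = ("%fopts             -acc:-ta=nvidia,4.1:-Minfo=acc:-Mcray=pointer:"
        "-fastsse:-r8:-Mcuda    lcmuk pgigpu")
name, value, scope = parse_config_line(line)
print(name, scope)
print(expand_options(value))
```

Read this way, the scope column is what distinguishes the blocks in this file: "lcmuk parallel pgi", "lcmuk parallel pgidbg", "lcmuk pgigpu" and "lcmuk parallel pgigpu" each select a different set of compiler and linker options for the same host identifier.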

Changeset 1111 for palm/trunk/SCRIPTS/.mrun.config.imuk_gpu