Changes between Version 5 and Version 6 of doc/tec/gpu
Timestamp: Mar 10, 2013 3:18:25 AM
Legend: unprefixed lines are unmodified; lines prefixed with "-" were removed in v6; lines prefixed with "+" were added in v6 (a modified line appears as a "-"/"+" pair).
doc/tec/gpu (v5 → v6)

…
  * cyclic lateral boundary conditions
  * no humidity / cloud physics
- * no topography
+ * no canopy model
  * no Lagrangian particle model
…
  /home/raasch/current_version/JOBS/gputest/INPUT/gputest_p3d
  }}}
- Please note that {{{loop_optimization = 'acc'}}} and {{{fft_method = 'system-specific'}}} have to be set. Results of the tests are stored in the respective {{{MONITORING}}} directory.
+ Please note that {{{loop_optimization = 'acc'}}}, {{{psolver = 'poisfft'}}}, and {{{fft_method = 'system-specific'}}} have to be set. Results of the tests are stored in the respective {{{MONITORING}}} directory.

  '''Report on current activities:'''
…
  The pressure solver (including the tridiagonal solver) has been almost completely ported. Still missing are the calculations in {{{pres}}}. \\
  A CUDA fft has been implemented. \\
- GPU can also been used in the single-core (non-MPI-parallel) version.
+ The GPU can also be used in the single-core (non-MPI-parallel) version.
+
+ r1113 \\
+ In single-core mode, the lateral boundary conditions now run completely on the device. Most loops in {{{pres}}} have been ported. The vertical boundary conditions ({{{boundary_conds}}}) have been ported.
…
  '''Results for the 512x512x64 grid (time in microseconds per grid point and timestep):''' \\
…
  '''Next steps:'''

- * testing the newest PGI 13.2 compiler version, porting of reduction operations (especially in {{{flow_statistics}}}), checking the capability of parallel regions
- * update ghost boundaries only, overlapping of update/MPI transfer and computation
- * remove the host/device data transfer for the single-core version, still required for the cyclic boundary conditions, in order to run the code completely on one GPU
+ * testing the newest PGI 13.2 compiler version, porting of reduction operations ({{{timestep}}}, {{{flow_statistics}}}, divergence in {{{pres}}}), checking the capability of parallel regions (can IF-constructs be removed from inner loops?)
+ * for the MPI mode, updating ghost boundaries only and overlapping the update/MPI transfer with computation
  * overlapping communication in the pressure solver (alltoall operations)
  * porting of the remaining parts (averaging, I/O, etc.)
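
'''Illustrative sketches:'''

For orientation, the three settings required by the note above go into the {{{gputest_p3d}}} parameter file. This is a minimal sketch showing only these assignments; the enclosing namelist group and all other parameters are omitted here:

{{{
 loop_optimization = 'acc',
 psolver           = 'poisfft',
 fft_method        = 'system-specific',
}}}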
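The r1113 note says that, in single-core mode, the lateral boundary conditions run completely on the device. The following self-contained OpenACC Fortran sketch illustrates that idea; it is not the actual PALM code, and the array name {{{u}}}, the bounds {{{nxl}}}/{{{nxr}}}, and the (k,j,i) index order are assumptions made for illustration:

{{{
! Hedged sketch: cyclic boundary condition along x applied on the GPU.
! Names and bounds are made up for illustration, not taken from PALM.
PROGRAM cyclic_bc_sketch
   IMPLICIT NONE
   INTEGER, PARAMETER ::  nz = 8, ny = 8, nxl = 0, nxr = 7
   REAL ::  u(nz,ny,nxl-1:nxr+1)

   u = 0.0
   u(:,:,nxl) = 1.0
   u(:,:,nxr) = 2.0
!$acc data copy( u )
!$acc kernels
!--   The ghost layers are filled from the opposite interior columns,
!--   so no host/device transfer is needed for the cyclic condition
   u(:,:,nxl-1) = u(:,:,nxr)
   u(:,:,nxr+1) = u(:,:,nxl)
!$acc end kernels
!$acc end data
   PRINT*, u(1,1,nxl-1), u(1,1,nxr+1)   ! expected: 2.0  1.0
END PROGRAM cyclic_bc_sketch
}}}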
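One of the next steps is porting the reduction operations. With the OpenACC {{{reduction}}} clause such a reduction can, in principle, stay on the device. Below is a self-contained sketch of a divergence-like sum, roughly as it might look in {{{pres}}} or {{{flow_statistics}}}; all names and bounds are assumptions, not the PALM routines:

{{{
! Hedged sketch: device-side sum reduction over a divergence-like
! field d. All names and bounds are assumptions for illustration.
PROGRAM reduction_sketch
   IMPLICIT NONE
   INTEGER, PARAMETER ::  nxl = 0, nxr = 31, nys = 0, nyn = 31, nzb = 0, nzt = 32
   REAL    ::  d(nzb:nzt+1,nys:nyn,nxl:nxr)
   REAL    ::  sum_d
   INTEGER ::  i, j, k

   d = 1.0
   sum_d = 0.0
!$acc data copyin( d )
!$acc parallel loop collapse(3) reduction(+:sum_d)
   DO  i = nxl, nxr
      DO  j = nys, nyn
         DO  k = nzb+1, nzt
            sum_d = sum_d + ABS( d(k,j,i) )
         ENDDO
      ENDDO
   ENDDO
!$acc end data
   PRINT*, sum_d   ! expected: 32*32*32 = 32768.0
END PROGRAM reduction_sketch
}}}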
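The ghost-boundary item aims at overlapping the boundary update/MPI transfer with computation. The sketch below shows the intended pattern on a single GPU: the interior is computed asynchronously while only the boundary columns travel via the host; the MPI exchange itself is left as a commented placeholder. All names here are hypothetical, not from PALM:

{{{
! Hedged sketch of the planned overlap: asynchronous interior compute
! while only the lateral boundary columns are moved to/from the host.
PROGRAM overlap_sketch
   IMPLICIT NONE
   INTEGER, PARAMETER ::  nx = 64, ny = 64
   REAL    ::  a(ny,nx)
   INTEGER ::  i, j

   a = 1.0
!$acc data copy( a )
!$acc parallel loop collapse(2) async(1)
   DO  i = 2, nx-1                  ! interior points only
      DO  j = 1, ny
         a(j,i) = 2.0 * a(j,i)
      ENDDO
   ENDDO
!-- Only the boundary columns are updated, not the full array;
!-- in PALM the MPI ghost exchange would run here on the host
!$acc update host( a(:,1), a(:,nx) )
!  CALL exchange_ghost_layers( a )   ! hypothetical placeholder
!$acc update device( a(:,1), a(:,nx) )
!$acc wait(1)                        ! join before boundary-dependent work
!$acc end data
   PRINT*, a(1,1), a(1,2)   ! expected: 1.0 (boundary)  2.0 (interior)
END PROGRAM overlap_sketch
}}}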