Changes between Version 259 and Version 260 of doc/tec/changelog_2018


Timestamp:
Sep 10, 2013 9:51:44 AM
Author:
raasch
Comment:

--

  • doc/tec/changelog_2018

    v259 v260  
    1111
    1212||='''Date'''  =||='''Author'''  =||='''svn\\Revision'''  =||='''Last\\Release'''  =||='''Type''' =||='''Description''' =||
     13|----------------
     14{{{#!td style="vertical-align:top;width: 50px"
     1510/09/13
     16}}}
     17{{{#!td style="vertical-align:top;width: 50px"
     18SR
     19}}}
     20{{{#!td style="vertical-align:top;width: 75px"
     21r1221
     22}}}
     23{{{#!td style="vertical-align:top"
     243.9
     25}}}
     26{{{#!td style="vertical-align:top"
     27N, C, B
     28}}}
     29{{{#!td style="vertical-align:top"
     30'''New:'''\\
     31OpenACC porting of reduction operations. An accelerator version of {{{flow_statistics}}} with the modified loop structure k,i,j has been implemented. It is activated with the preprocessor flag {{{-D__openacc}}}. The separate accelerator version is required because, so far, the OpenACC standard only allows reduction operations on simple scalars. Since {{{flow_statistics}}} uses 1D vectors along k, these had to be replaced by scalars, and the k loop now has to be the outermost loop. Additional 3D flag arrays have been introduced to replace the 2D index arrays {{{nzb_s_inner}}} and {{{nzb_diff_s_inner}}} in routines {{{pres}}} and {{{flow_statistics}}}. The respective "global-sum" loops run from {{{k = nzb}}}. Within the loops, values at grid points below the surface (topography) are multiplied by zero, all others by one, using the flag array {{{rflags_invers}}}. This array is dimensioned (j,i,k) to allow for better cache usage in the loops of the accelerator version of {{{flow_statistics}}}.
     32(flow_statistics, init_grid, init_3d_model, modules, palm, pres, time_integration)
     33
     34
     35'''Changed:'''\\
     36For PGI/OpenACC performance reasons (PGI compiler version 13.6, CUDA 5.0), the default compile options have been set to "{{{-ta=nocache}}}", which gives a speed-up of about 10-20%. For the same reason, the environment variable {{{PGI_ACC_SYNCHRONOUS}}} is set to 1 in the simple run script, which improves performance by about 80%.
     37(MAKE.inc.pgi.openacc, palm_simple_run)
     38
     39The type of the flag array {{{wall_flags_0}}}, used in the Wicker-Skamarock scheme for advection of the vertical wind component, has been changed to 32-bit {{{INTEGER}}}. An additional array {{{wall_flags_00}}} has been introduced to hold flag bits 32-63. This is required because the formerly used {{{KIND = SELECTED_INT_KIND(11)}}} caused wrong results with OpenACC.
     40(advec_ws, init_grid, modules, palm)
     41
     42'''Bugfix:'''\\
     43Dummy argument {{{tri}}} in the 1D routines has been replaced by {{{tri_for_1d}}} because of a name conflict with array {{{tri}}} in module {{{arrays_3d}}}. (tridia_solver)
     44}}}
    1345|----------------
    1446{{{#!td style="vertical-align:top;width: 50px"