source: palm/trunk/SOURCE/pres.f90 @ 1111

Last change on this file since 1111 was 1111, checked in by raasch, 9 years ago

New:
---

GPU porting of pres, swap_timelevel. Adjustments of openACC directives.
Further porting of poisfft, which now runs completely on GPU without any
host/device data transfer for serial an parallel runs (but parallel runs
require data transfer before and after the MPI transpositions).
GPU-porting of tridiagonal solver:
tridiagonal routines split into extermal subroutines (instead using CONTAINS),
no distinction between parallel/non-parallel in poisfft and tridia any more,
tridia routines moved to end of file because of probable bug in PGI compiler
(otherwise "invalid device function" is indicated during runtime).
(cuda_fft_interfaces, fft_xy, flow_statistics, init_3d_model, palm, poisfft, pres, prognostic_equations, swap_timelevel, time_integration, transpose)
output of accelerator board information. (header)

optimization of tridia routines: constant elements and coefficients of tri are
stored in seperate arrays ddzuw and tric, last dimension of tri reduced from 5 to 2,
(init_grid, init_3d_model, modules, palm, poisfft)

poisfft_init is now called internally from poisfft,
(Makefile, Makefile_check, init_pegrid, poisfft, poisfft_hybrid)

CPU-time per grid point and timestep is output to CPU_MEASURES file
(cpu_statistics, modules, time_integration)

Changed:


resorting from/to array work changed, work now has 4 dimensions instead of 1 (transpose)
array diss allocated only if required (init_3d_model)

pressure boundary condition "Neumann+inhomo" removed from the code
(check_parameters, header, poisfft, poisfft_hybrid, pres)

Errors:


bugfix: dependency added for cuda_fft_interfaces (Makefile)
bugfix: CUDA fft plans adjusted for domain decomposition (before they always
used total domain) (fft_xy)

  • Property svn:keywords set to Id
File size: 21.3 KB
Line 
1 SUBROUTINE pres
2
3!--------------------------------------------------------------------------------!
4! This file is part of PALM.
5!
6! PALM is free software: you can redistribute it and/or modify it under the terms
7! of the GNU General Public License as published by the Free Software Foundation,
8! either version 3 of the License, or (at your option) any later version.
9!
10! PALM is distributed in the hope that it will be useful, but WITHOUT ANY
11! WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
12! A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
13!
14! You should have received a copy of the GNU General Public License along with
15! PALM. If not, see <http://www.gnu.org/licenses/>.
16!
17! Copyright 1997-2012  Leibniz University Hannover
18!--------------------------------------------------------------------------------!
19!
20! Current revisions:
21! -----------------
22! openACC statements added,
23! ibc_p_b = 2 removed
24!
25! Former revisions:
26! -----------------
27! $Id: pres.f90 1111 2013-03-08 23:54:10Z raasch $
28!
29! 1092 2013-02-02 11:24:22Z raasch
30! unused variables removed
31!
32! 1036 2012-10-22 13:43:42Z raasch
33! code put under GPL (PALM 3.9)
34!
35! 1003 2012-09-14 14:35:53Z raasch
36! adjustment of array tend for cases with unequal subdomain sizes removed
37!
38! 778 2011-11-07 14:18:25Z fricke
39! New allocation of tend when multigrid is used and the collected field on PE0
40! has more grid points than the subdomain of an PE.
41!
42! 719 2011-04-06 13:05:23Z gryschka
43! Bugfix in volume flow control for double cyclic boundary conditions
44!
45! 709 2011-03-30 09:31:40Z raasch
46! formatting adjustments
47!
48! 707 2011-03-29 11:39:40Z raasch
49! Calculation of weighted average of p is now handled in the same way
50! regardless of the number of ghost layers (advection scheme),
51! multigrid and sor method are using p_loc for iterative advancements of
52! pressure,
53! localsum calculation modified for proper OpenMP reduction,
54! bc_lr/ns replaced by bc_lr/ns_cyc
55!
56! 693 2011-03-08 09:..:..Z raasch
57! bugfix: weighting coefficient added to ibm branch
58!
59! 680 2011-02-04 23:16:06Z gryschka
60! bugfix: collective_wait
61!
62! 675 2011-01-19 10:56:55Z suehring
63! Removed bugfix while copying tend.
64!
65! 673 2011-01-18 16:19:48Z suehring
66! Weighting coefficients added for right computation of the pressure during
67! Runge-Kutta substeps.
68!
69! 667 2010-12-23 12:06:00Z suehring/gryschka
70! New allocation of tend when ws-scheme and multigrid is used. This is due to
71! reasons of perforance of the data_exchange. The same is done with p after
72! poismg is called.
73! nxl-1, nxr+1, nys-1, nyn+1 replaced by nxlg, nxrg, nysg, nyng when no
74! multigrid is used. Calls of exchange_horiz are modified.
75! bugfix: After pressure correction no volume flow correction in case of
76! non-cyclic boundary conditions
77! (has to be done only before pressure correction)
78! Call of SOR routine is referenced with ddzu_pres.
79!
80! 622 2010-12-10 08:08:13Z raasch
81! optional barriers included in order to speed up collective operations
82!
83! 151 2008-03-07 13:42:18Z raasch
84! Bugfix in volume flow control for non-cyclic boundary conditions
85!
86! 106 2007-08-16 14:30:26Z raasch
87! Volume flow conservation added for the remaining three outflow boundaries
88!
89! 85 2007-05-11 09:35:14Z raasch
90! Division through dt_3d replaced by multiplication of the inverse.
91! For performance optimisation, this is done in the loop calculating the
92! divergence instead of using a seperate loop.
93!
94! 75 2007-03-22 09:54:05Z raasch
95! Volume flow control for non-cyclic boundary conditions added (currently only
96! for the north boundary!!), 2nd+3rd argument removed from exchange horiz,
97! mean vertical velocity is removed in case of Neumann boundary conditions
98! both at the bottom and the top
99!
100! RCS Log replace by Id keyword, revision history cleaned up
101!
102! Revision 1.25  2006/04/26 13:26:12  raasch
103! OpenMP optimization (+localsum, threadsum)
104!
105! Revision 1.1  1997/07/24 11:24:44  raasch
106! Initial revision
107!
108!
109! Description:
110! ------------
111! Compute the divergence of the provisional velocity field. Solve the Poisson
112! equation for the perturbation pressure. Compute the final velocities using
113! this perturbation pressure. Compute the remaining divergence.
114!------------------------------------------------------------------------------!
115
116    USE arrays_3d
117    USE constants
118    USE control_parameters
119    USE cpulog
120    USE grid_variables
121    USE indices
122    USE interfaces
123    USE pegrid
124    USE poisfft_mod
125    USE poisfft_hybrid_mod
126    USE statistics
127
128    IMPLICIT NONE
129
130    INTEGER ::  i, j, k
131
132    REAL    ::  ddt_3d, localsum, threadsum, d_weight_pres
133
134    REAL, DIMENSION(1:2) ::  volume_flow_l, volume_flow_offset
135    REAL, DIMENSION(1:nzt) ::  w_l, w_l_l
136
137
138    CALL cpu_log( log_point(8), 'pres', 'start' )
139
140
141    ddt_3d = 1.0 / dt_3d
142    d_weight_pres = 1.0 / weight_pres(intermediate_timestep_count)
143
144!
145!-- Multigrid method expects array d to have one ghost layer.
146!--
147    IF ( psolver == 'multigrid' )  THEN
148     
149       DEALLOCATE( d )
150       ALLOCATE( d(nzb:nzt+1,nys-1:nyn+1,nxl-1:nxr+1) ) 
151
152!
153!--    Since p is later used to hold the weighted average of the substeps, it
154!--    cannot be used in the iterative solver. Therefore, its initial value is
155!--    stored on p_loc, which is then iteratively advanced in every substep.
156       IF ( intermediate_timestep_count == 1 )  THEN
157          DO  i = nxl-1, nxr+1
158             DO  j = nys-1, nyn+1
159                DO  k = nzb, nzt+1
160                   p_loc(k,j,i) = p(k,j,i)
161                ENDDO
162             ENDDO
163          ENDDO
164       ENDIF
165       
166    ELSEIF ( psolver == 'sor'  .AND.  intermediate_timestep_count == 1 )  THEN
167
168!
169!--    Since p is later used to hold the weighted average of the substeps, it
170!--    cannot be used in the iterative solver. Therefore, its initial value is
171!--    stored on p_loc, which is then iteratively advanced in every substep.
172       p_loc = p
173
174    ENDIF
175
176!
177!-- Conserve the volume flow at the outflow in case of non-cyclic lateral
178!-- boundary conditions
179!-- WARNING: so far, this conservation does not work at the left/south
180!--          boundary if the topography at the inflow differs from that at the
181!--          outflow! For this case, volume_flow_area needs adjustment!
182!
183!-- Left/right
184    IF ( conserve_volume_flow  .AND.  ( outflow_l .OR. outflow_r ) )  THEN
185
186       volume_flow(1)   = 0.0
187       volume_flow_l(1) = 0.0
188
189       IF ( outflow_l )  THEN
190          i = 0
191       ELSEIF ( outflow_r )  THEN
192          i = nx+1
193       ENDIF
194
195       DO  j = nys, nyn
196!
197!--       Sum up the volume flow through the south/north boundary
198          DO  k = nzb_2d(j,i)+1, nzt
199             volume_flow_l(1) = volume_flow_l(1) + u(k,j,i) * dzw(k)
200          ENDDO
201       ENDDO
202
203#if defined( __parallel )   
204       IF ( collective_wait )  CALL MPI_BARRIER( comm1dy, ierr )
205       CALL MPI_ALLREDUCE( volume_flow_l(1), volume_flow(1), 1, MPI_REAL, &
206                           MPI_SUM, comm1dy, ierr )   
207#else
208       volume_flow = volume_flow_l 
209#endif
210       volume_flow_offset(1) = ( volume_flow_initial(1) - volume_flow(1) ) &
211                               / volume_flow_area(1)
212
213       DO  j = nysg, nyng
214          DO  k = nzb_2d(j,i)+1, nzt
215             u(k,j,i) = u(k,j,i) + volume_flow_offset(1)
216          ENDDO
217       ENDDO
218
219    ENDIF
220
221!
222!-- South/north
223    IF ( conserve_volume_flow  .AND.  ( outflow_n .OR. outflow_s ) )  THEN
224
225       volume_flow(2)   = 0.0
226       volume_flow_l(2) = 0.0
227
228       IF ( outflow_s )  THEN
229          j = 0
230       ELSEIF ( outflow_n )  THEN
231          j = ny+1
232       ENDIF
233
234       DO  i = nxl, nxr
235!
236!--       Sum up the volume flow through the south/north boundary
237          DO  k = nzb_2d(j,i)+1, nzt
238             volume_flow_l(2) = volume_flow_l(2) + v(k,j,i) * dzw(k)
239          ENDDO
240       ENDDO
241
242#if defined( __parallel )   
243       IF ( collective_wait )  CALL MPI_BARRIER( comm1dx, ierr )
244       CALL MPI_ALLREDUCE( volume_flow_l(2), volume_flow(2), 1, MPI_REAL, &
245                           MPI_SUM, comm1dx, ierr )   
246#else
247       volume_flow = volume_flow_l 
248#endif
249       volume_flow_offset(2) = ( volume_flow_initial(2) - volume_flow(2) )    &
250                               / volume_flow_area(2)
251
252       DO  i = nxlg, nxrg
253          DO  k = nzb_v_inner(j,i)+1, nzt
254             v(k,j,i) = v(k,j,i) + volume_flow_offset(2)
255          ENDDO
256       ENDDO
257
258    ENDIF
259
260!
261!-- Remove mean vertical velocity
262    IF ( ibc_p_b == 1  .AND.  ibc_p_t == 1 )  THEN
263       IF ( simulated_time > 0.0 )  THEN ! otherwise nzb_w_inner not yet known
264          w_l = 0.0;  w_l_l = 0.0
265          DO  i = nxl, nxr
266             DO  j = nys, nyn
267                DO  k = nzb_w_inner(j,i)+1, nzt
268                   w_l_l(k) = w_l_l(k) + w(k,j,i)
269                ENDDO
270             ENDDO
271          ENDDO
272#if defined( __parallel )   
273          IF ( collective_wait )  CALL MPI_BARRIER( comm2d, ierr )
274          CALL MPI_ALLREDUCE( w_l_l(1), w_l(1), nzt, MPI_REAL, MPI_SUM, &
275                              comm2d, ierr )
276#else
277          w_l = w_l_l 
278#endif
279          DO  k = 1, nzt
280             w_l(k) = w_l(k) / ngp_2dh_outer(k,0)
281          ENDDO
282          DO  i = nxlg, nxrg
283             DO  j = nysg, nyng
284                DO  k = nzb_w_inner(j,i)+1, nzt
285                   w(k,j,i) = w(k,j,i) - w_l(k)
286                ENDDO
287             ENDDO
288          ENDDO
289       ENDIF
290    ENDIF
291
292!
293!-- Compute the divergence of the provisional velocity field.
294    CALL cpu_log( log_point_s(1), 'divergence', 'start' )
295
296    IF ( psolver == 'multigrid' )  THEN
297       !$OMP PARALLEL DO SCHEDULE( STATIC )
298       DO  i = nxl-1, nxr+1
299          DO  j = nys-1, nyn+1
300             DO  k = nzb, nzt+1
301                d(k,j,i) = 0.0
302             ENDDO
303          ENDDO
304       ENDDO
305    ELSE
306       !$OMP PARALLEL DO SCHEDULE( STATIC )
307       !$acc kernels present( d )
308       !$acc loop
309       DO  i = nxl, nxr
310          DO  j = nys, nyn
311             !$acc loop vector(32)
312             DO  k = nzb+1, nzt
313                d(k,j,i) = 0.0
314             ENDDO
315          ENDDO
316       ENDDO
317       !$acc end kernels
318    ENDIF
319
320    localsum  = 0.0
321    threadsum = 0.0
322
323#if defined( __ibm )
324    !$OMP PARALLEL PRIVATE (i,j,k) FIRSTPRIVATE(threadsum) REDUCTION(+:localsum)
325    !$OMP DO SCHEDULE( STATIC )
326    DO  i = nxl, nxr
327       DO  j = nys, nyn
328          DO  k = nzb_s_inner(j,i)+1, nzt
329             d(k,j,i) = ( ( u(k,j,i+1) - u(k,j,i) ) * ddx + &
330                          ( v(k,j+1,i) - v(k,j,i) ) * ddy + &
331                          ( w(k,j,i) - w(k-1,j,i) ) * ddzw(k) ) * ddt_3d      &
332                                                                * d_weight_pres
333          ENDDO
334!
335!--       Compute possible PE-sum of divergences for flow_statistics
336          DO  k = nzb_s_inner(j,i)+1, nzt
337             threadsum = threadsum + ABS( d(k,j,i) )
338          ENDDO
339
340       ENDDO
341    ENDDO
342
343    localsum = localsum + threadsum * dt_3d * &
344                          weight_pres(intermediate_timestep_count)
345
346    !$OMP END PARALLEL
347#else
348
349    !$OMP PARALLEL PRIVATE (i,j,k)
350    !$OMP DO SCHEDULE( STATIC )
351    !$acc kernels present( d, ddzw, nzb_s_inner, u, v, w )
352    !$acc loop
353    DO  i = nxl, nxr
354       DO  j = nys, nyn
355          !$acc loop vector(32)
356          DO  k = 1, nzt
357             IF ( k > nzb_s_inner(j,i) )  THEN
358                d(k,j,i) = ( ( u(k,j,i+1) - u(k,j,i) ) * ddx + &
359                           ( v(k,j+1,i) - v(k,j,i) ) * ddy + &
360                           ( w(k,j,i) - w(k-1,j,i) ) * ddzw(k) ) * ddt_3d      &
361                           * d_weight_pres
362             ENDIF
363          ENDDO
364       ENDDO
365    ENDDO
366    !$acc end kernels
367    !$OMP END PARALLEL
368
369!
370!-- Compute possible PE-sum of divergences for flow_statistics
371    !$OMP PARALLEL PRIVATE (i,j,k) FIRSTPRIVATE(threadsum) REDUCTION(+:localsum)
372    !$OMP DO SCHEDULE( STATIC )
373    DO  i = nxl, nxr
374       DO  j = nys, nyn
375          DO  k = nzb+1, nzt
376             threadsum = threadsum + ABS( d(k,j,i) )
377          ENDDO
378       ENDDO
379    ENDDO
380    localsum = localsum + threadsum * dt_3d * &
381                          weight_pres(intermediate_timestep_count)
382    !$OMP END PARALLEL
383#endif
384
385!
386!-- For completeness, set the divergence sum of all statistic regions to those
387!-- of the total domain
388    sums_divold_l(0:statistic_regions) = localsum
389
390    CALL cpu_log( log_point_s(1), 'divergence', 'stop' )
391
392!
393!-- Compute the pressure perturbation solving the Poisson equation
394    IF ( psolver(1:7) == 'poisfft' )  THEN
395
396!
397!--    Solve Poisson equation via FFT and solution of tridiagonal matrices
398       IF ( psolver == 'poisfft' )  THEN
399!
400!--       Solver for 2d-decomposition
401          CALL poisfft( d, tend )
402          !$acc update host( d )
403       ELSEIF ( psolver == 'poisfft_hybrid' )  THEN 
404!
405!--       Solver for 1d-decomposition (using MPI and OpenMP).
406!--       The old hybrid-solver is still included here, as long as there
407!--       are some optimization problems in poisfft
408          CALL poisfft_hybrid( d )
409       ENDIF
410
411!
412!--    Store computed perturbation pressure and set boundary condition in
413!--    z-direction
414       !$OMP PARALLEL DO
415       DO  i = nxl, nxr
416          DO  j = nys, nyn
417             DO  k = nzb+1, nzt
418                tend(k,j,i) = d(k,j,i)
419             ENDDO
420          ENDDO
421       ENDDO
422
423!
424!--    Bottom boundary:
425!--    This condition is only required for internal output. The pressure
426!--    gradient (dp(nzb+1)-dp(nzb))/dz is not used anywhere else.
427       IF ( ibc_p_b == 1 )  THEN
428!
429!--       Neumann (dp/dz = 0)
430          !$OMP PARALLEL DO
431          DO  i = nxlg, nxrg
432             DO  j = nysg, nyng
433                tend(nzb_s_inner(j,i),j,i) = tend(nzb_s_inner(j,i)+1,j,i)
434             ENDDO
435          ENDDO
436
437       ELSE
438!
439!--       Dirichlet
440          !$OMP PARALLEL DO
441          DO  i = nxlg, nxrg
442             DO  j = nysg, nyng
443                tend(nzb_s_inner(j,i),j,i) = 0.0
444             ENDDO
445          ENDDO
446
447       ENDIF
448
449!
450!--    Top boundary
451       IF ( ibc_p_t == 1 )  THEN
452!
453!--       Neumann
454          !$OMP PARALLEL DO
455          DO  i = nxlg, nxrg
456             DO  j = nysg, nyng
457                tend(nzt+1,j,i) = tend(nzt,j,i)
458             ENDDO
459          ENDDO
460
461       ELSE
462!
463!--       Dirichlet
464          !$OMP PARALLEL DO
465          DO  i = nxlg, nxrg
466             DO  j = nysg, nyng
467                tend(nzt+1,j,i) = 0.0
468             ENDDO
469          ENDDO
470
471       ENDIF
472
473!
474!--    Exchange boundaries for p
475       CALL exchange_horiz( tend, nbgp )
476     
477    ELSEIF ( psolver == 'sor' )  THEN
478
479!
480!--    Solve Poisson equation for perturbation pressure using SOR-Red/Black
481!--    scheme
482       CALL sor( d, ddzu_pres, ddzw, p_loc )
483       tend = p_loc
484
485    ELSEIF ( psolver == 'multigrid' )  THEN
486
487!
488!--    Solve Poisson equation for perturbation pressure using Multigrid scheme,
489!--    array tend is used to store the residuals, logical exchange_mg is used
490!--    to discern data exchange in multigrid ( 1 ghostpoint ) and normal grid
491!--    ( nbgp ghost points ).
492
493!--    If the number of grid points of the gathered grid, which is collected
494!--    on PE0, is larger than the number of grid points of an PE, than array
495!--    tend will be enlarged.
496       IF ( gathered_size > subdomain_size )  THEN
497          DEALLOCATE( tend )
498          ALLOCATE( tend(nzb:nzt_mg(mg_switch_to_pe0_level)+1,nys_mg(          &
499                    mg_switch_to_pe0_level)-1:nyn_mg(mg_switch_to_pe0_level)+1,&
500                    nxl_mg(mg_switch_to_pe0_level)-1:nxr_mg(                   &
501                    mg_switch_to_pe0_level)+1) )
502       ENDIF
503
504       CALL poismg( tend )
505
506       IF ( gathered_size > subdomain_size )  THEN
507          DEALLOCATE( tend )
508          ALLOCATE( tend(nzb:nzt+1,nysg:nyng,nxlg:nxrg) )
509       ENDIF
510
511!
512!--    Restore perturbation pressure on tend because this array is used
513!--    further below to correct the velocity fields
514       DO  i = nxl-1, nxr+1
515          DO  j = nys-1, nyn+1
516             DO  k = nzb, nzt+1
517                tend(k,j,i) = p_loc(k,j,i)
518             ENDDO
519          ENDDO
520       ENDDO
521
522    ENDIF
523
524!
525!-- Store perturbation pressure on array p, used for pressure data output.
526!-- Ghost layers are added in the output routines (except sor-method: see below)
527    IF ( intermediate_timestep_count == 1 )  THEN
528       !$OMP PARALLEL PRIVATE (i,j,k)
529       !$OMP DO
530       DO  i = nxl-1, nxr+1
531          DO  j = nys-1, nyn+1
532             DO  k = nzb, nzt+1
533                p(k,j,i) = tend(k,j,i) * &
534                           weight_substep(intermediate_timestep_count)
535             ENDDO
536          ENDDO
537       ENDDO
538       !$OMP END PARALLEL
539
540    ELSE 
541       !$OMP PARALLEL PRIVATE (i,j,k)
542       !$OMP DO
543       DO  i = nxl-1, nxr+1
544          DO  j = nys-1, nyn+1
545             DO  k = nzb, nzt+1
546                p(k,j,i) = p(k,j,i) + tend(k,j,i) * &
547                           weight_substep(intermediate_timestep_count)
548             ENDDO
549          ENDDO
550       ENDDO
551       !$OMP END PARALLEL
552
553    ENDIF
554       
555!
556!-- SOR-method needs ghost layers for the next timestep
557    IF ( psolver == 'sor' )  CALL exchange_horiz( p, nbgp )
558
559!
560!-- Correction of the provisional velocities with the current perturbation
561!-- pressure just computed
562    !$acc update host( u, v, w )
563    IF ( conserve_volume_flow  .AND.  ( bc_lr_cyc .OR. bc_ns_cyc ) )  THEN
564       volume_flow_l(1) = 0.0
565       volume_flow_l(2) = 0.0
566    ENDIF
567
568    !$OMP PARALLEL PRIVATE (i,j,k)
569    !$OMP DO
570    DO  i = nxl, nxr   
571       DO  j = nys, nyn
572          DO  k = nzb_w_inner(j,i)+1, nzt
573             w(k,j,i) = w(k,j,i) - dt_3d *                                 &
574                        ( tend(k+1,j,i) - tend(k,j,i) ) * ddzu(k+1) *      &
575                        weight_pres(intermediate_timestep_count)
576          ENDDO
577          DO  k = nzb_u_inner(j,i)+1, nzt
578             u(k,j,i) = u(k,j,i) - dt_3d *                                 &
579                        ( tend(k,j,i) - tend(k,j,i-1) ) * ddx *            &
580                        weight_pres(intermediate_timestep_count)
581          ENDDO
582          DO  k = nzb_v_inner(j,i)+1, nzt
583             v(k,j,i) = v(k,j,i) - dt_3d *                                 &
584                        ( tend(k,j,i) - tend(k,j-1,i) ) * ddy *            &
585                        weight_pres(intermediate_timestep_count)
586          ENDDO                                                         
587!
588!--       Sum up the volume flow through the right and north boundary
589          IF ( conserve_volume_flow  .AND.  bc_lr_cyc  .AND.  bc_ns_cyc  .AND. &
590               i == nx )  THEN
591             !$OMP CRITICAL
592             DO  k = nzb_2d(j,i) + 1, nzt
593                volume_flow_l(1) = volume_flow_l(1) + u(k,j,i) * dzw(k)
594             ENDDO
595             !$OMP END CRITICAL
596          ENDIF
597          IF ( conserve_volume_flow  .AND.  bc_ns_cyc  .AND.  bc_lr_cyc  .AND. &
598               j == ny )  THEN
599             !$OMP CRITICAL
600             DO  k = nzb_2d(j,i) + 1, nzt
601                volume_flow_l(2) = volume_flow_l(2) + v(k,j,i) * dzw(k)
602             ENDDO
603             !$OMP END CRITICAL
604          ENDIF
605
606       ENDDO
607    ENDDO
608    !$OMP END PARALLEL
609   
610!
611!-- Conserve the volume flow
612    IF ( conserve_volume_flow  .AND.  ( bc_lr_cyc  .AND.  bc_ns_cyc ) )  THEN
613
614#if defined( __parallel )   
615       IF ( collective_wait )  CALL MPI_BARRIER( comm2d, ierr )
616       CALL MPI_ALLREDUCE( volume_flow_l(1), volume_flow(1), 2, MPI_REAL, &
617                           MPI_SUM, comm2d, ierr ) 
618#else
619       volume_flow = volume_flow_l 
620#endif   
621
622       volume_flow_offset = ( volume_flow_initial - volume_flow ) / &
623                            volume_flow_area
624
625       !$OMP PARALLEL PRIVATE (i,j,k)
626       !$OMP DO
627       DO  i = nxl, nxr
628          DO  j = nys, nyn
629             DO  k = nzb_u_inner(j,i) + 1, nzt
630                u(k,j,i) = u(k,j,i) + volume_flow_offset(1)
631             ENDDO
632             DO k = nzb_v_inner(j,i) + 1, nzt
633                v(k,j,i) = v(k,j,i) + volume_flow_offset(2)
634             ENDDO
635          ENDDO
636       ENDDO
637
638       !$OMP END PARALLEL
639
640    ENDIF
641
642!
643!-- Exchange of boundaries for the velocities
644    CALL exchange_horiz( u, nbgp )
645    CALL exchange_horiz( v, nbgp )
646    CALL exchange_horiz( w, nbgp )
647
648!
649!-- Compute the divergence of the corrected velocity field,
650!-- a possible PE-sum is computed in flow_statistics
651    CALL cpu_log( log_point_s(1), 'divergence', 'start' )
652    sums_divnew_l = 0.0
653
654!
655!-- d must be reset to zero because it can contain nonzero values below the
656!-- topography
657    IF ( topography /= 'flat' )  d = 0.0
658
659    localsum  = 0.0
660    threadsum = 0.0
661
662    !$OMP PARALLEL PRIVATE (i,j,k) FIRSTPRIVATE(threadsum) REDUCTION(+:localsum)
663    !$OMP DO SCHEDULE( STATIC )
664#if defined( __ibm )
665    DO  i = nxl, nxr
666       DO  j = nys, nyn
667          DO  k = nzb_s_inner(j,i)+1, nzt
668             d(k,j,i) = ( u(k,j,i+1) - u(k,j,i) ) * ddx + &
669                        ( v(k,j+1,i) - v(k,j,i) ) * ddy + &
670                        ( w(k,j,i) - w(k-1,j,i) ) * ddzw(k)
671          ENDDO
672          DO  k = nzb+1, nzt
673             threadsum = threadsum + ABS( d(k,j,i) )
674          ENDDO
675       ENDDO
676    ENDDO
677#else
678    DO  i = nxl, nxr
679       DO  j = nys, nyn
680          DO  k = nzb_s_inner(j,i)+1, nzt
681             d(k,j,i) = ( u(k,j,i+1) - u(k,j,i) ) * ddx + &
682                        ( v(k,j+1,i) - v(k,j,i) ) * ddy + &
683                        ( w(k,j,i) - w(k-1,j,i) ) * ddzw(k)
684             threadsum = threadsum + ABS( d(k,j,i) )
685          ENDDO
686       ENDDO
687    ENDDO
688#endif
689
690    localsum = localsum + threadsum
691    !$OMP END PARALLEL
692
693!
694!-- For completeness, set the divergence sum of all statistic regions to those
695!-- of the total domain
696    sums_divnew_l(0:statistic_regions) = localsum
697
698    CALL cpu_log( log_point_s(1), 'divergence', 'stop' )
699
700    !$acc update device( u, v, w )
701
702    CALL cpu_log( log_point(8), 'pres', 'stop' )
703   
704
705
706 END SUBROUTINE pres
Note: See TracBrowser for help on using the repository browser.