source: palm/trunk/SOURCE/pres.f90 @ 1111

Last change on this file since 1111 was 1111, checked in by raasch, 11 years ago

New:
---

GPU porting of pres, swap_timelevel. Adjustments of openACC directives.
Further porting of poisfft, which now runs completely on GPU without any
host/device data transfer for serial and parallel runs (but parallel runs
require data transfer before and after the MPI transpositions; see the sketch below).
GPU-porting of tridiagonal solver:
tridiagonal routines split into external subroutines (instead of using CONTAINS),
no distinction between parallel/non-parallel in poisfft and tridia any more,
tridia routines moved to the end of the file because of a probable bug in the
PGI compiler (otherwise "invalid device function" is reported at runtime).
(cuda_fft_interfaces, fft_xy, flow_statistics, init_3d_model, palm, poisfft, pres, prognostic_equations, swap_timelevel, time_integration, transpose)
output of accelerator board information. (header)
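
A minimal sketch of the host/device transfer around an MPI transposition referred
to above (routine name, argument shapes, and the use of MPI_DOUBLE_PRECISION are
illustrative assumptions, not the PALM interface):

 SUBROUTINE transpose_with_acc_update( nloc, comm, f_in, f_out )

!-- Hypothetical sketch only: before an MPI transposition the data have to be
!-- copied from the accelerator to the host, and back again afterwards.
!-- f_in / f_out are assumed to reside in a device data region created by the
!-- caller.
    USE mpi

    IMPLICIT NONE

    INTEGER, INTENT(IN)                          ::  nloc, comm
    REAL(KIND=8), DIMENSION(nloc), INTENT(INOUT) ::  f_in, f_out
    INTEGER                                      ::  ierr, npes

    CALL MPI_COMM_SIZE( comm, npes, ierr )

    !$acc update host( f_in )
    CALL MPI_ALLTOALL( f_in,  nloc/npes, MPI_DOUBLE_PRECISION,                &
                       f_out, nloc/npes, MPI_DOUBLE_PRECISION, comm, ierr )
    !$acc update device( f_out )

 END SUBROUTINE transpose_with_acc_update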

optimization of tridia routines: constant elements and coefficients of tri are
stored in separate arrays ddzuw and tric; the last dimension of tri is reduced from 5 to 2,
(init_grid, init_3d_model, modules, palm, poisfft)
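
A minimal, self-contained sketch of this storage scheme (routine name, array
shapes, and the interface are assumptions for illustration, not the actual PALM
tridia routines): the grid-only coefficients are precomputed once, only the
wavenumber-dependent part of the main diagonal varies per column, and the
per-column work array needs just two entries per level.

 SUBROUTINE tridia_sketch( nz, ddzuw, diag_h, rhs, x )

!-- Thomas algorithm for one vertical column of the system
!--    ddzuw(k,1)*x(k-1) + ( ddzuw(k,3) + diag_h )*x(k) + ddzuw(k,2)*x(k+1) = rhs(k)
!-- ddzuw holds the constant, grid-only coefficients, diag_h the
!-- wavenumber-dependent contribution to the main diagonal; the work array tri
!-- only keeps the modified diagonal and the elimination factor (last dim = 2).
    IMPLICIT NONE

    INTEGER, INTENT(IN)                       ::  nz
    REAL(KIND=8), INTENT(IN)                  ::  diag_h
    REAL(KIND=8), DIMENSION(nz,3), INTENT(IN) ::  ddzuw
    REAL(KIND=8), DIMENSION(nz), INTENT(IN)   ::  rhs
    REAL(KIND=8), DIMENSION(nz), INTENT(OUT)  ::  x

    REAL(KIND=8), DIMENSION(nz,2) ::  tri
    INTEGER                       ::  k

!-- Forward elimination
    tri(1,1) = ddzuw(1,3) + diag_h
    x(1)     = rhs(1)
    DO  k = 2, nz
       tri(k,2) = ddzuw(k,1) / tri(k-1,1)
       tri(k,1) = ddzuw(k,3) + diag_h - tri(k,2) * ddzuw(k-1,2)
       x(k)     = rhs(k) - tri(k,2) * x(k-1)
    ENDDO

!-- Back substitution
    x(nz) = x(nz) / tri(nz,1)
    DO  k = nz-1, 1, -1
       x(k) = ( x(k) - ddzuw(k,2) * x(k+1) ) / tri(k,1)
    ENDDO

 END SUBROUTINE tridia_sketch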

poisfft_init is now called internally from poisfft,
(Makefile, Makefile_check, init_pegrid, poisfft, poisfft_hybrid)
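
The internal call amounts to a first-call initialization guard; a minimal sketch
(module, flag, and routine names are made up purely for illustration):

 MODULE poisfft_sketch_mod

    IMPLICIT NONE
    LOGICAL, SAVE ::  poisfft_initialized = .FALSE.

 CONTAINS

    SUBROUTINE poisfft_init_sketch
!--    Placeholder for the actual set-up work (FFT plans, coefficients, ...)
       poisfft_initialized = .TRUE.
    END SUBROUTINE poisfft_init_sketch

    SUBROUTINE poisfft_sketch
!--    The solver initializes itself on the first call, so the caller no
!--    longer has to invoke the init routine explicitly
       IF ( .NOT. poisfft_initialized )  CALL poisfft_init_sketch
!--    ... actual solve would follow here
    END SUBROUTINE poisfft_sketch

 END MODULE poisfft_sketch_mod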

CPU-time per grid point and timestep is output to CPU_MEASURES file
(cpu_statistics, modules, time_integration)

Changed:


resorting from/to array work changed; work now has 4 dimensions instead of 1 (transpose)
array diss allocated only if required (init_3d_model)

pressure boundary condition "Neumann+inhomo" removed from the code
(check_parameters, header, poisfft, poisfft_hybrid, pres)

Errors:


bugfix: dependency added for cuda_fft_interfaces (Makefile)
bugfix: CUDA FFT plans adjusted for domain decomposition (before, they always
used the total domain) (fft_xy)

[1]1 SUBROUTINE pres
2
[1036]3!--------------------------------------------------------------------------------!
4! This file is part of PALM.
5!
6! PALM is free software: you can redistribute it and/or modify it under the terms
7! of the GNU General Public License as published by the Free Software Foundation,
8! either version 3 of the License, or (at your option) any later version.
9!
10! PALM is distributed in the hope that it will be useful, but WITHOUT ANY
11! WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
12! A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
13!
14! You should have received a copy of the GNU General Public License along with
15! PALM. If not, see <http://www.gnu.org/licenses/>.
16!
17! Copyright 1997-2012  Leibniz University Hannover
18!--------------------------------------------------------------------------------!
19!
[484]20! Current revisions:
[1]21! -----------------
[1111]22! openACC statements added,
23! ibc_p_b = 2 removed
[708]24!
25! Former revisions:
26! -----------------
27! $Id: pres.f90 1111 2013-03-08 23:54:10Z raasch $
28!
[1093]29! 1092 2013-02-02 11:24:22Z raasch
30! unused variables removed
31!
[1037]32! 1036 2012-10-22 13:43:42Z raasch
33! code put under GPL (PALM 3.9)
34!
[1004]35! 1003 2012-09-14 14:35:53Z raasch
36! adjustment of array tend for cases with unequal subdomain sizes removed
37!
[779]38! 778 2011-11-07 14:18:25Z fricke
39! New allocation of tend when multigrid is used and the collected field on PE0
40! has more grid points than the subdomain of a PE.
41!
[720]42! 719 2011-04-06 13:05:23Z gryschka
43! Bugfix in volume flow control for double cyclic boundary conditions
44!
[710]45! 709 2011-03-30 09:31:40Z raasch
46! formatting adjustments
47!
[708]48! 707 2011-03-29 11:39:40Z raasch
[707]49! Calculation of weighted average of p is now handled in the same way
50! regardless of the number of ghost layers (advection scheme),
51! multigrid and sor method are using p_loc for iterative advancements of
52! pressure,
53! localsum calculation modified for proper OpenMP reduction,
54! bc_lr/ns replaced by bc_lr/ns_cyc
[674]55!
[694]56! 693 2011-03-08 09:..:..Z raasch
[695]57! bugfix: weighting coefficient added to ibm branch
[694]58!
59! 680 2011-02-04 23:16:06Z gryschka
[681]60! bugfix: collective_wait
[668]61!
[676]62! 675 2011-01-19 10:56:55Z suehring
63! Removed bugfix while copying tend.
64!
[674]65! 673 2011-01-18 16:19:48Z suehring
66! Weighting coefficients added for right computation of the pressure during
67! Runge-Kutta substeps.
68!
[668]69! 667 2010-12-23 12:06:00Z suehring/gryschka
[667]70! New allocation of tend when ws-scheme and multigrid are used. This is due to
71! performance reasons of the data_exchange. The same is done with p after
72! poismg is called.
73! nxl-1, nxr+1, nys-1, nyn+1 replaced by nxlg, nxrg, nysg, nyng when no
74! multigrid is used. Calls of exchange_horiz are modified.
75! bugfix: After pressure correction no volume flow correction in case of
76! non-cyclic boundary conditions
77! (has to be done only before pressure correction)
78! Call of SOR routine is referenced with ddzu_pres.
79!
[623]80! 622 2010-12-10 08:08:13Z raasch
81! optional barriers included in order to speed up collective operations
82!
[198]83! 151 2008-03-07 13:42:18Z raasch
84! Bugfix in volume flow control for non-cyclic boundary conditions
85!
[110]86! 106 2007-08-16 14:30:26Z raasch
87! Volume flow conservation added for the remaining three outflow boundaries
88!
[90]89! 85 2007-05-11 09:35:14Z raasch
90! Division through dt_3d replaced by multiplication of the inverse.
91! For performance optimisation, this is done in the loop calculating the
92! divergence instead of using a separate loop.
93!
[77]94! 75 2007-03-22 09:54:05Z raasch
[75]95! Volume flow control for non-cyclic boundary conditions added (currently only
[76]96! for the north boundary!!), 2nd+3rd argument removed from exchange horiz,
97! mean vertical velocity is removed in case of Neumann boundary conditions
98! both at the bottom and the top
[1]99!
[3]100! RCS Log replace by Id keyword, revision history cleaned up
101!
[1]102! Revision 1.25  2006/04/26 13:26:12  raasch
103! OpenMP optimization (+localsum, threadsum)
104!
105! Revision 1.1  1997/07/24 11:24:44  raasch
106! Initial revision
107!
108!
109! Description:
110! ------------
111! Compute the divergence of the provisional velocity field. Solve the Poisson
112! equation for the perturbation pressure. Compute the final velocities using
113! this perturbation pressure. Compute the remaining divergence.
114!------------------------------------------------------------------------------!
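
In condensed form, the steps described above amount to a pressure-projection
(fractional-step) scheme. Writing w_pres for weight_pres(intermediate_timestep_count)
and Δt for dt_3d as used in the code below, the perturbation pressure p̃ (stored in
tend) and the provisional velocity u* satisfy

\[
   \nabla^{2}\tilde{p} \;=\; \frac{\nabla\cdot\mathbf{u}^{*}}{\Delta t\, w_{\mathrm{pres}}},
   \qquad
   \mathbf{u}^{n+1} \;=\; \mathbf{u}^{*} \;-\; \Delta t\, w_{\mathrm{pres}}\,\nabla\tilde{p},
\]

so that the divergence of the corrected velocity field vanishes up to the residual
of the Poisson solver.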
115
116    USE arrays_3d
117    USE constants
118    USE control_parameters
119    USE cpulog
120    USE grid_variables
121    USE indices
122    USE interfaces
123    USE pegrid
124    USE poisfft_mod
125    USE poisfft_hybrid_mod
126    USE statistics
127
128    IMPLICIT NONE
129
[1092]130    INTEGER ::  i, j, k
[1]131
[673]132    REAL    ::  ddt_3d, localsum, threadsum, d_weight_pres
[1]133
134    REAL, DIMENSION(1:2) ::  volume_flow_l, volume_flow_offset
[76]135    REAL, DIMENSION(1:nzt) ::  w_l, w_l_l
[1]136
137
138    CALL cpu_log( log_point(8), 'pres', 'start' )
139
[85]140
141    ddt_3d = 1.0 / dt_3d
[709]142    d_weight_pres = 1.0 / weight_pres(intermediate_timestep_count)
[85]143
[1]144!
[707]145!-- Multigrid method expects array d to have one ghost layer.
146!--
[1]147    IF ( psolver == 'multigrid' )  THEN
[667]148     
[1]149       DEALLOCATE( d )
[667]150       ALLOCATE( d(nzb:nzt+1,nys-1:nyn+1,nxl-1:nxr+1) ) 
[707]151
152!
153!--    Since p is later used to hold the weighted average of the substeps, it
154!--    cannot be used in the iterative solver. Therefore, its initial value is
155!--    stored on p_loc, which is then iteratively advanced in every substep.
156       IF ( intermediate_timestep_count == 1 )  THEN
157          DO  i = nxl-1, nxr+1
158             DO  j = nys-1, nyn+1
159                DO  k = nzb, nzt+1
160                   p_loc(k,j,i) = p(k,j,i)
161                ENDDO
162             ENDDO
163          ENDDO
[667]164       ENDIF
165       
[707]166    ELSEIF ( psolver == 'sor'  .AND.  intermediate_timestep_count == 1 )  THEN
167
168!
169!--    Since p is later used to hold the weighted average of the substeps, it
170!--    cannot be used in the iterative solver. Therefore, its initial value is
171!--    stored on p_loc, which is then iteratively advanced in every substep.
172       p_loc = p
173
[1]174    ENDIF
175
176!
[75]177!-- Conserve the volume flow at the outflow in case of non-cyclic lateral
178!-- boundary conditions
[106]179!-- WARNING: so far, this conservation does not work at the left/south
180!--          boundary if the topography at the inflow differs from that at the
181!--          outflow! For this case, volume_flow_area needs adjustment!
182!
183!-- Left/right
[709]184    IF ( conserve_volume_flow  .AND.  ( outflow_l .OR. outflow_r ) )  THEN
[680]185
[106]186       volume_flow(1)   = 0.0
187       volume_flow_l(1) = 0.0
188
189       IF ( outflow_l )  THEN
190          i = 0
191       ELSEIF ( outflow_r )  THEN
192          i = nx+1
193       ENDIF
194
195       DO  j = nys, nyn
196!
197!--       Sum up the volume flow through the south/north boundary
[709]198          DO  k = nzb_2d(j,i)+1, nzt
[667]199             volume_flow_l(1) = volume_flow_l(1) + u(k,j,i) * dzw(k)
[106]200          ENDDO
201       ENDDO
202
203#if defined( __parallel )   
[680]204       IF ( collective_wait )  CALL MPI_BARRIER( comm1dy, ierr )
[106]205       CALL MPI_ALLREDUCE( volume_flow_l(1), volume_flow(1), 1, MPI_REAL, &
206                           MPI_SUM, comm1dy, ierr )   
207#else
208       volume_flow = volume_flow_l 
209#endif
[709]210       volume_flow_offset(1) = ( volume_flow_initial(1) - volume_flow(1) ) &
[106]211                               / volume_flow_area(1)
212
[667]213       DO  j = nysg, nyng
[709]214          DO  k = nzb_2d(j,i)+1, nzt
[106]215             u(k,j,i) = u(k,j,i) + volume_flow_offset(1)
216          ENDDO
217       ENDDO
218
219    ENDIF
220
221!
222!-- South/north
[709]223    IF ( conserve_volume_flow  .AND.  ( outflow_n .OR. outflow_s ) )  THEN
[106]224
[75]225       volume_flow(2)   = 0.0
226       volume_flow_l(2) = 0.0
227
[106]228       IF ( outflow_s )  THEN
229          j = 0
230       ELSEIF ( outflow_n )  THEN
[75]231          j = ny+1
[106]232       ENDIF
233
234       DO  i = nxl, nxr
[75]235!
[106]236!--       Sum up the volume flow through the south/north boundary
[709]237          DO  k = nzb_2d(j,i)+1, nzt
[667]238             volume_flow_l(2) = volume_flow_l(2) + v(k,j,i) * dzw(k)
[75]239          ENDDO
[106]240       ENDDO
241
[75]242#if defined( __parallel )   
[680]243       IF ( collective_wait )  CALL MPI_BARRIER( comm1dx, ierr )
[75]244       CALL MPI_ALLREDUCE( volume_flow_l(2), volume_flow(2), 1, MPI_REAL, &
245                           MPI_SUM, comm1dx, ierr )   
246#else
247       volume_flow = volume_flow_l 
248#endif
249       volume_flow_offset(2) = ( volume_flow_initial(2) - volume_flow(2) )    &
[106]250                               / volume_flow_area(2)
[75]251
[667]252       DO  i = nxlg, nxrg
[709]253          DO  k = nzb_v_inner(j,i)+1, nzt
[106]254             v(k,j,i) = v(k,j,i) + volume_flow_offset(2)
[75]255          ENDDO
[106]256       ENDDO
[75]257
258    ENDIF
259
[76]260!
261!-- Remove mean vertical velocity
262    IF ( ibc_p_b == 1  .AND.  ibc_p_t == 1 )  THEN
[709]263       IF ( simulated_time > 0.0 )  THEN ! otherwise nzb_w_inner not yet known
[76]264          w_l = 0.0;  w_l_l = 0.0
265          DO  i = nxl, nxr
266             DO  j = nys, nyn
267                DO  k = nzb_w_inner(j,i)+1, nzt
268                   w_l_l(k) = w_l_l(k) + w(k,j,i)
269                ENDDO
270             ENDDO
271          ENDDO
272#if defined( __parallel )   
[622]273          IF ( collective_wait )  CALL MPI_BARRIER( comm2d, ierr )
[709]274          CALL MPI_ALLREDUCE( w_l_l(1), w_l(1), nzt, MPI_REAL, MPI_SUM, &
275                              comm2d, ierr )
[76]276#else
277          w_l = w_l_l 
278#endif
279          DO  k = 1, nzt
280             w_l(k) = w_l(k) / ngp_2dh_outer(k,0)
281          ENDDO
[667]282          DO  i = nxlg, nxrg
283             DO  j = nysg, nyng
[76]284                DO  k = nzb_w_inner(j,i)+1, nzt
285                   w(k,j,i) = w(k,j,i) - w_l(k)
286                ENDDO
287             ENDDO
288          ENDDO
289       ENDIF
290    ENDIF
[75]291
292!
[1]293!-- Compute the divergence of the provisional velocity field.
294    CALL cpu_log( log_point_s(1), 'divergence', 'start' )
295
296    IF ( psolver == 'multigrid' )  THEN
297       !$OMP PARALLEL DO SCHEDULE( STATIC )
298       DO  i = nxl-1, nxr+1
299          DO  j = nys-1, nyn+1
300             DO  k = nzb, nzt+1
301                d(k,j,i) = 0.0
302             ENDDO
303          ENDDO
304       ENDDO
305    ELSE
306       !$OMP PARALLEL DO SCHEDULE( STATIC )
[1111]307       !$acc kernels present( d )
308       !$acc loop
[1003]309       DO  i = nxl, nxr
310          DO  j = nys, nyn
[1111]311             !$acc loop vector(32)
[1003]312             DO  k = nzb+1, nzt
[1]313                d(k,j,i) = 0.0
314             ENDDO
315          ENDDO
316       ENDDO
[1111]317       !$acc end kernels
[1]318    ENDIF
319
320    localsum  = 0.0
321    threadsum = 0.0
322
323#if defined( __ibm )
324    !$OMP PARALLEL PRIVATE (i,j,k) FIRSTPRIVATE(threadsum) REDUCTION(+:localsum)
325    !$OMP DO SCHEDULE( STATIC )
326    DO  i = nxl, nxr
327       DO  j = nys, nyn
328          DO  k = nzb_s_inner(j,i)+1, nzt
[85]329             d(k,j,i) = ( ( u(k,j,i+1) - u(k,j,i) ) * ddx + &
330                          ( v(k,j+1,i) - v(k,j,i) ) * ddy + &
[673]331                          ( w(k,j,i) - w(k-1,j,i) ) * ddzw(k) ) * ddt_3d      &
332                                                                * d_weight_pres 
[1]333          ENDDO
334!
335!--       Compute possible PE-sum of divergences for flow_statistics
336          DO  k = nzb_s_inner(j,i)+1, nzt
337             threadsum = threadsum + ABS( d(k,j,i) )
338          ENDDO
339
340       ENDDO
341    ENDDO
342
[707]343    localsum = localsum + threadsum * dt_3d * &
344                          weight_pres(intermediate_timestep_count)
[693]345
[1]346    !$OMP END PARALLEL
347#else
348
[1111]349    !$OMP PARALLEL PRIVATE (i,j,k)
350    !$OMP DO SCHEDULE( STATIC )
351    !$acc kernels present( d, ddzw, nzb_s_inner, u, v, w )
352    !$acc loop
353    DO  i = nxl, nxr
354       DO  j = nys, nyn
355          !$acc loop vector(32)
356          DO  k = 1, nzt
357             IF ( k > nzb_s_inner(j,i) )  THEN
[85]358                d(k,j,i) = ( ( u(k,j,i+1) - u(k,j,i) ) * ddx + &
[1111]359                           ( v(k,j+1,i) - v(k,j,i) ) * ddy + &
360                           ( w(k,j,i) - w(k-1,j,i) ) * ddzw(k) ) * ddt_3d      &
361                           * d_weight_pres
362             ENDIF
[1]363          ENDDO
364       ENDDO
[1111]365    ENDDO
366    !$acc end kernels
367    !$OMP END PARALLEL
[1]368
369!
370!-- Compute possible PE-sum of divergences for flow_statistics
371    !$OMP PARALLEL PRIVATE (i,j,k) FIRSTPRIVATE(threadsum) REDUCTION(+:localsum)
372    !$OMP DO SCHEDULE( STATIC )
373    DO  i = nxl, nxr
374       DO  j = nys, nyn
375          DO  k = nzb+1, nzt
376             threadsum = threadsum + ABS( d(k,j,i) )
377          ENDDO
378       ENDDO
379    ENDDO
[707]380    localsum = localsum + threadsum * dt_3d * &
381                          weight_pres(intermediate_timestep_count)
[1]382    !$OMP END PARALLEL
383#endif
384
385!
386!-- For completeness, set the divergence sum of all statistic regions to that
387!-- of the total domain
388    sums_divold_l(0:statistic_regions) = localsum
389
390    CALL cpu_log( log_point_s(1), 'divergence', 'stop' )
391
392!
393!-- Compute the pressure perturbation solving the Poisson equation
394    IF ( psolver(1:7) == 'poisfft' )  THEN
395
396!
397!--    Solve Poisson equation via FFT and solution of tridiagonal matrices
398       IF ( psolver == 'poisfft' )  THEN
399!
400!--       Solver for 2d-decomposition
401          CALL poisfft( d, tend )
[1111]402          !$acc update host( d )
[1]403       ELSEIF ( psolver == 'poisfft_hybrid' )  THEN 
404!
405!--       Solver for 1d-decomposition (using MPI and OpenMP).
406!--       The old hybrid-solver is still included here, as long as there
407!--       are some optimization problems in poisfft
408          CALL poisfft_hybrid( d )
409       ENDIF
410
411!
412!--    Store computed perturbation pressure and set boundary condition in
413!--    z-direction
414       !$OMP PARALLEL DO
415       DO  i = nxl, nxr
416          DO  j = nys, nyn
417             DO  k = nzb+1, nzt
418                tend(k,j,i) = d(k,j,i)
419             ENDDO
420          ENDDO
421       ENDDO
422
423!
424!--    Bottom boundary:
425!--    This condition is only required for internal output. The pressure
426!--    gradient (dp(nzb+1)-dp(nzb))/dz is not used anywhere else.
427       IF ( ibc_p_b == 1 )  THEN
428!
429!--       Neumann (dp/dz = 0)
430          !$OMP PARALLEL DO
[667]431          DO  i = nxlg, nxrg
432             DO  j = nysg, nyng
[1]433                tend(nzb_s_inner(j,i),j,i) = tend(nzb_s_inner(j,i)+1,j,i)
434             ENDDO
435          ENDDO
436
437       ELSE
438!
439!--       Dirichlet
440          !$OMP PARALLEL DO
[667]441          DO  i = nxlg, nxrg
442             DO  j = nysg, nyng
[1]443                tend(nzb_s_inner(j,i),j,i) = 0.0
444             ENDDO
445          ENDDO
446
447       ENDIF
448
449!
450!--    Top boundary
451       IF ( ibc_p_t == 1 )  THEN
452!
453!--       Neumann
454          !$OMP PARALLEL DO
[667]455          DO  i = nxlg, nxrg
456             DO  j = nysg, nyng
[1]457                tend(nzt+1,j,i) = tend(nzt,j,i)
458             ENDDO
459          ENDDO
460
461       ELSE
462!
463!--       Dirichlet
464          !$OMP PARALLEL DO
[667]465          DO  i = nxlg, nxrg
466             DO  j = nysg, nyng
[1]467                tend(nzt+1,j,i) = 0.0
468             ENDDO
469          ENDDO
470
471       ENDIF
472
473!
474!--    Exchange boundaries for p
[667]475       CALL exchange_horiz( tend, nbgp )
[1]476     
477    ELSEIF ( psolver == 'sor' )  THEN
478
479!
480!--    Solve Poisson equation for perturbation pressure using SOR-Red/Black
481!--    scheme
[707]482       CALL sor( d, ddzu_pres, ddzw, p_loc )
483       tend = p_loc
[1]484
485    ELSEIF ( psolver == 'multigrid' )  THEN
486
487!
488!--    Solve Poisson equation for perturbation pressure using Multigrid scheme,
[667]489!--    array tend is used to store the residuals, logical exchange_mg is used
490!--    to distinguish between data exchange in multigrid ( 1 ghost point ) and
491!--    normal grid ( nbgp ghost points ).
[778]492
493!--    If the number of grid points of the gathered grid, which is collected
494!--    on PE0, is larger than the number of grid points of a PE, then array
495!--    tend will be enlarged.
496       IF ( gathered_size > subdomain_size )  THEN
497          DEALLOCATE( tend )
498          ALLOCATE( tend(nzb:nzt_mg(mg_switch_to_pe0_level)+1,nys_mg(          &
499                    mg_switch_to_pe0_level)-1:nyn_mg(mg_switch_to_pe0_level)+1,&
500                    nxl_mg(mg_switch_to_pe0_level)-1:nxr_mg(                   &
501                    mg_switch_to_pe0_level)+1) )
502       ENDIF
503
[1]504       CALL poismg( tend )
[707]505
[778]506       IF ( gathered_size > subdomain_size )  THEN
507          DEALLOCATE( tend )
508          ALLOCATE( tend(nzb:nzt+1,nysg:nyng,nxlg:nxrg) )
509       ENDIF
510
[1]511!
512!--    Restore perturbation pressure on tend because this array is used
513!--    further below to correct the velocity fields
[707]514       DO  i = nxl-1, nxr+1
515          DO  j = nys-1, nyn+1
516             DO  k = nzb, nzt+1
517                tend(k,j,i) = p_loc(k,j,i)
518             ENDDO
519          ENDDO
520       ENDDO
[667]521
[1]522    ENDIF
523
524!
[707]525!-- Store perturbation pressure on array p, used for pressure data output.
526!-- Ghost layers are added in the output routines (except sor-method: see below)
527    IF ( intermediate_timestep_count == 1 )  THEN
528       !$OMP PARALLEL PRIVATE (i,j,k)
529       !$OMP DO
530       DO  i = nxl-1, nxr+1
531          DO  j = nys-1, nyn+1
532             DO  k = nzb, nzt+1
533                p(k,j,i) = tend(k,j,i) * &
534                           weight_substep(intermediate_timestep_count)
[673]535             ENDDO
[1]536          ENDDO
[707]537       ENDDO
538       !$OMP END PARALLEL
539
540    ELSE 
541       !$OMP PARALLEL PRIVATE (i,j,k)
542       !$OMP DO
543       DO  i = nxl-1, nxr+1
544          DO  j = nys-1, nyn+1
545             DO  k = nzb, nzt+1
546                p(k,j,i) = p(k,j,i) + tend(k,j,i) * &
547                           weight_substep(intermediate_timestep_count)
[673]548             ENDDO
549          ENDDO
[707]550       ENDDO
551       !$OMP END PARALLEL
552
553    ENDIF
[673]554       
[707]555!
556!-- SOR-method needs ghost layers for the next timestep
557    IF ( psolver == 'sor' )  CALL exchange_horiz( p, nbgp )
[682]558
[1]559!
560!-- Correction of the provisional velocities with the current perturbation
561!-- pressure just computed
[1111]562    !$acc update host( u, v, w )
[709]563    IF ( conserve_volume_flow  .AND.  ( bc_lr_cyc .OR. bc_ns_cyc ) )  THEN
[1]564       volume_flow_l(1) = 0.0
565       volume_flow_l(2) = 0.0
566    ENDIF
[707]567
[1]568    !$OMP PARALLEL PRIVATE (i,j,k)
569    !$OMP DO
[673]570    DO  i = nxl, nxr   
[1]571       DO  j = nys, nyn
572          DO  k = nzb_w_inner(j,i)+1, nzt
[707]573             w(k,j,i) = w(k,j,i) - dt_3d *                                 &
574                        ( tend(k+1,j,i) - tend(k,j,i) ) * ddzu(k+1) *      &
575                        weight_pres(intermediate_timestep_count)
[1]576          ENDDO
577          DO  k = nzb_u_inner(j,i)+1, nzt
[673]578             u(k,j,i) = u(k,j,i) - dt_3d *                                 &
[707]579                        ( tend(k,j,i) - tend(k,j,i-1) ) * ddx *            &
580                        weight_pres(intermediate_timestep_count)
[1]581          ENDDO
582          DO  k = nzb_v_inner(j,i)+1, nzt
[673]583             v(k,j,i) = v(k,j,i) - dt_3d *                                 &
[707]584                        ( tend(k,j,i) - tend(k,j-1,i) ) * ddy *            &
585                        weight_pres(intermediate_timestep_count)
[673]586          ENDDO                                                         
[1]587!
588!--       Sum up the volume flow through the right and north boundary
[707]589          IF ( conserve_volume_flow  .AND.  bc_lr_cyc  .AND.  bc_ns_cyc  .AND. &
590               i == nx )  THEN
[1]591             !$OMP CRITICAL
592             DO  k = nzb_2d(j,i) + 1, nzt
[667]593                volume_flow_l(1) = volume_flow_l(1) + u(k,j,i) * dzw(k)
[1]594             ENDDO
595             !$OMP END CRITICAL
596          ENDIF
[707]597          IF ( conserve_volume_flow  .AND.  bc_ns_cyc  .AND.  bc_lr_cyc  .AND. &
598               j == ny )  THEN
[1]599             !$OMP CRITICAL
600             DO  k = nzb_2d(j,i) + 1, nzt
[667]601                volume_flow_l(2) = volume_flow_l(2) + v(k,j,i) * dzw(k)
[1]602             ENDDO
603             !$OMP END CRITICAL
604          ENDIF
605
606       ENDDO
607    ENDDO
608    !$OMP END PARALLEL
[673]609   
[1]610!
611!-- Conserve the volume flow
[707]612    IF ( conserve_volume_flow  .AND.  ( bc_lr_cyc  .AND.  bc_ns_cyc ) )  THEN
[1]613
614#if defined( __parallel )   
[622]615       IF ( collective_wait )  CALL MPI_BARRIER( comm2d, ierr )
[1]616       CALL MPI_ALLREDUCE( volume_flow_l(1), volume_flow(1), 2, MPI_REAL, &
617                           MPI_SUM, comm2d, ierr ) 
618#else
619       volume_flow = volume_flow_l 
620#endif   
621
622       volume_flow_offset = ( volume_flow_initial - volume_flow ) / &
623                            volume_flow_area
624
625       !$OMP PARALLEL PRIVATE (i,j,k)
626       !$OMP DO
627       DO  i = nxl, nxr
628          DO  j = nys, nyn
[667]629             DO  k = nzb_u_inner(j,i) + 1, nzt
630                u(k,j,i) = u(k,j,i) + volume_flow_offset(1)
[719]631             ENDDO
632             DO k = nzb_v_inner(j,i) + 1, nzt
[667]633                v(k,j,i) = v(k,j,i) + volume_flow_offset(2)
634             ENDDO
[1]635          ENDDO
636       ENDDO
[667]637
[1]638       !$OMP END PARALLEL
639
640    ENDIF
641
642!
643!-- Exchange of boundaries for the velocities
[667]644    CALL exchange_horiz( u, nbgp )
645    CALL exchange_horiz( v, nbgp )
646    CALL exchange_horiz( w, nbgp )
[1]647
648!
649!-- Compute the divergence of the corrected velocity field,
650!-- a possible PE-sum is computed in flow_statistics
651    CALL cpu_log( log_point_s(1), 'divergence', 'start' )
652    sums_divnew_l = 0.0
653
654!
655!-- d must be reset to zero because it can contain nonzero values below the
656!-- topography
657    IF ( topography /= 'flat' )  d = 0.0
658
659    localsum  = 0.0
660    threadsum = 0.0
661
662    !$OMP PARALLEL PRIVATE (i,j,k) FIRSTPRIVATE(threadsum) REDUCTION(+:localsum)
663    !$OMP DO SCHEDULE( STATIC )
664#if defined( __ibm )
665    DO  i = nxl, nxr
666       DO  j = nys, nyn
667          DO  k = nzb_s_inner(j,i)+1, nzt
668             d(k,j,i) = ( u(k,j,i+1) - u(k,j,i) ) * ddx + &
669                        ( v(k,j+1,i) - v(k,j,i) ) * ddy + &
670                        ( w(k,j,i) - w(k-1,j,i) ) * ddzw(k)
671          ENDDO
672          DO  k = nzb+1, nzt
673             threadsum = threadsum + ABS( d(k,j,i) )
674          ENDDO
675       ENDDO
676    ENDDO
677#else
678    DO  i = nxl, nxr
679       DO  j = nys, nyn
680          DO  k = nzb_s_inner(j,i)+1, nzt
681             d(k,j,i) = ( u(k,j,i+1) - u(k,j,i) ) * ddx + &
682                        ( v(k,j+1,i) - v(k,j,i) ) * ddy + &
683                        ( w(k,j,i) - w(k-1,j,i) ) * ddzw(k)
684             threadsum = threadsum + ABS( d(k,j,i) )
685          ENDDO
686       ENDDO
687    ENDDO
688#endif
[667]689
[1]690    localsum = localsum + threadsum
691    !$OMP END PARALLEL
692
693!
694!-- For completeness, set the divergence sum of all statistic regions to those
695!-- of the total domain
696    sums_divnew_l(0:statistic_regions) = localsum
697
698    CALL cpu_log( log_point_s(1), 'divergence', 'stop' )
699
[1111]700    !$acc update device( u, v, w )
701
[1]702    CALL cpu_log( log_point(8), 'pres', 'stop' )
[667]703   
[1]704
705
706 END SUBROUTINE pres