Status quo

Current release

The release candidate of PALM-4U was released in October 2018.

The release candidate includes the following new components:

  • Multi Agent System
  • RANS mode (TKE-epsilon closure)
  • Indoor climate and energy demand module
  • Emission module
  • Aerosol physics/chemistry (SALSA)
  • RANS-LES and RANS-RANS nesting
  • Biometeorology output
  • Virtual measurement module
  • Graphical User Interface (GUI) for users from practice

Validation runs

VDI 3783 Part 9

Status message: completed

Description Date of issue Closing date
Validation completed 23/07/2019 23/07/2019

Validation protocol and details: here

Run 01 (VALM01): Winter 2017 Berlin, Jan 17 06:00 UTC - Jan 18 06:00 UTC

Status message: analyzing results

Description Date of issue Closing date Further remarks
Start of VALM01 testing 01/01/2019
... ...
Crash in nested runs. April-May ... Smaller test simulations run well.
Bug in radiation in a nested run 21/05/2019 22/05/2019
... ...
Fix numerical issues that lead to unrealistic concentrations of chemical compounds in case of (offline) nesting. July end of July
Memory demand for calculation of view factors (radiative transfer) 01/08/2019 03/08/2019 OOM killer aborted processes randomly. Not using all cores on a node fixed this.
Parent and child grids do not overlap (required after revision of the nesting) 05/08/2019 05/08/2019 New child drivers are required as number of grid points changed.
Driver problem with bridges at the boundary. 05/08/2019 05/08/2019 Child was moved a few meters northward.
Bug in building parameters, wrong dimension in static input file 08/08/2019 08/08/2019
Bug when green roofs are present. 09/08/2019 10/10/2019 Green roofs were disabled in the simulation. Fixed now.
MPI network problems on the Cray machine in Berlin 10/08/2019 19/08/2019 Recurring MPI failures at different locations in the code; they appeared only in the large winter IOP simulations, not in smaller ones. As a consequence, all runs were carried out on the Atos machine in Göttingen.
Minor bug in new implementation of external radiative forcing 21/08/2019 21/08/2019
Failure due to MPI errors. 05/09/2019 21/09/2019 Appeared also in smaller test simulations. Needs to be fixed on the HLRN side.
Emission module caused model crash 23/09/2019 27/09/2019
MPI error just at the beginning of the run - HLRN internal problem: ofi fabric is not available ... 24/09/2019 ? Error does not appear any more.
Crashed with error message corrupted double-linked list 01/10/2019 07/10/2019 It seems that scheduling is not working properly; the job had been queued for 4 days. Further debug messages were implemented to narrow down the location. The message comes from the parent domain. However, memory-consuming sky-view factors were calculated.
Crashed with An allocatable array is already allocated 09/10/2019 10/10/2019
Crashed by an MPI error 13/10/2019 Parent finished initialization. Crashes in an MPI_ALLGATHER call in surface_data_output_init. Might be connected to the HLRN-network problem (14/10/2019).
Crashed again with error message corrupted double-linked list in child simulation. 17/10/2019 Parent finished initialization. Crashes again in surface_data_output_init. Next step: switch-off surface-data output as this has no priority at the moment. Note, due to limited resources on HLRN site, the queuing times are quite long for simulations, sometimes several days.
Crashed with Floating divide by zero 23/10/2019 Error seems to be raised within routine drydepo_aero_zhang_vd. Error occurs after time stepping started (initialization finished). Further debugging for this error is ongoing.
Start child-only simulation 23/10/2019 Due to continuous errors within the nested simulation, a non-nested (child-only) simulation is started to get first results for evaluation. Simulation is still running (06/11/2019).
Finished child-only simulation 27/11/2019 Simulation crashed at 11:57:34.95 UTC with input/output error. Data up to that point is saved.
Crashes by MPI_INIT 03/11/2019 Simulation crashed several times in MPI_INIT (environment problems).
Crash 03/12/2019 Program aborted due to the check of surface_fractions. The check was revised so that surface fractions can also be set at building grid points.
Crash 06/12/2019 HDF5 error - could not be reproduced.
Crash 08/12/2019 Floating invalid in advection for u-component at first timestep. Unfortunately, this error could not be reproduced. Remark: Jobs were queued for about a week on HLRN due to too low capacities, so that investigations and bug tracing were delayed.
Parent simulation 23/12/2019 Proceed investigation on HLRN Berlin. Parent simulation runs for an hour, results look plausible.
Nested simulation - numerical issues 27/12/2019 Nested simulation ran for 1 minute. However, large oscillations in the u- and v-components could be observed within the child. I hypothesize that this is due to the 3D-initialization of the child from the parent. Due to mismatches in the building configuration (due to the large grid aspect ratio), many grid points in the child remain zero after initialization, even though these grid points belong to the atmosphere. Since the mass-flux is largely affected by this, strong oscillations arise within the child, finally leading to a crash.
Nested simulation 02/01/2020 The Lustre system in Berlin is still not fully set up, so that simulations repeatedly hang / crash due to the slow filesystem.
Nested simulation - initial run 03/01/2020 Lustre filesystem issues seem to be solved for now. Initialization of the child has been changed. Child is now initialized via dynamic driver rather than via the coupler. This way all atmosphere grid points are initialized appropriately. The nested simulation is at t=30min. First estimate of duration: in 12 h real time on 6720 cores we will simulate about 1 h. With 30 hrs simulation time (00:00:00 UTC - 06:00:00 UTC, next day), we will need about 30 restarts. Since the machine in Berlin is starting to fill up with other users, we will only be able to run 1 simulation per day (optimistic scenario), so this will take at least one month.
Nested simulation - restart run 09/01/2020 Simulation crashes in reading the restart data for one PE in the child.
Nested simulation - restart run 29/01/2020 After recurrent maintenance-related breaks on HLRN, the restart simulation was started again. The simulation alternately crashes either with an HDF5 error in the parent or in reading the restart data. In the parent this happens while reading the NetCDF input data. At most of the ranks there is no problem with the NetCDF input; however, at some ranks NF90_INQUIRE and NF90_INQUIRE_VARIABLE produce NetCDF error codes. In the child, the error is reproducible: even if the initial simulation is run again, the problem occurs. This happens only at specific ranks. We will downscale the simulation to debug this more efficiently. (Un)fortunately, these problems no longer occur now that HLRN runs more stably, so the reason for these crashes cannot be traced back.
Nested simulation 06/02/2020 After several fixes on HLRN side, I started the whole simulation with debug prints again. Initial simulation did not show any problems. The following restart run also ran fine, no problem with NF90_INQUIRE as well as with empty binary files. The second restart run is queued now. We are at t ~ 2940 s.
Nested simulation 10/02/2020 We are at 03:00 UTC. Model run crashed in biometeorology_mod at first timestep after restart. The crash could be traced back to a NaN in pt_av at a single grid point. All other quantities, including pt, look reasonable.
Nested simulation 27/02/2020 Simulation was started again. This time we reached 04:00 UTC. Simulation crashes now again after a restart in reading the array "surf_h(0)%end_index", where some unreasonable values occur. On all other processes values for this array look correct.
Nested simulation 12/03/2020 After several optimizations were made in the synthetic turbulence generator and some minor bugs were fixed, the simulation was started again. The Berlin complex is under maintenance now.
Nested simulation 25/03/2020 Simulation was running until exactly 05:00 UTC. Crashed by floating overflow in the child domain. Last restart time was at 04:55 UTC; flow fields and surface data look reasonable. Restarting from the last restart step using the traceback option and print statements revealed a floating overflow in the output of the averaged 3D variable 'theta' at grid point (k,j,i) = (97,117,968), which is far away from any building. Presumably this is also related to a restart problem where faulty data is read for pt_av. Proceeding without averaged data output worked.
Nested simulation 01/04/2020 Simulation has reached 06:05 UTC. At the moment we are out of computing time. The IOP has been started, i.e. measurements are output. However, it turned out that the unstructured output of the virtual measurements currently consumes far too much CPU time. With a smaller number of processes in test simulations this did not become obvious; with a large number of processes, however, the probability that IO processes interfere with each other becomes higher, so that the slowdown of IO becomes more pronounced. First we need to accelerate the output before we can proceed. Moreover, with further debugging the reason for the restart failures could most probably be narrowed down to file-system issues rather than PALM-internal problems (sending trouble ticket to the computing center).
Nested simulation 02/06/2020 Simulation has reached 06:40 UTC. Output issues are solved and we have CPU time again.
Nested simulation 25/06/2020 Simulation is at 01:51 UTC (2nd day). Simulation is stopped because we have run out of computing time.
Nested simulation 08/07/2020 Simulation is still at 01:51 UTC (2nd day). We got new computing time resources on 1st of July, but the Lise system has now been down for several days due to file system problems, so jobs cannot be executed.
Nested simulation 17/07/2020 Simulation is at 02:10 UTC (2nd day). Data output on the Lise system has been extremely slow since 13/07/2020, so that the progress made in a simulation run is only 2-3 min (instead of the 1 h achieved before the system maintenance).
Nested simulation 23/07/2020 Simulation is at 05:41 UTC (2nd day). The slow data output on the Lise system has been resolved since the last reboot on 17/07/2020.
Nested simulation 23/07/2020 Simulation is at 06:00 UTC (2nd day) - finished. Output files need to be concatenated.
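
The duration estimate in the 03/01/2020 entry (about 1 h of simulated time per 12 h job on 6720 cores, 30 h of simulated time in total, one job per day) can be reproduced with a few lines of arithmetic. The following is only a sketch: the function name and its parameters are hypothetical and not part of PALM-4U, and the core-hour figure is derived here under the assumption that every job uses its full 12 h of wall-clock time. It is not a number reported in the log.

```python
import math


def estimate_campaign(simulated_hours, sim_hours_per_job,
                      wall_hours_per_job, cores, jobs_per_day):
    """Rough estimate of job count, calendar time, and core-hours.

    Hypothetical helper for illustration; all inputs are taken from
    the 03/01/2020 log entry, the formulae are plain arithmetic.
    """
    # Number of job submissions (initial run plus restarts).
    jobs = math.ceil(simulated_hours / sim_hours_per_job)
    return {
        "jobs": jobs,
        # Calendar days if only `jobs_per_day` jobs get through the queue.
        "calendar_days": math.ceil(jobs / jobs_per_day),
        # Assumes each job consumes its full wall-clock allocation.
        "core_hours": jobs * wall_hours_per_job * cores,
    }


estimate = estimate_campaign(
    simulated_hours=30,      # 00:00 UTC to 06:00 UTC on the next day
    sim_hours_per_job=1.0,   # about 1 h simulated per job
    wall_hours_per_job=12,   # 12 h real time per job
    cores=6720,
    jobs_per_day=1,          # optimistic scenario from the log
)
print(estimate)
```

With these inputs the sketch yields 30 jobs and 30 calendar days, which matches the "at least one month" stated in the log. Queueing delays and crashes, as documented in the later entries, are not modelled.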

Run 02 (VALM02): Summer 2018 Berlin, Jul 16 06:00 UTC - Jul 18 06:00 UTC

Status message: simulation running

Description Date of issue Closing date Further remarks
Preparing input files for VALM02 30/01/2020
Dynamic driver 13/03/2020 17/04/2020 Error in inifor prevents dynamic-driver creation: "inifor: ERROR: PALM-4U grid extends above COSMO-DE model top." Bug-fixing is in progress. DWD created a preliminary driver with which further testing can be done.
Dynamic driver 17/04/2020 04/05/2020 Further errors in inifor prevent final dynamic-driver creation. Bugfixes in INIFOR and computer setup solved with help from DWD.
Setup creation 20/05/2020 Defining details of setup like domain height, boundary conditions, technical setup.
Nested simulation 06/08/2020 Simulation is now at 00:30 UTC (1st day). Problems during wall/soil spinup have been solved (timestep was too large). Instabilities during wall/soil spinup are investigated separately.
Nested simulation 12/08/2020 Simulation is at 02:58 UTC (1st day)
Error in rtm concerning svf calculation 12/08/2020 22/08/2020 Due to changes in CPU layout, SVF needed to be re-calculated. During this step, MPI errors occurred because too much memory was required by the MPI calls. This is now solved by reducing the number of view angles for SVF calculation.
Nested simulation 27/08/2020 Simulation is at 04:58 UTC (1st day)
Nested simulation 03/09/2020 01/10/2020 Concentration of chemical compounds reached unrealistic values. This was caused by an error within the offline nesting.
Nested simulation 01/10/2020 07/10/2020 Bug within the indoor model during spinup detected and fixed.
Nested simulation 07/10/2020 - Update of simulation setup after discussion with the evaluation working group and Module B. Now two child domains are simulated (ERP and Steglitz).
Nested simulation 14/10/2020 30/10/2020 Further bugs within the indoor model and the driver setup detected and fixed.
Nested simulation 02/11/2020 - Further updates of the simulation setup due to performance imbalance between parent and child. Child domains were extended and now cover more measurements (discussed with Module B).
Nested simulation 03/11/2020 13/11/2020 Bugs and outdated data in building surface parameters fixed/updated.
Nested simulation 13/11/2020 19/11/2020 Reading restart data takes more time than the actual simulation when using the MPI shared-memory method. Debugging started. For now, switching back to the old restart mechanism.
Nested simulation 27/11/2020 03/12/2020 Erroneous/missing input data for the biometeorology module caused a crash during the simulation. Bugs are temporarily fixed.
Nested simulation 30/11/2020 07/12/2020 Error in photolysis scheme: Module B detected that there is only a simple scheme available without accounting for shadowing. Due to the extensive re-coding required to account for this, the simulation is continued without 3D photolysis.
Nested simulation 02/12/2020 07/12/2020 Building setup needs further update: air conditioning for office buildings constructed before 2000 must be turned off to give more realistic waste-heat fluxes in Berlin. Building parameters were updated accordingly.
Nested simulation 11/12/2020 - Simulation reached 04:05 UTC (1st day)
Nested simulation 08/01/2021 27/01/2021 Simulation reached 12:44 UTC. Simulation paused for intermediate check of output quantities. Virtual measurements checked by Module B (positions and completeness). Resumed simulation on 27/01/2021.
Nested simulation 11/02/2021 - Simulation reached 11:00 UTC (2nd day). Simulation is finished.
Nested simulation 12/02/2021 ongoing Output is prepared for analysis and shipment to other project partners.

Run 03 (VALM03): Winter 2017 Stuttgart, Feb 14 06:00 UTC - Feb 16 06:00 UTC

Status message: hold

Description Date of issue Closing date Further remarks
Preparing input files - -

Run 04 (VALM04): Summer 2018 Stuttgart, Jul 08 04:00 UTC - Jul 09 19:00 UTC

Status message: preparing

Description Date of issue Closing date Further remarks
Preparing input files 28/01/2021 Missing input data are acquired by different project partners.
Setup creation 11/02/2021 - First discussion round to define simulation domain and first details of setup.

Run 05 (VALM05): Hamburg, Wind tunnel

Status message: completed

Description Date of issue Closing date
Production run 18/04/2019 29/04/2019

Run 06 (VALM06): Summer 2017 Berlin, Jul 30 06:00 UTC - Aug 01 06:00 UTC

Status message: unscheduled

Last modified on Feb 19, 2021 12:27:34 PM


  | Impressum | ©Leibniz Universität Hannover |