= Hints for using the Cray-XC30 at HLRN =

 * Known problems
 * Performance issues with runs using larger numbers of cores
 * Running remote jobs
 * Fortran issues
 * Parallel NetCDF I/O
 * Output problem with combine_plot_fields
 * How to use the Allinea debugger
 * How to check memory usage of a run
\\
== Known problems ==

The progress bar output may cause problems (hanging or aborting jobs) in case of runs with larger core numbers. If you run into such problems, please report them immediately.

== Performance issues with runs using larger numbers of cores ==

 * '''FFT pressure solver'''

Runs using the FFT pressure solver with core numbers > 10,000 may show substantially improved performance if the MPI environment variable {{{MPICH_GNI_MAX_EAGER_MSG_SIZE=16384}}} is set (the default value on the XC30 is 8192). It changes the threshold value below which the data transfer with {{{MPI_ALLTOALL}}} uses the eager instead of the rendezvous protocol. The setting can be realized by including an additional line in the configuration file, e.g.:
{{{
IC:export MPICH_GNI_MAX_EAGER_MSG_SIZE=16384
}}}

 * '''MPI one-sided communication (MPI-RMA)'''

The raytracing algorithm in the radiation model uses MPI one-sided communication (MPI-RMA) to calculate the sky view factors (SVF) as well as the canopy sink factors (CSF). Performance degradation may occur when using large numbers of cores, and the model may halt during the calculation of the SVF/CSF. In order to restore performance, special settings of some environment variables are required. These settings can be realized by including the following additional lines in the configuration file:
{{{
IC:export MPICH_RMA_OVER_DMAPP=1
IC:export MPICH_RMA_USE_NETWORK_AMO=1
}}}

== Running remote jobs ==

Since Tuesday, June 14 2016, login to the HLRN-III via password authentication has been disabled. You will need to use an SSH key. For specific step-by-step instructions on how to establish passwordless SSH access to the HLRN-III see https://www.hlrn.de/home/view/System3/PubkeyLogin. For general instructions on establishing passwordless SSH login between two hosts, please click [[wiki:doc/install/passwordless|here]].
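As an illustration only, the general procedure looks roughly like the following sketch. It is not the authoritative HLRN recipe: the key type, key size, file names, the host alias {{{hlrn-b}}} and the login host name are assumptions, and the way the public key is registered at HLRN may differ (see the PubkeyLogin page linked above).
{{{
# Generate a key pair on your local machine (key type, size and file name are
# only examples; check the HLRN documentation for their exact requirements).
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa_hlrn

# The public key ~/.ssh/id_rsa_hlrn.pub then has to be made known to the
# HLRN-III as described on the PubkeyLogin page linked above.

# Optional: add a host entry so that ssh/scp (and thus remote mrun jobs)
# pick up the key automatically. <user> and the host name are placeholders.
cat >> ~/.ssh/config << 'EOF'
Host hlrn-b
    HostName blogin.hlrn.de
    User <user>
    IdentityFile ~/.ssh/id_rsa_hlrn
EOF

# Test the passwordless login.
ssh hlrn-b
}}}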
== Fortran issues ==

The Cray Fortran compiler (ftn) on HLRN-III is known to be less flexible when it comes to the Fortran code style. In the following you find known issues observed at HLRN-III.

 * It is no longer allowed to use a space character between the variable name of an array (e.g. {{{mask_x_loop}}}) and the bracket "{{{(}}}".\\'''Example:'''\\{{{mask_x_loop (1,:) = 0., 500., 50.,}}} (old)\\{{{mask_x_loop(1,:) = 0., 500., 50.,}}} (new)
 * It is no longer possible to use {{{==}}} or {{{.EQ.}}} for the comparison of variables of type LOGICAL.\\'''Example:'''\\{{{IF ( variable == .TRUE. ) THEN}}} is not supported. You must use {{{IF ( variable ) THEN}}} (or {{{IF ( .NOT. variable ) THEN}}}) instead.
\\
== Parallel NetCDF I/O ==

 * see the hints given in the attachments
\\
== Output problem with combine_plot_fields ==

'''This problem is solved in revision 1270.'''

The output of 2D or 3D data with PALM may cause the following error message in the job protocol:
{{{
*** post-processing: now executing "combine_plot_fields_parallel.x"
..../mrun: line 3923: 30156: Memory fault
}}}
"/mrun: line 3923:" refers to the line where combine_plot_fields is called in the mrun script (the line number may vary with the script version). Since each processor opens its own output file and writes its 2D or 3D binary data into it, the routine combine_plot_fields combines these output files into one single file. The output format is netCDF. The reason for this error is that combine_plot_fields is started on the Cray system management (MOM) nodes, where the stack size is limited to 8 Mbytes. This value is exceeded, e.g., if a cross section has more than 1024 x 1024 grid points. The '''stack size should not be increased''', otherwise the system may crash (see the HLRN site for more information). To start combine_plot_fields on the computing nodes, '''aprun''' is required (so far, combine_plot_fields is not started with aprun in PALM). For the moment we recommend to carry out the following steps (a shell sketch of steps 2 and 4 is given after the list):

 1. When starting the job, keep the temporary directory by using the following option:
{{{
mrun ... -B
}}}
 2. After the job has finished, the executable file {{{combine_plot_fields_<condition>.x}}} has to be copied from trunk/SCRIPTS/ to the temporary directory. {{{<condition>}}} is given in the .mrun.config in column five (and six), e.g. parallel. The location of the temporary directory is given by %tmp_user_catalog in the .mrun.config.
 3. Create a batch script which uses '''aprun''' to start the executable file, e.g. like this:
{{{
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -q mpp1q
#PBS -l walltime=00:30:00
#PBS -l partition=berlin

cd <%tmp_user_catalog>
aprun -n 1 -N 1 ./combine_plot_fields_<condition>.x
}}}
 '''Attention''': Use only the batch queues mpp1q or testq, otherwise it may not work.
 4. After running the batch script, the following files should be available in the temporary directory (depending on the output chosen during the simulation): DATA_2D_XY_NETCDF, DATA_2D_XZ_NETCDF, DATA_2D_YZ_NETCDF, DATA_2D_XY_AV_NETCDF, DATA_2D_XZ_AV_NETCDF and DATA_2D_YZ_AV_NETCDF. You can copy these files to the standard output directory and rename them, e.g. DATA_2D_XY_NETCDF to {{{<run_identifier>_xy.nc}}}.
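A hedged shell sketch of the file operations in steps 2 and 4: all paths, the condition string "parallel" and the placeholders {{{<temporary_directory>}}}, {{{<output_directory>}}} and {{{<run_identifier>}}} are assumptions; take the actual values from your own {{{.mrun.config}}}.
{{{
# Step 2: copy the combine executable into the temporary directory of the run
# ("parallel" is assumed as the condition string; the source path and the
#  target directory given by %tmp_user_catalog are examples only).
cp ~/palm/current_version/trunk/SCRIPTS/combine_plot_fields_parallel.x  <temporary_directory>/

# Step 4: after the aprun batch job has finished, copy the combined netCDF
# files to your output directory and rename them, e.g. the xy cross sections:
cd <temporary_directory>
cp DATA_2D_XY_NETCDF  <output_directory>/<run_identifier>_xy.nc
}}}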
== How to use the '''allinea'''-debugger on hlogin and blogin ==

Starting from Rev 1550, PALM allows the use of the '''allinea'''-debugger on hlogin and blogin within interactive sessions. The following gives brief instructions on how to apply the '''allinea'''-debugger:

 1. Add an additional block {{{"lccrayb parallel debug"}}} (please note that the "debug" is mandatory) to the '''mrun''' configuration file {{{.mrun.config}}} (analogously for lccrayh). The new block has to contain the line:
{{{
%allinea      true      lccrayb parallel debug
}}}
 Moreover, add the module ddt to the %modules flag as indicated by the following:
{{{
%modules      ddt:fftw: ...      lccrayb parallel debug
}}}
 The program should be compiled with option -g (in this case allinea is able to show the exact line where an error occurs) as well as with option -O0 (to prevent the code from being reordered in surprising ways).
 2. Copy {{{.mrun.config}}} into the directory {{{~/palm/current_version}}} on hlogin/blogin. Also copy the parameter file and other files required for the run to the respective subdirectories under {{{~/palm/current_version}}} (e.g. {{{JOBS/USERCODE...}}}).
 3. Log in on hlogin/blogin (it is essential to use "{{{-X}}}" as ssh option!) and execute the following commands to launch an interactive session on the computing nodes (e.g. for a debug run with 4 cores on one node):
{{{
msub -I -X -l nodes=1:ppn=4 -l walltime=1000 -q mpp1testq   # starts a so-called interactive job
module load ddt
module load fftw
module load cray-hdf5-parallel
module load cray-netcdf-hdf5parallel
mrun -d ....   # usual mrun call, options as required by the user, but WITHOUT option -h and WITHOUT option -b
               # values given for the -X and -T options must match the msub settings,
               # e.g. in this case "-X4 -T4"
}}}
 After a short time, the '''allinea''' window should open (if mpp1testq is filled with other jobs, you may have to wait for a longer time; alternatively, you can also try to run on mpp2testq).
 4. Within the '''allinea''' window go to ''Application'' and select ''a.out'' (located in the current working directory).
 5. Please remove the checkmark at 'Submit to Queue' since you are running an interactive job.
 6. Now you can "RUN" '''allinea'''. Enjoy debugging.
 7. After closing the '''allinea''' session, do not forget to leave the interactive job with the "{{{exit}}}" command. If you did not use the entire requested {{{walltime}}} for debugging, you should cancel your interactive session on the computing nodes with the "{{{canceljob}}}" command.

The HLRN-III provides a brief online documentation for '''allinea''' (see https://www.hlrn.de/home/view/System3/AllineaDDT for details).

== How to check the memory usage of a run ==

Checking the memory usage might be interesting for large jobs, especially in case you experience memory problems. How to (a consolidated sketch is given after the list):

 1. Log in to a login node (e.g. blogin1).
 2. ssh bxcmom0{1..4} --> choose one of the MOM nodes 1-4.
 3. module load nodehealth
 4. xtnodestat | grep nikfarah --> replace nikfarah by your user name; this yields the application id (apid) of your run.
 5. pcmd -a <apid> 'ps -u nikfarah -o pid,time,cmd,rss,vsize,pmem' --> <apid> is the application process id given by the command in 4; again, replace nikfarah by your user name.
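Put together, such a session might look like the following sketch. The chosen MOM node, the user name nikfarah and the apid are only examples taken from the steps above; use your own values.
{{{
# Log in to a login node and from there to one of the MOM nodes (bxcmom01-04).
ssh blogin1
ssh bxcmom02

# Load the node health tools which provide pcmd.
module load nodehealth

# Find the application id (apid) of your run (replace the user name by your own).
xtnodestat | grep nikfarah

# Show the memory usage (rss, vsize, pmem) of your processes on the nodes of
# that application; replace <apid> by the id found above.
pcmd -a <apid> 'ps -u nikfarah -o pid,time,cmd,rss,vsize,pmem'
}}}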