Changes between Version 14 and Version 15 of doc/app/palm_config


Timestamp: Aug 16, 2018 2:39:28 PM
Author: raasch
  • doc/app/palm_config

* lines starting with {{{EC:}}} define unix commands that are executed in case the PALM code or the {{{palmrun}}} script terminates because of any kind of error. You can restrict execution of error commands to specific kinds of error, e.g. errors that appear during PALM execution:
{{{
EC:[[ \$locat = execution ]]  &&  error-command
}}}
See the {{{palmrun}}} source code for the other specific values of {{{locat}}} that are used in this script.
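For illustration, a hypothetical error command that notifies you by e-mail whenever PALM itself aborts during execution might look like this (the mail address is a placeholder, and the command itself is arbitrary shell code):
{{{
# notify the user if PALM aborted during execution (address is a placeholder)
EC:[[ \$locat = execution ]]  &&  echo "PALM aborted" | mail -s "palmrun error" user@example.com
}}}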

* lines starting with {{{BD:}}} define directives that are required for batch jobs, i.e. if PALM is to be run in batch mode. Explanations of the batch directives are given further below.
     
||linker_options    ||Compiler options to be used to link the PALM executable. Typically, these are paths to libraries used by PALM, e.g. NetCDF, FFTW, MPI, etc. You may repeat the options that you have given with {{{compiler_options}}} here. See your local system documentation / software manuals for the required path settings. Requirements differ from system to system and also depend on the respective libraries that you are using. See [wiki:doc/app/recommended_compiler_options] for the specific path settings that we, the PALM group, are using on our computers. Be aware that these settings will probably not work on your computer system.  ||no default value  ||
||'''hostfile'''          ||'''Name of the hostfile that is used by MPI to determine the nodes on which the MPI processes are started.'''\\\\ {{{palmrun}}} automatically generates the hostfile if you set {{{auto}}}. All MPI processes will then be started on the node on which {{{palmrun}}} is executed. The real name of the hostfile will then be set to {{{hostfile}}} (instead of {{{auto}}}) and, depending on your local MPI implementation, you may have to give this name in the {{{execute_command}}}. MPI implementations at large computer centers often do not require a hostfile to be specified explicitly (in such a case you can remove this line from the configuration file), or the batch system provides a hostfile whose name you may access via environment variables (e.g. {{{$PBS_NODEFILE}}}) and which needs to be given in the {{{execute_command}}}. Please see your local system / batch system documentation about the hostfile policy on your system.  ||no default value  ||
||execute_command   ||MPI command to start the PALM executable. \\  Please see your local MPI documentation about which command needs to be used on your system. The name of the PALM executable, usually the last argument of the execute command, must be {{{palm}}}. Typically, the command requires several further options, like the number of MPI processes to be started or the number of compute nodes to be used. Values of these options may change from run to run. Do not give specific values here; use variables (written in double curly brackets) instead, which will automatically be replaced by {{{palmrun}}} with the values that you have specified with the respective {{{palmrun}}} options. As an example, {{{aprun  -n {{mpi_tasks}}  -N {{tasks_per_node}}  palm}}} will be interpreted as {{{aprun  -n 240  -N 24  palm}}} if you call {{{palmrun ... -X240 -T24 ...}}}.  ||no default value  ||
||memory            ||Memory request per MPI process (or CPU core) in MByte. \\ {{{palmrun}}} option {{{-m}}} overwrites this setting.  ||no default value  ||
||module_commands   ||Module command(s) for loading required software / libraries. \\ In case you have a {{{modules}}} package on your system, you can specify here the command(s) to load the specific software / libraries that your PALM run requires, e.g. the compiler, the NetCDF software, the MPI library, etc. Alternatively, you can load the modules from your shell profile (e.g. {{{.bashrc}}}), but then all your PALM runs will use the same settings. An example for a Cray system using FFTW and parallel NetCDF is {{{module load fftw cray-hdf5-parallel cray-netcdf-hdf5parallel}}}. The commands are carried out at the beginning of a batch job, or before PALM is compiled with {{{palmbuild}}}. ||no default value  ||
||login_init_cmd    ||Special commands to be carried out at login or at the start of batch jobs on the remote host. \\ You may specify here a command, e.g. for setting up special system environments in batch jobs. It is carried out as the first command in the batch job.  ||no default value   ||

You may add further variables to this list, which might, e.g., be required for batch directives (see below).
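As an illustration, entries for some of the variables above might read as follows in the configuration file. This is only a sketch: the module and init commands stem from a Cray-XC40 setup, and all values must be adapted to your system:
{{{
# hostfile is generated automatically by palmrun
%hostfile             auto

# MPI command to start PALM (Cray example from the table above)
%execute_command      aprun  -n {{mpi_tasks}}  -N {{tasks_per_node}}  palm

# memory request per MPI process in MByte
%memory               2300

# module commands to load required libraries
%module_commands      module load fftw cray-hdf5-parallel cray-netcdf-hdf5parallel

# special commands to be carried out at the start of batch jobs
%login_init_cmd       module switch craype-ivybridge craype-haswell
}}}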

=== batch job directives ===

If you want {{{palmrun}}} to start PALM in batch mode, you need to add those batch directives to the configuration file that are required by your specific batch system. Add the string {{{BD:}}} at the beginning of each directive. Because of the large variety of batch systems with different syntax, and because many computer centers further modify the directives, we can only give a general example here, which is for an OpenPBS-based batch system used on a Cray-XC40 at HLRN (http://www.hlrn.de).

The batch directives required for this system read:
{{{
BD:#!/bin/bash
BD:#PBS -A {{project_account}}
BD:#PBS -N {{job_id}}
BD:#PBS -l walltime={{cpu_hours}}:{{cpu_minutes}}:{{cpu_seconds}}
BD:#PBS -l nodes={{nodes}}:ppn={{tasks_per_node}}
BD:#PBS -o {{job_protocol_file}}
BD:#PBS -j oe
BD:#PBS -q {{queue}}
}}}
Strings in double curly brackets are interpreted as variables and are replaced by {{{palmrun}}}, based on settings via specific {{{palmrun}}} options or settings in the environment-variable section of the configuration file. From the given batch directives, {{{palmrun}}} generates a batch script (file), also called a batch job, which is then submitted to the batch system. If you want to check this batch script, run {{{palmrun}}} with the additional option {{{-F}}}, which writes the batch script to file {{{jobfile.#####}}} in your current working directory, where {{{#####}}} is a 5-digit random number (which is part of the so-called job-id). The batch job will not be submitted.
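For example, assuming a run identifier {{{abcde}}} and otherwise your usual options, a call like
{{{
palmrun -d abcde ... -X240 -T24 -t 3662 -F
}}}
would only write a file such as {{{jobfile.12345}}} to the current working directory for inspection, without submitting a batch job.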

The following variables are frequently used in batch directives and recognized by {{{palmrun}}} by default:

||='''Variable name''' =||='''Meaning''' =||='''Value''' =||
|-----------
||job_id   ||job name under which you can find the job in the respective job queue  ||generated from {{{palmrun}}} option {{{-d}}} plus a 5-digit random number, separated by a dot, e.g. {{{palmrun -d abcde ...}}} may generate {{{abcde.12345}}}  ||
||cpu_hours \\ cpu_minutes \\ cpu_seconds \\ cputime  ||CPU time requested by the job, split into hours, minutes, and seconds. {{{cputime}}} is the requested time in seconds. ||generated from {{{palmrun}}} option {{{-t}}}; in the above example, {{{palmrun -t 3662 ...}}} will generate {{{1:1:2}}}  ||
||nodes     ||number of compute nodes requested by the job  ||calculated from {{{palmrun}}} options {{{-X}}} (total number of cores to be used), {{{-T}}} (number of MPI tasks to be started on each node), and {{{-O}}} (number of OpenMP threads to be started by each MPI task). {{{nodes}}} is calculated as {{{cores / ( tasks_per_node * threads_per_task)}}}. {{{threads_per_task}}} is one in pure MPI applications. If {{{tasks_per_node * threads_per_task}}} is not an integral divisor of the total number of cores, fewer tasks/threads will run on the last node.  ||
||tasks_per_node   ||number of MPI tasks to be started on each node  ||value of {{{palmrun}}} option {{{-T}}}  ||
||threads_per_task ||number of OpenMP threads to be started by each MPI task  ||value of {{{palmrun}}} option {{{-O}}} (one in pure MPI applications)  ||
||cores            ||total number of cores to be used  ||value of {{{palmrun}}} option {{{-X}}}  ||
||job_protocol_file  ||name of the file to which the batch system writes the job protocol  ||.        ||
||queue            ||batch queue to which the job is submitted  ||.        ||
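As a worked example of how these values are derived from the {{{palmrun}}} options, following the rules given in the table:
{{{
palmrun -d abcde ... -X240 -T24 -t 3662
# cores            = 240                 (option -X)
# tasks_per_node   = 24                  (option -T)
# threads_per_task = 1                   (pure MPI run, no -O given)
# nodes            = 240 / (24 * 1) = 10
# cputime          = 3662 seconds, i.e. walltime 1:1:2
}}}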

The variable {{{project_account}}} is not known to {{{palmrun}}} by default. It needs to be defined in the configuration file by an additional entry like:
{{{
%project_account  nhbk20010
}}}

Instead of using variables in the batch directives, you may give specific values, but with a loss of flexibility.
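For example, the walltime and queue directives from the example above could be hard-coded like this (the queue name is a placeholder); such a job will then always request the same resources, regardless of the {{{palmrun}}} options:
{{{
BD:#PBS -l walltime=12:00:00
BD:#PBS -q testq
}}}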

=== UNIX commands ===