||linker_options ||Compiler options to be used to link the PALM executable. Typically, these are paths to libraries used by PALM, e.g. NetCDF, FFTW, or MPI. You may repeat the options that you have given with {{{compiler_options}}} here. See your local system documentation / software manuals for the required path settings. Requirements differ from system to system and also depend on the respective libraries that you are using. See [wiki:doc/app/recommended_compiler_options] for specific path settings that we, the PALM group, are using on our computers. Be aware that these settings will probably not work on your computer system. ||no default value ||
||'''hostfile''' ||'''Name of the hostfile that is used by MPI to determine the nodes on which the MPI processes are started.'''\\\\ {{{palmrun}}} automatically generates the hostfile if you set {{{auto}}}. All MPI processes will then be started on the node on which {{{palmrun}}} is executed. The real name of the hostfile will then be set to {{{hostfile}}} (instead of {{{auto}}}) and, depending on your local MPI implementation, you may have to give this name in the {{{execute_command}}}. MPI implementations at large computing centers often do not require a hostfile to be specified explicitly (in such a case you can remove this line from the configuration file), or the batch system provides a hostfile whose name you can access via an environment variable (e.g. {{{$PBS_NODEFILE}}}) and which then needs to be given in the {{{execute_command}}}. Please see your local system / batch system documentation about the hostfile policy on your system. ||no default value ||
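As an illustration, the corresponding lines in a configuration file might look as follows. This is only a sketch: the library paths, the {{{mpirun}}} call, and the executable name are assumptions for a hypothetical Linux cluster and will almost certainly have to be adapted to your system.
{{{
# hypothetical settings - adapt paths, libraries, and the MPI start command to your system
%linker_options    -L/opt/netcdf/lib -lnetcdff -lnetcdf -L/opt/fftw3/lib -lfftw3
%hostfile          auto
%execute_command   mpirun -machinefile hostfile -np {{mpi_tasks}} ./palm
}}}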
Strings in double curly brackets are interpreted as variables and are replaced by {{{palmrun}}} based on settings given via specific {{{palmrun}}} options or in the environment variable section of the configuration file. From the given batch directives, {{{palmrun}}} generates a batch script (file), also called a batch job, which is then submitted to the batch system using the submit command that has been defined by {{{submit_command}}} (see the environment variable section above). If you would like to check the generated batch script, run {{{palmrun}}} with the additional option {{{-F}}}, which writes the batch script to the file {{{jobfile.#####}}} in your current working directory, where {{{#####}}} is a 5-digit random number (which is part of the so-called job id). In this case, the batch job will not be submitted.
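Such a dry run could look like the following call (the run identifier, the host identifier, and all option values are purely illustrative):
{{{
palmrun -d example_run -h mycluster -X 64 -T 32 -t 3600 -q normal -F
}}}
The generated {{{jobfile.#####}}} can then be inspected before repeating the call without {{{-F}}} for a real submission.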
||job_id ||Job name under which you can find the job in the respective job queue ||generated from {{{palmrun}}} option {{{-d}}} plus a 5-digit random number, separated by a dot, e.g. {{{palmrun -d abcde ...}}} may generate {{{abcde.12345}}} ||
||cpu_hours \\ cpu_minutes \\ cpu_seconds \\ cputime ||CPU time requested by the job, split into hours, minutes, and seconds. {{{cputime}}} is the requested time in seconds. ||calculated from {{{palmrun}}} option {{{-t}}}; e.g., in the above example, {{{palmrun -t 3662 ...}}} will generate {{{1:1:2}}} ||
||timestring ||Requested CPU time in the format hh:mm:ss ||calculated from {{{palmrun}}} option {{{-t}}} ||
||tasks_per_node ||Number of MPI tasks to be started on each requested node ||as given by {{{palmrun}}} option {{{-T}}} ||
||threads_per_task ||Number of OpenMP threads to be started by each MPI task ||as given by {{{palmrun}}} option {{{-O}}} ||
||cores ||Total number of cores requested by the job ||as given by {{{palmrun}}} option {{{-X}}} ||
||nodes ||Number of compute nodes requested by the job ||calculated from {{{palmrun}}} options {{{-X}}} (total number of cores to be used), {{{-T}}} (number of MPI tasks to be started on each node), and {{{-O}}} (number of OpenMP threads to be started by each MPI task). {{{nodes}}} is calculated as {{{cores / (tasks_per_node * threads_per_task)}}}. {{{threads_per_task}}} is one in pure MPI applications. If {{{tasks_per_node * threads_per_task}}} is not an integral divisor of the total number of cores, fewer tasks/threads will run on the last node (see the worked example below this table). ||
||mpi_tasks ||Total number of MPI tasks to be started ||calculated as {{{cores / threads_per_task}}} ||
||job_protocol_file ||Name of the file (including path) to which the job protocol is written ||generated from {{{palmrun}}} options {{{-h}}} and {{{-d}}} and the path set by the environment variable {{{local_jobcatalog}}}. As an example, if {{{local_jobcatalog = /home/user/job_queue}}}, the call {{{palmrun -d testrun -h mycluster ....}}} gives the job protocol file {{{/home/user/job_queue/mycluster_testrun}}}. ||
||memory ||Requested memory in MByte ||as given by {{{palmrun}}} option {{{-m}}} or as set in the configuration file via {{{%memory}}}. The option overrides the setting in the configuration file. ||
||queue ||Batch queue to which the job is submitted ||as given by {{{palmrun}}} option {{{-q}}}. If the option is omitted, a default queue defined by variable {{{default_queue}}} is used. ||
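As a worked example for the calculation of {{{nodes}}} and {{{mpi_tasks}}} (the option values are purely illustrative):
{{{
palmrun ... -X 128 -T 32 -O 2 ...
# cores            = 128
# tasks_per_node   = 32
# threads_per_task = 2
# mpi_tasks        = cores / threads_per_task                    = 128 / 2        = 64
# nodes            = cores / (tasks_per_node * threads_per_task) = 128 / (32 * 2) = 2
}}}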

=== Additional directives for batch jobs on remote hosts ===

If {{{palmrun}}} is used in remote batch mode, i.e. the batch job is submitted from your local computer to a remote computer, additional batch job directives are required to guarantee that the job protocol file is sent back to your local computer after the batch job has finished on the remote system. Since the job protocol file is often only available after the job has finished, a small additional job is started at the end of the batch job, whose only purpose is to transfer the job protocol from the remote to the local system. Batch directives for this transfer job are given in the configuration file as well; add the string {{{BDT:}}} at the beginning of each of these directives. As for the main job directives (which start with {{{BD:}}}), we can only give a general example here, again for an OpenPBS-based batch system.
{{{
BDT:#!/bin/bash
BDT:#PBS -A {{project_account}}
BDT:#PBS -N job_protocol_transfer
BDT:#PBS -l walltime=00:30:00
BDT:#PBS -l nodes=1:ppn=1
BDT:#PBS -o {{job_transfer_protocol_file}}
BDT:#PBS -j oe
BDT:#PBS -q dataq
}}}
Keep in mind to request only a few resources, because this job just carries out a file transfer via {{{scp}}}. Computing centers often offer a special queue for this kind of job ({{{dataq}}} in the above example). The variable {{{job_transfer_protocol_file}}} is determined by {{{palmrun}}}. If you did not receive the job protocol, you may look into the protocol file of this transfer job. You should find it under the name {{{last_job_transfer_protocol}}} on the remote host in the directory defined by {{{remote_jobcatalog}}}. A new job overwrites the transfer protocol of a previous job.
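If you want to check that file from your local computer, a call like the following can be used (the user name, host name, and directory are hypothetical; use your own account and the directory that you have set via {{{remote_jobcatalog}}}):
{{{
# hypothetical example - assumes remote_jobcatalog = /home/user/job_queue on the remote host
ssh user@remote.host "cat /home/user/job_queue/last_job_transfer_protocol"
}}}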