Changes between Version 28 and Version 29 of doc/app/palm_config


Timestamp:
Nov 22, 2018 9:44:36 AM (6 years ago)
Author:
kanani
Comment:

--

  • doc/app/palm_config

    v28 v29  
    5555* {{{OC:}}} defines Unix commands that are executed by {{{palmrun}}} just after the PALM code has stopped. For example, you can have an email sent to you when the program terminates:
    5656{{{
    57 OC:echo "PALM simulation $run_identifier has finished" | mail  username@email-address
     57OC:echo "PALM simulation $run_identifier has finished" | mailx  username@email-address
    5858}}}
    5959
     
    102102|| [=#us_so user_source_path]  || Path to the [wiki:doc/app/userint user interface routines]. The variable {{{run_identifier}}} that may be used in the default path is replaced by the argument given with {{{palmrun}}}-option {{{-r}}}.
    103103
    104 
    105 
    106 
    107 
    108104You may add further variables to this list, which might e.g. be required for batch directives (see below).
    109105
     
    111107=== Batch job directives ===
    112108
    113 If you like {{{palmrun}}} to start PALM in batch mode, you need to add those batch directives to the configuration file that are required by your specific batch system. Add the string {{{BD:}}} at the beginning of each directive. Because of a large variety of batch systems with different syntax, and because many computer centers further modify the directives, we can only give a general example here, which is for an OpenPBS based batch system used on a Cray-XC40 at HLRN (http://www.hlrn.de).
     109If you want to run PALM in [wiki:doc/app/palmrun#batch batch mode], you need to add the batch directives required by your specific batch system to the configuration file. Add the string {{{BD:}}} at the beginning of each directive. Because of the large variety of batch systems with different syntax, and because many computer centers further modify the directives, we can only give a general example here, which is for an OpenPBS-based batch system used on a Cray-XC40 at HLRN (http://www.hlrn.de).
    114110
    115111Batch directives required for this system read
     
    117113BD:#!/bin/bash
    118114BD:#PBS -A {{project_account}}
    119 BD:#PBS -N {{job_id}}
     115BD:#PBS -N {{run_id}}
    120116BD:#PBS -l walltime={{cpu_hours}}:{{cpu_minutes}}:{{cpu_seconds}}
    121117BD:#PBS -l nodes={{nodes}}:ppn={{tasks_per_node}}
     
    124120BD:#PBS -q {{queue}}
    125121}}}
    126 Strings in double curly brackets are interpreted as variables and are replaced by {{{palmrun}}} based on settings via specific {{{palmrun}}} options or settings in the environment variable section of the configurations file. From the given batch directives, {{{palmrun}}} generates a batch script (file), also called batch job, which is then submitted to the batch system using the submit command that has been defined by {{{submit_command}}} (see environment variable section above). If you like to check the generated batch script, then run {{{palmrun}}} with additional option {{{-F}}}, which will write the batch script to file {{{jobfile.#####}}} in your current working directory, where {{{#####}}} is a 5-digit random number (which is part of the so-called job-id). A batch job will not be submitted.
     122Strings in double curly brackets are interpreted as variables and are replaced by {{{palmrun}}} based on settings via specific {{{palmrun}}} options or settings in the environment variable section of the configuration file. From the given batch directives, {{{palmrun}}} generates a batch script (file), also called batch job, which is then submitted to the batch system using the submit command that has been defined by the variable [#su_co submit_command]. If you want to check the generated batch script, run {{{palmrun}}} with the additional option {{{-F}}}, which writes the batch script to the file {{{jobfile.#####}}} in your current working directory, where {{{#####}}} is a 5-digit random number (which is part of the so-called {{{run_id}}}). In that case, no batch job is submitted.
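
The replacement of variables in double curly brackets can be pictured with the following minimal Python sketch. It is not the code used by {{{palmrun}}}; the function name {{{fill_directives}}} and the example values are hypothetical, and leaving unknown variables untouched is an assumption, since the behaviour of {{{palmrun}}} in that case is not described here.

```python
import re

# Matches {{name}} placeholders as they appear in the BD: lines.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_directives(lines, values):
    """Strip the BD: prefix and replace {{name}} placeholders by their values.

    Lines that do not start with BD: are skipped. Placeholders without a
    value are left unchanged (an assumption made for this sketch).
    """
    filled = []
    for line in lines:
        if not line.startswith("BD:"):
            continue
        body = line[len("BD:"):]
        filled.append(
            PLACEHOLDER.sub(lambda m: str(values.get(m.group(1), m.group(0))), body)
        )
    return filled


# Example directives taken from the OpenPBS example; the values are made up.
directives = [
    "BD:#!/bin/bash",
    "BD:#PBS -N {{run_id}}",
    "BD:#PBS -l walltime={{cpu_hours}}:{{cpu_minutes}}:{{cpu_seconds}}",
    "BD:#PBS -l nodes={{nodes}}:ppn={{tasks_per_node}}",
]
values = {
    "run_id": "abcde.12345",
    "cpu_hours": 1,
    "cpu_minutes": 1,
    "cpu_seconds": 2,
    "nodes": 2,
    "tasks_per_node": 24,
}

print("\n".join(fill_directives(directives, values)))
```

The printed lines correspond to what one would expect to find in the file written with option {{{-F}}}, apart from anything else {{{palmrun}}} adds to the script.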
    127123
    128124In addition to the batch directives, the configuration file requires further information to be set for using the batch system, which is done by adding / modifying variable assignments. A minimum set of variables to be added / modified:
     
    140136Given values are just examples! The automatic installer may have already included these variable settings as comment lines (starting with {{{#}}}). Then just remove the {{{#}}} and provide a proper value.
    141137
    142 
    143138The following variables are frequently used in batch directives and recognized by {{{palmrun}}} by default:
    144139
    145 ||='''Variable name''' =||='''meaning''' =||='''value''' =||
     140||='''Variable name''' =||='''Meaning''' =||='''Value''' =||
    146141|-----------
    147 ||project_account    ||Account number under which the batch job shall run.  ||argument of {{{palmrun}}} option {{-A}}}  ||
    148 ||job_id             ||Job name under which you can find the job in the respective job queue  ||argument of {{{palmrun}}} option {{{-r}}} plus a 5-digit random number, separated by a dot, e.g. {{{palmrun -r abcde ...}}} may generate {{{abcde.12345}}}  ||
    149 ||cpu_hours \\ cpu_minutes \\ cpu_seconds \\ cputime  ||cpu time requested by the job split in hours, minutes and seconds. {{{cputime}}} is the requested time in seconds.          ||Calculated from {{{palmrun}}} option {{{-t}}}, e.g. in the above example, {{{palmrun -t 3662 ...}}} will generate {{{1:1:2}}}  ||
    150 ||timestring         ||Requested CPU time in format hh:mm:ss  ||calculated from {{{palmrun}}} option {{{-t}}}  ||
    151 ||tasks_per_node     ||Number of MPI tasks to be started on each requested node  ||as given by {{{palmrun}}} option {{{-T}}}  ||
    152 ||threads_per_task   ||Number of OpenMP threads to be started by each MPI task  ||as given by {{{palmrun}}} option {{{-O}}}  ||
    153 ||cores              ||Total number of cores requested by the job  ||as given by {{{palmrun}}} option {{{-X}}} ||
     142||cores              ||Total number of cores requested by the job   ||as given by {{{palmrun}}} option {{{-X}}} ||
     143||cpu_hours \\ cpu_minutes \\ cpu_seconds \\ cputime                ||cpu time requested by the job split in hours, minutes and seconds. {{{cputime}}} is the requested time in seconds.          ||Calculated from {{{palmrun}}} option {{{-t}}}, e.g. in the above example, {{{palmrun -t 3662 ...}}} will generate {{{1:1:2}}}       ||
     144||job_protocol_file  ||Name of the file (including path) to which the job protocol is written   ||generated from {{{palmrun}}} options {{{-c}}} and {{{-r}}} and the path set by environment variable {{{local_jobcatalog}}}. As an example, if {{{local_jobcatalog = /home/user/job_queue}}}, the call of {{{palmrun -r testrun -c mycluster ....}}} gives a job protocol file {{{/home/user/job_queue/mycluster_testrun}}}.    ||
     145||memory             ||Requested memory in MByte  ||as given by {{{palmrun}}} option {{{-m}}} or as set in the configuration file via {{{%memory}}}. Option overwrites the setting in the configuration file.  ||
     146||mpi_tasks          ||Total number of MPI tasks to be started  ||calculated as {{{cores / threads_per_task}}}  ||
    154147||nodes              ||Number of compute nodes requested by the job  ||calculated from {{{palmrun}}} options {{{-X}}} (total number of cores to be used), {{{-T}}} (number of MPI tasks to be started on each node), and {{{-O}}} (number of OpenMP threads to be started by each MPI task). {{{nodes}}} is calculated as {{{cores / ( tasks_per_node * threads_per_task)}}}. {{{threads_per_task}}} is one in pure MPI applications. If {{{tasks_per_node * threads_per_task}}} is not an integral divisor of the total number of cores, fewer tasks/threads will run on the last node.  ||
    155 ||mpi_tasks          ||Total number of MPI tasks to be started  ||calculated as {{{cores / threads_per_task}}}  ||
    156 ||job_protocol_file  ||Name of the file (including path) to which the job protocol is written   ||generated from {{{palmrun}}} options {{{-c}}} and {{{-r}}} and the path set by environment variable {{{local_jobcatalog}}}. As an example, if {{{local_jobcatalog = /home/user/job_queue}}}, the call of {{{palmrun -r testrun -c mycluster ....}}} gives a job protocol file {{{home/user/job_queue/mycluster_testrun}}}.  ||
    157 ||memory             ||Requested memory in MByte  ||as given by {{{palmrun}}} option {{{-m}}} or as set in the configuration file via {{{%memory}}}. Option overwrites the setting in the configuration file.  ||
    158 ||queue              ||Batch queue to which the job is submitted  ||as given by {{{palmrun}}} option {{{-q}}}. If the option is omitted, a default queue defined by variable {{{default_queue}}} is used.  ||
    159 ||previous_job       ||Job-id of a previous job   ||as given with {{{palmrun}}} option {{{-W}}}. Can be used to define job dependencies. The job-id should be the one that has been assigned by the batch system to the (previous) job.  ||
     148||previous_job       ||run_id of a previous job   ||as given with {{{palmrun}}} option {{{-W}}}. Can be used to define job dependencies. The run_id should be the one that has been assigned by the batch system to the (previous) job.  ||
     149||project_account    ||Account number under which the batch job shall run.  ||argument of {{{palmrun}}} option {{{-A}}}  ||
     150||queue              ||Batch queue to which the job is submitted ||as given by {{{palmrun}}} option {{{-q}}}. If the option is omitted, a default queue defined by variable {{{default_queue}}} is used.  ||
     151||run_id             ||Job name under which you can find the job in the respective job queue ||argument of {{{palmrun}}} option {{{-r}}} plus a 5-digit random number, separated by a dot, e.g. {{{palmrun -r abcde ...}}} may generate {{{abcde.12345}}}  ||
     152||tasks_per_node     ||Number of MPI tasks to be started on each requested node ||as given by {{{palmrun}}} option {{{-T}}}  ||
     153||threads_per_task   ||Number of OpenMP threads to be started by each MPI task ||as given by {{{palmrun}}} option {{{-O}}} ||
     154||timestring         ||Requested CPU time in format hh:mm:ss ||calculated from {{{palmrun}}} option {{{-t}}}  ||
     155
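
The arithmetic behind the time and node variables in the table above can be sketched in Python as follows. The function names are hypothetical and this is not the {{{palmrun}}} implementation. Rounding the node count up is an assumption, inferred from the remark that fewer tasks/threads run on the last node when the division leaves a remainder.

```python
def split_cputime(cputime):
    """Split a time in seconds (option -t) into hours, minutes and seconds."""
    hours, rest = divmod(cputime, 3600)
    minutes, seconds = divmod(rest, 60)
    return hours, minutes, seconds


def derive_layout(cores, tasks_per_node, threads_per_task=1):
    """Return (mpi_tasks, nodes) from options -X, -T and -O.

    mpi_tasks = cores / threads_per_task
    nodes     = cores / (tasks_per_node * threads_per_task), rounded up
                (the rounding is an assumption of this sketch)
    """
    mpi_tasks = cores // threads_per_task
    cores_per_node = tasks_per_node * threads_per_task
    nodes = -(-cores // cores_per_node)  # ceiling division on integers
    return mpi_tasks, nodes


# palmrun -t 3662 ... from the table: 1 hour, 1 minute, 2 seconds
print(":".join(str(part) for part in split_cputime(3662)))

# 48 cores, 12 MPI tasks per node, 2 OpenMP threads per task
print(derive_layout(48, 12, 2))
```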
    160156
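How {{{run_id}}} and {{{job_protocol_file}}} are composed can be illustrated by the following Python sketch. The function and parameter names are hypothetical, and drawing the random number from the range 10000 to 99999 is an assumption; the table only states that a 5-digit random number is appended.

```python
import random
from pathlib import PurePosixPath


def make_run_id(run_identifier):
    """Append a dot and a 5-digit random number to the argument of option -r."""
    # Assumption: no leading zeros, so the number always has five digits.
    return f"{run_identifier}.{random.randint(10000, 99999)}"


def job_protocol_file(local_jobcatalog, configuration, run_identifier):
    """Build the protocol file path from local_jobcatalog and options -c and -r."""
    return str(PurePosixPath(local_jobcatalog) / f"{configuration}_{run_identifier}")


# palmrun -r abcde ... may give something like abcde.12345
print(make_run_id("abcde"))

# palmrun -r testrun -c mycluster ... with local_jobcatalog = /home/user/job_queue
print(job_protocol_file("/home/user/job_queue", "mycluster", "testrun"))
```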
    161157Instead of using variables in the batch directives, you may give specific values, but with a loss of flexibility.
     
    164160=== Additional directives for batch jobs on remote hosts ===
    165161
    166 If {{{palmrun}}} is used in remote batch mode, i.e. the batch job is submitted from your local computer to a remote computer, additional batch job directives are required to guarantee that the job protocol file is sent back to your local computer after the batch job has finished on the remote system. Since the job protocol file is often only available after the job has finished, a small additional job is started at the end of the batch job, which only purpose is to transfer the job protocol from the remote to the local system. Batch directives for this job are given in the configuration file too. Add the string {{{BDT:}}} at the beginning of each directive. As for the main job directives (that start with {{{BD:}}}), we can only give a general example here, which is again for an OpenPBS based batch system.
     162If {{{palmrun}}} is used in remote batch mode, i.e. the batch job is submitted from your local computer to a remote computer, additional batch job directives are required to guarantee that the job protocol file is sent back to your local computer after the batch job has finished on the remote system. Since the job protocol file is often only available after the job has finished, a small additional job is started at the end of the batch job, whose only purpose is to transfer the job protocol from the remote to the local system. Batch directives for this job are given in the configuration file too. Add the string {{{BDT:}}} at the beginning of each directive. As for the main job directives (that start with {{{BD:}}}), we can only give a general example here, which is again for an OpenPBS-based batch system.
    167163{{{
    168164BDT:#!/bin/bash