Changes between Version 45 and Version 46 of doc/app/palmrun


Timestamp:
Nov 20, 2018 5:12:16 PM (6 years ago)
Author:
scharf

  • doc/app/palmrun

    v45 v46  
    210210**Note:** The first option {{{-b}}} is required to tell {{{palmrun}}} to create a batch job running on the local computer!
    211211
    212 Before entering the above command, you need to add information to your configuration file. You may edit an existing file (e.g. {{{.palm.config.default}}}) or create a new one (for example by copying the default file to {{{.palm.config.batch}}} and then editing the new file). In general, you cannot use the same configuration file for both interactive and batch jobs, since different settings are required. Let's assume here that you have created a new file {{{.palm.config.batch}}}. Edit this file and add the batch directives required by your batch system. You can find more details in the complete description of the [wiki:doc/app/palm_config configuration file].
     212Before entering the above command, you need to add information to your configuration file. You may edit an existing file (e.g. {{{.palm.config.default}}}) or create a new one (for example by copying the default file to {{{.palm.config.batch}}} and then editing the new file). In general, you cannot use the same configuration file for both interactive and batch jobs, since different settings are required. Let's assume here that you have created a new file {{{.palm.config.batch}}}. Edit this file and add the batch directives required by your batch system. You can find more details in the complete description of the [wiki:doc/app/palm_config#Batchjobdirectives configuration file].
    213213
    214214Now you may start your first batch job by entering
     
    254254If you would like to use this {{{palmrun}}} feature, you need additional/special settings in the configuration file. Furthermore, you need to pre-compile the PALM code for the remote machine using the {{{palmbuild}}} command. The automatic PALM installer cannot be used to install PALM on that machine. You need to do most of the settings manually.
    255255
    256 Furthermore, passwordless ssh/scp access is required from the local computer to the remote computer, as well as from the remote to the local computer. In remote mode, {{{palmrun}}} and {{{palmbuild}}} are heavily using ssh and scp commands, and if you have not established passwordless access, you would need to enter your password several times before the batch job is finally submitted. Moreover, the job protocol file and any output files cannot be transferred back to your local computer because there is no connection to the job which could be used to provide passwords for these transfers (and even if you could, your job may require your input during nighttime while you are sleeping).
     256Furthermore, [wiki:doc/install/passwordless passwordless ssh/scp access] is required from the local computer to the remote computer, as well as from the remote to the local computer. In remote mode, {{{palmrun}}} and {{{palmbuild}}} are heavily using ssh and scp commands, and if you have not established passwordless access, you would need to enter your password several times before the batch job is finally submitted. Moreover, the job protocol file and any output files cannot be transferred back to your local computer because there is no connection to the job which could be used to provide passwords for these transfers (and even if you could, your job may require your input during nighttime while you are sleeping).
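Before submitting a remote job, it can help to verify that key-based login really works without a password prompt. The following is only a minimal sketch and not part of {{{palmrun}}}: the function name {{{check_passwordless}}} and the {{{SSH}}} override variable are hypothetical, and the user name and address are placeholders. It relies on the standard OpenSSH option {{{BatchMode=yes}}}, which makes ssh fail instead of asking for a password. Key-based access itself is commonly set up with {{{ssh-keygen}}} and {{{ssh-copy-id}}}; the linked page remains the authoritative description.

```shell
#!/bin/sh
# Hypothetical helper (not part of PALM): returns 0 if passwordless
# (key-based) ssh login to <user>@<host> works, non-zero otherwise.
# The ssh client can be overridden via the SSH variable (assumption,
# mainly useful for testing); it defaults to the real ssh command.
check_passwordless() {
    user="$1"
    host="$2"
    # BatchMode=yes disables password prompts, so ssh fails instead of asking.
    # ConnectTimeout avoids hanging on an unreachable host.
    "${SSH:-ssh}" -o BatchMode=yes -o ConnectTimeout=10 "${user}@${host}" true
}
```

A call such as {{{check_passwordless your_username_on_the_remote_system 123.45.6.7 && echo "passwordless access ok"}}} tests the local-to-remote direction. Since access is required in both directions, the same check would have to be repeated on the remote host against the local computer.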
    257257
    258258Now, let's start with the configuration file settings for remote batch jobs. For this it would be convenient to create a new configuration file based on the one you already used locally, e.g. by
     
    260260   cp  .palm.config.batch  .palm.config.<remote configuration identifier>
    261261}}}
    262 where {{{<remote configuration identifier>}}} can be any string to identify your remote host. Edit this file and set at minimum the following additional variables:
    263 {{{
    264 %remote_jobcatalog   /home/username/job_queue
    265 %remote_ip           123.45.6.7
    266 %remote_username     your_username_on_the_remote_system
    267 }}}
    268 After the batch directives (lines that start with {{{BD:}}}) put another set of batch directives starting with {{{BDT:}}} that are required to generate a small additional batch job which does no more than transfer the job protocol back to your local system. Since the job protocol file generated by the main job (which is started by {{{palmrun}}}) is not available before the end of that job, the main job has to start another small job at its end, whose only task is to send the job protocol back to the local host. The computing centers normally have special queues for this kind of small job, and you should request the job resources accordingly. Here is an example for a CRAY-XC40 system:
    269 {{{
    270 # BATCH-directives for batch jobs used to send back the jobfile from a remote to a local host
    271 BDT:#!/bin/bash
    272 BDT:#PBS -N job_protocol_transfer
    273 BDT:#PBS -l walltime=00:30:00
    274 BDT:#PBS -l nodes=1:ppn=1
    275 BDT:#PBS -o {{job_transfer_protocol_file}}
    276 BDT:#PBS -j oe
    277 BDT:#PBS -q dataq
    278 }}}
    279 Only a few resources are requested (e.g. 30 minutes of wall-clock time and one core) and the job runs in a special queue {{{dataq}}}. You may need to adjust these settings with respect to your batch system.
    280 
    281 Additional settings for batch jobs on remote hosts can be found in the [wiki:doc/app/palmconfig complete description of the configuration file].
     262where {{{<remote configuration identifier>}}} can be any string to identify your remote host. Edit this file as described [wiki:doc/app/palm_config#Additionaldirectivesforbatchjobsonremotehosts here].
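The copy step above can be wrapped in a small helper that also reminds you which remote settings still need values. This is only a sketch under stated assumptions: the function name {{{make_remote_config}}} and the {{{<set me>}}} placeholder are hypothetical, and the three variable names are the ones listed in the previous revision of this page ({{{remote_jobcatalog}}}, {{{remote_ip}}}, {{{remote_username}}}). The linked configuration file description remains the authoritative list of required settings.

```shell
#!/bin/sh
# Hypothetical helper (not part of PALM): derive a remote configuration
# file from the local batch configuration in the current directory.
# Usage: make_remote_config <remote configuration identifier>
make_remote_config() {
    id="$1"
    src=".palm.config.batch"
    dst=".palm.config.${id}"
    [ -n "$id" ] || { echo "missing identifier" >&2; return 1; }
    [ -f "$src" ] || { echo "$src not found" >&2; return 1; }
    # Refuse to overwrite an existing remote configuration.
    [ ! -e "$dst" ] || { echo "$dst already exists" >&2; return 1; }
    cp "$src" "$dst" || return 1
    # Append a placeholder entry for each variable that is not set yet.
    for var in remote_jobcatalog remote_ip remote_username; do
        grep -q "^%${var}[[:space:]]" "$dst" ||
            printf '%%%s   <set me>\n' "$var" >> "$dst"
    done
}
```

After running, for example, {{{make_remote_config cray}}}, open {{{.palm.config.cray}}} in an editor, replace every {{{<set me>}}} placeholder with the real value, and add the {{{BDT:}}} directives required by your batch system.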
    282263
    283264After setting up the configuration file and before calling {{{palmrun}}}, you need to call the {{{palmbuild}}} command to generate the PALM executable for the remote host:
     
    293274After confirming the {{{palmrun}}} settings by entering {{{y}}}, similar information as for local batch jobs will be output to the terminal. {{{palmrun}}} finally terminates with the message {{{--> palmrun finished}}}. The batch job is now queued on the remote system. After the job has finished, the job protocol will be transferred back to your local computer and put into the folder defined by {{{local_jobcatalog}}}. If this file does not appear, e.g. because the transfer failed, you may find the protocol file on the remote host in the folder defined by {{{remote_jobcatalog}}}. As with batch jobs running on local computers, check the contents of this file carefully. Besides some additional information, it mainly contains the output of the {{{palmrun}}} command as you get it during interactive execution, and in particular it tells you where to find the output files on your local computer.
    294275
    295 The configuration file {{{.palm.iofiles}}} offers special controls for copying INPUT/OUTPUT files, since large PALM-setups (those using large number of grid points) can produce extremely large output files which would require long time for transferring them to your local system and which might have sizes that exceed the capacity of your local discs. See chapter [wiki:doc/palm_iofiles INPUT/OUTPUT files] which explains how to control copying of INPUT/OUTPUT files.
     276**Note:** Large PALM setups (those using a large number of grid points) can produce extremely large output files, which would take a long time to transfer to your local system and might exceed the capacity of your local discs. See chapter [wiki:doc/app/palm_iofiles I/O file connection configuration] which explains how to control copying of INPUT/OUTPUT files.
    296277
    297278