5.0 Installation of the model
This chapter
describes the installation of PALM on a Linux workstation (local host)
and a suitable remote computer, on which the
model runs are to be carried out. The local host is used to
start batch jobs with mrun and to analyze the
results
which are produced by the model on the remote host and sent back to the
local host. Alternatively, mrun
can also be used to start PALM on the local host in interactive mode or
as a batch job (if a queueing system like NQS, PBS, or LoadLeveler is
available).
Requirements
The installation and operation of PALM requires at minimum (on both the
local and the remote host, unless stated otherwise):
- The Korn shell (AT&T ksh or public domain ksh) must be
available under /bin/ksh.
- The NetCDF library, version 3.6.0-p1 or later (for
NetCDF, see www.unidata.ucar.edu).
- A FORTRAN90/95 compiler.
- The Message Passing Interface (MPI), at
least on the remote host, if the parallel version of PALM is to be used.
- On the local host, the revision control system subversion
(see subversion.tigris.org).
This is already included in many Linux distributions (e.g. SuSe).
subversion requires port 3690 to be open for tcp/udp. If
there are firewall restrictions concerning this port, the PALM code
cannot be accessed. The user needs permission to access the PALM
repository. To obtain it, please contact the PALM group
(raasch@muk.uni-hannover.de) and state the username under which you
would like to access the repository. You will then receive a password
which allows access under this name.
- A job queueing system must be available on the remote host.
Currently, mrun can handle
LoadLeveler (IBM-AIX) and NQS/PBS (Linux-Clusters, NEC-SX).
- ssh/scp-connections to and from the remote
host must not be blocked by a firewall.
Currently, mrun is configured
to be used on a limited number of selected machines. These are the SGI-ICE
systems at the computing center HLRN in Hannover (lcsgih) and Berlin (lcsgib),
the IBM-Regatta system at Yonsei University, Seoul (ibms),
the NEC-SX6/8 systems at DKRZ, Hamburg (nech) and RIAM,
Kyushu University, Fukuoka (necriam),
as well as the Linux clusters of IMUK (lcmuk), Tokyo
Institute of Technology (lctit),
and the Bergen Center for Computational Science (lcxt4).
The strings given in brackets are the system names (host identifiers)
under which mrun identifies the different hosts.
You can also use mrun/PALM on other
Linux-Cluster, IBM-AIX, or NEC-SX machines. See below on
how to configure mrun
for other machines. However, these configurations currently (version
3.2a) allow PALM to be run in interactive mode only.
The
examples given in this chapter refer to an
installation of PALM on an IMUK Linux workstation and the SGI-ICE
system of
the HLRN used as remote host. They are just called local and
remote host from now on.
The installation process requires a valid
account on both the local and the remote host.
All hosts (local as well as remote) are
accessed via the secure shell (ssh). The user must establish
passwordless login using the private/public-key mechanism (see e.g. the
HLRN
documentation). To ensure proper function of mrun,
passwordless login must be
established in both directions, from the local to the remote host as
well as from the remote to the local host! Test this by
carrying
out e.g. on the local host:
ssh <username on remote host>@<remote IP-address>
and on the remote host:
ssh <username on local host>@<local IP-address>
In both cases you should not be
prompted for a password. This must be guaranteed before continuing
with the installation process! It must also be guaranteed for all
other remote hosts on which PALM is to run.
Please
note that on many remote hosts, passwordless login must also be
established within the remote host, i.e. from the
remote host to itself. Test this by executing on the remote host: ssh
<remote IP-address>. You should not be prompted
for a password.
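The login tests above can also be scripted. The following is only a sketch, assuming an OpenSSH client: the function name check_passwordless is hypothetical and not part of PALM. The option BatchMode=yes makes ssh fail instead of asking for a password, so a missing key setup shows up as an error rather than as a prompt.

```shell
# Hypothetical helper (not part of PALM): test passwordless login.
# $1: username on the target host, $2: host name or IP address
check_passwordless() {
    # BatchMode=yes disables password prompts; "true" is a no-op remote command.
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$1@$2" true 2>/dev/null; then
        echo "passwordless login to $1@$2: ok"
    else
        echo "passwordless login to $1@$2: FAILED"
        return 1
    fi
}
```

Call it once in each direction, e.g. check_passwordless <username on remote host> <remote IP-address> on the local host, and the corresponding call with the local username and address on the remote host.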
Package Installation
In
the first installation step a
set of directories must be created both on the local and on the
remote host. These directories are:
~/job_queue
~/palm
~/palm/current_version
~/palm/current_version/JOBS
The names of these directories
are freely selectable (except ~/job_queue).
However, new users should use them as suggested, since many
examples in this documentation as well as all example files
assume these settings. The directory ~/palm/current_version
on the local host will be called the working directory from now on.
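As an illustration, the directory tree listed above can be created in one step on each host. This is a minimal sketch; the function name create_palm_dirs and its optional base-directory argument are assumptions made for this example and are not part of the PALM scripts.

```shell
# Hypothetical helper: create the PALM directory tree below a base
# directory (default: $HOME). mkdir -p also creates the parent
# directories palm and palm/current_version.
create_palm_dirs() {
    base=${1:-$HOME}
    mkdir -p "$base/job_queue" \
             "$base/palm/current_version/JOBS"
}
```

Running create_palm_dirs without an argument creates the directories below your home directory, as suggested in the text.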
In the second step
a working copy of a recent version of the PALM software package,
including the source code, scripts, documentation, etc. must
be copied to the working directory (local
host!) by executing the following
commands. Replace <your username> by the name that you
chose to access the repository, and <#> by any of the available
PALM releases, e.g. "3.1c"
(new releases will be announced by email to the PALM mailing list).
cd ~/palm/current_version
svn checkout --username <your username> svn://130.75.105.2/palm/tags/release-<#> trunk
You
will then be prompted for your password. After finishing, the
subdirectory trunk should
appear in your working directory. It contains a number of further
subdirectories which contain e.g. the PALM source code (SOURCE)
and the scripts for running PALM (SCRIPTS).
Alternatively, executing
svn checkout --username <your username> svn://130.75.105.2/palm/tags/release-<#> abcde
will place your working copy in a
directory named abcde instead
of a directory named trunk.
But keep in mind that you will have to adjust several paths given
below, if you do not use the default directory trunk.
Please never touch any file in
your working copy of PALM, unless you know exactly what you
are doing.
You can also get a copy of the
most recent code by executing
svn checkout --username <your username> svn://130.75.105.2/palm/trunk trunk
However,
this version may contain bugs and new features may not be documented. In future PALM releases,
repository access to this most recent version will
probably be restricted to the PALM developers.
Package Configuration
To use the PALM scripts, the PATH variable
has to be extended and the environment variable
PALM_BIN has to be set (on local and remote host)
in the respective profile of the user's default shell (e.g. in .profile,
if ksh is used):
export PATH=$HOME/palm/current_version/trunk/SCRIPTS:$PATH
export PALM_BIN=$HOME/palm/current_version/trunk/SCRIPTS
You may have to log in again in order to activate these settings.
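Whether the settings are active can be checked from a new shell. The function below is only an illustrative sketch; the name check_palm_env is hypothetical. It tests that PALM_BIN is set and that the scripts mrun and mbuild are found via PATH.

```shell
# Hypothetical helper: report missing PALM environment settings.
# Prints one line per problem and returns 1 if any problem was found.
check_palm_env() {
    status=0
    if [ -z "${PALM_BIN:-}" ]; then
        echo "PALM_BIN is not set"
        status=1
    fi
    for script in mrun mbuild; do
        # command -v succeeds only if the script is found via PATH
        if ! command -v "$script" >/dev/null 2>&1; then
            echo "$script not found in PATH"
            status=1
        fi
    done
    return $status
}
```

If the function prints nothing, both settings are in effect in the current shell.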
On the local and on the remote host, some
small helper/utility programs have to be installed, which
are later used by mrun e.g.
for PALM data postprocessing. The installation is done by mbuild. This script
requires a configuration file
.mrun.config, which will also be used by mrun in the
following. A copy has to be put into the working directory under the
name .mrun.config by
cp trunk/SCRIPTS/.mrun.config.default .mrun.config
Besides many other things, this file contains
typical installation parameters
like compiler name, compiler options, etc.
for a set of different (remote) hosts. Please edit this file, uncomment
lines like
#%remote_username <replace by your ... username> <host identifier>
by removing the first hash (#)
character and replace the string "<replace
by ...>" by your username on the respective host
given in the <host identifier>.
You only have to uncomment lines for those hosts on which you intend to
use PALM.
Warning:
When editing the configuration file, please NEVER use the TAB key.
Otherwise, very confusing errors in mrun execution may occur.
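Since TAB characters are hard to spot in an editor, the file can be checked after editing. The following sketch uses only standard grep; the function name find_tabs is hypothetical and not part of the PALM scripts.

```shell
# Hypothetical helper: list all lines of a file that contain a TAB
# character, each prefixed by its line number. The exit status is that
# of grep: 0 if a TAB was found, 1 if the file contains none.
find_tabs() {
    tab=$(printf '\t')
    grep -n "$tab" "$1"
}
```

Executing find_tabs .mrun.config in the working directory should print nothing for a correctly edited file.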
Besides the default configuration file .mrun.config.default, the directory
trunk/SCRIPTS contains additional configuration files
which are already adjusted for special hosts:
.mrun.config.imuk can be used at Hannover University,
.mrun.config.riam can
be used at the Research Institute of Applied Mechanics, Kyushu
University. These files have to be edited in the same way as described
above.
After modifying the configuration file, the
respective executables are generated by executing
mbuild -u -h lcmuk
mbuild -u -h lcsgih
The
second call also copies the PALM scripts (like mrun and mbuild) to the
remote
host.
Pre-Compilation of PALM Code
To avoid the
re-compilation of the complete source code for each model run, PALM
will be pre-compiled once on the remote host, again using the script
mbuild. Due
to the use of
FORTRAN modules in the source code, the subroutines must be compiled
in a certain order. Therefore the so-called make
mechanism
is used (see the respective man-page of the Unix operating system),
requiring a
Makefile,
in which the dependencies are described. This file is found in
subdirectory trunk/SOURCE, where
also the PALM code is stored. The compiled
sources (object
files) are
stored on the remote computer in the default directory
~/palm/current_version/MAKE_DEPOSITORY.
The
pre-compilation for the remote host (here the SGI-ICE system of HLRN)
is
done by
mbuild -h lcsgih
mbuild
will ask several questions,
which must all be
answered "y" by the user. The compiling process will take some time. mbuild transfers
the respective compiler calls to the remote
host where they are carried out interactively. You can follow the
progress at the terminal window, where also error messages
are displayed (hopefully not for this standard installation). By just
entering
mbuild
PALM
will
be (consecutively) pre-compiled for all remote hosts listed in
the configuration file. If you want to compile for the local host only,
please enter
mbuild -h lcmuk
Installation Verification
As a last step,
after the compilation has been finished, the PALM installation has to
be verified. For this
purpose a simple test run is carried out. This once again requires the mrun
configuration file (described in chapter
3.2), as well
as the parameter
file
(described in chapter
4.4.1). The
parameter file must be
copied from the PALM working copy by
mkdir -p JOBS/example_cbl/INPUT
cp trunk/INSTALL/example_cbl_p3d JOBS/example_cbl/INPUT/example_cbl_p3d
The
test run can
now be started by executing the command
mrun -d example_cbl -h lcsgih -K parallel -X 8 -T 8 -t 500 -q testq -r "d3# pr#"
This specific run
will be carried out on 8 PEs and is allowed to use up to 500 seconds
of CPU time. After pressing <return>, the most important
settings of the job are displayed at the terminal window
and the user is prompted for confirmation (“y”).
Next, a message of the queuing system like “Request
…
Submitted to queue… by…” should
be displayed. Now the job is
queued and either started immediately or at a later time, depending on
the current workload of the remote host. Provided that it is executed
immediately and that all things work as designed, the job protocol of
this run will appear under the file name ~/job_queue/lcsgih_example_cbl no
more than a few minutes later. The content of this
file should be carefully examined for any error messages.
Besides the job protocol, and according to
the configuration file and the arguments given for the mrun
options -d and -r, further
files should be found in the directories
~/palm/current_version/JOBS/example_cbl/MONITORING
and
~/palm/current_version/JOBS/example_cbl/OUTPUT
Please compare the contents of file
~/palm/current_version/JOBS/example_cbl/MONITORING/lcsgih_example_cbl_rc
with those of the
example result file which can be found under
trunk/INSTALL/example_cbl_rc, e.g. by using the standard
diff command:
diff JOBS/example_cbl/MONITORING/lcsgih_example_cbl_rc trunk/INSTALL/example_cbl_rc
where
it is assumed that your working directory is
~/palm/current_version.
You should not find any
difference between these two files, except for the run date
and time displayed at the top of the file header. If
the file contents are otherwise identical, the installation is
successfully completed.
Configuration for other machines
Starting
from version 3.2a, besides the default hosts (HLRN, etc.), PALM can also
be installed and run on other Linux-Cluster, IBM-AIX, or
NEC-SX systems. Configuring PALM for a non-default host only requires
adding some lines to the configuration file
.mrun.config.
First,
you have to define the host identifier (a string of arbitrary length)
under which your local host shall be identified by adding a line
%host_identifier <hostname> <host identifier>
to the
configuration file (best to do this in the section where the other
default host identifiers are defined). Here
<hostname> must be the name of your local
host as provided by the unix-command "hostname".
The first characters of
<host identifier> have to be "lc",
if your system is (part of) a linux-cluster, "ibm",
or "nec"
in case of an IBM-AIX- or NEC-SX-system, respectively. For example, if
you want to install on a linux-cluster, the line may read as
%host_identifier foo lc_bar
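The line can also be generated from the output of the hostname command. The following is a sketch only: the function name host_identifier_line and its optional second argument are assumptions made for this example. It checks the required prefix of the host identifier and prints the line, which then still has to be added to .mrun.config by hand.

```shell
# Hypothetical helper: print a %host_identifier line.
# $1: host identifier, which has to start with "lc", "ibm", or "nec"
# $2: hostname (optional, default: output of the hostname command)
host_identifier_line() {
    case "$1" in
        lc*|ibm*|nec*) ;;
        *) echo "host identifier must start with lc, ibm, or nec" >&2
           return 1 ;;
    esac
    name=${2:-$(hostname)}
    printf '%%host_identifier %s %s\n' "$name" "$1"
}
```

For example, host_identifier_line lc_bar foo reproduces the line given above.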
In
the second step, you have to give all information necessary to
compile and run PALM on your local host by adding an additional section
to the configuration file:
%remote_username <1> <host identifier> parallel
%tmp_user_catalog <2> <host identifier> parallel
%compiler_name <3> <host identifier> parallel
%compiler_name_ser <4> <host identifier> parallel
%cpp_options <5> <host identifier> parallel
%netcdf_inc <6> <host identifier> parallel
%netcdf_lib <7> <host identifier> parallel
%fopts <8> <host identifier> parallel
%lopts <9> <host identifier> parallel
The
section consists of four columns each separated by one or more blanks.
The first column gives the name of the respective environment variable
used by mrun
and mbuild,
while the second column defines its value. The third column has to be
the host identifier as defined above, and the last column in each line
must contain the string "parallel".
Otherwise, the respective line(s) will be interpreted as belonging to
the setup for compiling and running a serial (non-parallel) version of
PALM.
All bracketed placeholders have to be replaced by the
appropriate settings for your local host:
- <1> is the username on your LOCAL host
- <2> is the temporary directory in which PALM runs will be
carried out
- <3> is the compiler name which generates parallel code
- <4> is the compiler name for generating serial code
- <5> are the preprocessor options to be invoked. In most cases, it will
be necessary to adjust the MPI data types to double precision by
giving -DMPI_REAL=MPI_DOUBLE_PRECISION
-DMPI_2REAL=MPI_2DOUBLE_PRECISION. To switch on the NetCDF
support, you also have to give -D__netcdf
and -D__netcdf4
(if you would like to have the NetCDF4/HDF5 data format; this requires a NetCDF4 library!).
- <6> is the compiler option for specifying the include path to
search for the NetCDF module/include files
- <7> are the linker options to search for the NetCDF library
- <8> are the general compiler options to be used. You should
always switch on double precision (e.g. -r8)
and code optimization (e.g. -O2).
- <9> are the linker options
- <host identifier> is the host identifier as defined
before
A typical example may be:
%remote_username raasch lc_bar parallel
%tmp_user_catalog /tmp lc_bar parallel
%compiler_name mpif90 lc_bar parallel
%compiler_name_ser ifort lc_bar parallel
%cpp_options -DMPI_REAL=MPI_DOUBLE_PRECISION:-DMPI_2REAL=MPI_2DOUBLE_PRECISION:-D__netcdf lc_bar parallel
%netcdf_inc -I:/usr/local/netcdf/include lc_bar parallel
%netcdf_lib -L/usr/local/netcdf/lib:-lnetcdf lc_bar parallel
%fopts -axW:-cpp:-openmp:-r8:-nbs lc_bar parallel
%lopts -axW:-cpp:-openmp:-r8:-nbs:-Vaxlib lc_bar parallel
Currently (version 3.7a),
depending on the MPI
version which is running on your local host, the options for the
execution command (which may be mpirun
or mpiexec)
may have to be adjusted manually in the mrun script. A future version
will allow the respective settings to be given in the configuration file.
If you have any problems
with the PALM
installation, the members of the PALM working group will be pleased to
help you.
Last
change: $Id: chapter_5.0.html 287 2009-04-09
08:59:36Z raasch $