Changes between Version 9 and Version 10 of doc/install/advanced
- Timestamp: Sep 5, 2018 3:09:10 PM
doc/install/advanced
v9 v10
7 7 '''Before you start, please check if you have fulfilled all [wiki:doc/install installation requirements!]'''
8 8
9 === [=#package_installation]First step: Package installation ===
9 The [wiki:doc/install/automatic automatic installer] normally takes care of steps 1-5 that are described below. Failure of the automatic installation process is usually caused by inconsistencies in your software environment (e.g. mismatches between your compiler and the NetCDF and MPI libraries), which will also cause the manual installation to fail. Nevertheless, you may need to carry out at least some of the installation steps manually. For example, if your system has a very strict firewall and does not allow downloads from our repository, you may carry out the download (second step below) on a different system and copy the {{{trunk}}} folder to your target system, before carrying on with the [wiki:doc/install/automatic automatic installer].
10
11 Installation and configuration for batch jobs cannot be done by the [wiki:doc/install/automatic automatic installer] and requires manual work in any case, as described further below.
12
13 == [=#package_installation]First step: Package installation ==
10 14
11 15 The '''first installation step''' requires creating a set of directories on the local and, for the advanced method, on the remote host. These are:
… …
32 36
33 37
34 === [=#package_configuration]Package configuration ===
38 == [=#package_configuration]Third step: Package configuration ==
35 39
36 40 Compilation and execution of PALM are mainly controlled by two shell scripts named [wiki:doc/app/palmbuild {{{palmbuild}}}] and [wiki:doc/app/palmrun {{{palmrun}}}] that are part of the download and reside in folder {{{.../trunk/SCRIPTS}}}.
To use these scripts, you need to extend your {{{PATH}}}-variable, by adding a line
… …
59 63
60 64
61 === [=#package_configuration]Compiling the PALM sources ===
65 == [=#package_configuration]Fourth step: Compiling the PALM sources ==
62 66
63 67
… …
73 77
74 78
75 === [=#verification]Installation verification ===
79 == [=#verification]Fifth step: Installation verification ==
76 80
77 As a last step, after the compilation has been finished, the PALM installation has to be verified. For this purpose a simple test run is carried out. This once again requires the '''mrun''' [[wiki:doc/app/configexample|configuration file]], as well as the [[wiki:doc/app/par|parameter file]]. The parameter file must be copied from the PALM working copy by
81 As a last step, after the compilation has been finished, the PALM installation has to be verified. For this purpose a simple test run needs to be started using the script {{{palmrun}}}. In addition to the configuration file, {{{palmrun}}} requires a [wiki:doc/app/par parameter file] as well. The parameter file for the test case is provided as part of the download and needs to be copied first:
78 82 {{{
83 cd ~/palm/current_version
79 84 mkdir -p JOBS/example_cbl/INPUT
80 85 cp trunk/INSTALL/example_cbl_p3d JOBS/example_cbl/INPUT/example_cbl_p3d
81 86 }}}
82 The test run can now be started by executing the command
87 Here, the string {{{example_cbl}}} acts as the so-called ''run identifier''.
88 The test run can now be started by entering
83 89 {{{
84 mrun -d example_cbl -h lccrayh -K parallel -X 8 -T 8 -t 500 -q mpp1testq -r "d3# pr#"
90 palmrun -d example_cbl -h default -X4 -a "d3#"
85 91 }}}
86 This specific run will be carried out on 8 PEs and is allowed to use up to 500 seconds CPU time. After pressing <return>, the most important settings of the job are displayed at the terminal window and the user is prompted for o.k. ("{{{y}}}").
Next, a message of the queuing system like "''Request … Submitted to queue… by…''" should be displayed. Now the job is queued and either started immediately or at a later time, depending on the current workload of the remote host. Provided that it is executed immediately and that all things work as designed, the job protocol of this run will appear under the file name {{{~/job_queue/lccrayh_example}}} no more than a few minutes later. The content of this file should be carefully examined for any error messages.\\\\
87 Beside the job protocol and according to the configuration file and arguments given for 'mrun' options {{{-d}}} and {{{-r}}}, further files should be found in the directories
92 See the [wiki:doc/app/palmrun palmrun description] for detailed explanations of available options. This specific run will be carried out on 4 cores (if available on your system; otherwise you may need to adjust the {{{-X}}} option). The most important settings of this run are displayed in the terminal window, and you are prompted to confirm ("{{{y}}}") to continue. Information about the progress of the simulation is written to the terminal. After {{{palmrun}}} has finished, you should find some result files in folder {{{JOBS/example_cbl/MONITORING}}}. Please compare the contents of file
88 93 {{{
89 ~/palm/current_version/JOBS/example_cbl/MONITORING
94 ~/palm/current_version/JOBS/example_cbl/MONITORING/example_cbl_rc
90 95 }}}
91 and
96 with those of the example result file that is provided under {{{trunk/INSTALL/example_cbl_rc}}}, e.g. by using the standard {{{diff}}} command
92 97 {{{
93 ~/palm/current_version/JOBS/example_cbl/OUTPUT
98 cd ~/palm/current_version
99 diff JOBS/example_cbl/MONITORING/example_cbl_rc trunk/INSTALL/example_cbl_rc
94 100 }}}
95 Please compare the contents of file
96 {{{
97 ~/palm/current_version/JOBS/example_cbl/MONITORING/lccrayh_example_rc
98 }}}
99 with those of the example result file which can be found under {{{trunk/INSTALL/example_cbl_rc}}}, e.g.
by using the standard {{{diff}}} command
100 {{{
101 diff JOBS/example_cbl/MONITORING/lccrayh_example_cbl_rc trunk/INSTALL/example_cbl_rc
102 }}}
103 where it is assumed that your working directory is {{{~/palm/current_version}}}.\\\\
104 101 '''You should not find any difference between these two files''', except for the run date and time displayed at the top of the file header. If the file contents are identical, the installation is successfully completed.\\\\
105 102
106 103
107 104
105 == Installation for running PALM in batch mode ==
106
107 === Installation for batch jobs on the local machine ===
108
109 Running PALM in batch mode on your local computer (which requires that a batch system is running on the computer where you are logged in) requires adding appropriate batch directives to the configuration file as well as settings for variables like {{{local_jobcatalog}}}, {{{defaultqueue}}}, {{{memory}}}, and {{{submit_command}}}. Settings for {{{module_commands}}} and {{{login_init_cmd}}} may be needed too. See the [wiki:doc/app/palm_config configuration file description] for further details. In order to run PALM in batch mode, the installation process is the same as described above, but the {{{palmrun}}} call requires additional options and may look like this
110 {{{
111 palmrun -d example_cbl -h default -X4 -T4 -t200 -m1000 -a "d3#" -q testqueue -b
112 }}}
113 The {{{-b}}} option is essential to tell {{{palmrun}}} to generate and submit a batch job. Otherwise, it will try to execute PALM interactively in your terminal session. Again, result files for verifying the installation can be found in folder {{{JOBS/example_cbl/MONITORING}}}, after the batch job has been executed. The protocol file of the batch job, which is typically created by every batch system, can be found in the folder that has been set by {{{local_jobcatalog}}} under the name {{{<configuration identifier>_<run identifier>}}}, which is {{{default_example_cbl}}} in the given example.
Further information about running PALM in batch mode on local machines can be found in the [wiki:doc/app/palmrun_quickstart palmrun quickstart guide].
114
115
116 === Batch jobs on a remote machine ===
117
118 Follow the installation steps described above. In addition to the settings for a local batch job, installation of PALM for running batch jobs on remote machines requires further entries in the configuration file; at least the variables {{{remote_ip}}}, {{{remote_username}}}, and {{{remote_jobcatalog}}} need to be set. For further information see the [wiki:doc/app/palmrun_quickstart palmrun quickstart guide] and the [wiki:doc/app/palmrun palmrun documentation]. Assuming a configuration file {{{.palm.config.remote_system}}}, compiling the PALM sources via
119 {{{
120 palmbuild -h remote_system
121 }}}
122 copies the PALM sources via {{{scp}}} from your local computer to the remote system and invokes the remote compiler using {{{ssh}}}. The binaries will be put in folder {{{$HOME/palm/current_version/MAKE_DEPOSITORY_remote_system}}} on the remote system.
123
124 For using {{{palmrun}}}, additional batch directives have to be added to the configuration file in order to transfer back the job protocol file (see the [wiki:doc/app/palmrun_quickstart palmrun quickstart guide] for further details). The {{{palmrun}}} command for generating the test run then reads
125 {{{
126 palmrun -d example_cbl -h remote_system -X4 -T4 -t200 -m1000 -a "d3#" -q testqueue
127 }}}
128 {{{palmrun}}} transfers back the result file via {{{scp}}} and you should find it on your local system in folder {{{JOBS/example_cbl/MONITORING}}} under the name {{{remote_system_example_cbl_rc}}} after the job on the remote system has finished. The job protocol file will also be copied to the folder that has been set by {{{local_jobcatalog}}}.
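The remote workflow above depends on {{{scp}}} and {{{ssh}}} working without any password prompt. As a minimal sketch (not part of the PALM scripts), the following shell function checks this non-interactively. The function name {{{check_passwordless}}} and the ten-second timeout are assumptions chosen for illustration; {{{BatchMode}}} and {{{ConnectTimeout}}} are standard OpenSSH client options.

```shell
# Hypothetical helper, not part of the PALM scripts.
# Returns 0 if an ssh login to "$1" (user@host) succeeds without any prompt.
check_passwordless() {
    # BatchMode=yes makes ssh fail instead of asking for a password;
    # ConnectTimeout=10 (an assumed value) avoids hanging on unreachable hosts.
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}
```

Calling {{{check_passwordless <username on remote host>@<remote IP-address>}}} on the local host should finish with exit status 0 if passwordless login is set up; a non-zero status suggests that key-based login still needs to be configured.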
129
130 Using {{{palmbuild}}} and {{{palmrun}}} for installing and running PALM on remote machines requires passwordless login via {{{scp}}} and {{{ssh}}}, as described in the next section.
108 131
109 132 === Passwordless login via ssh ===
110 133
111 All hosts (local as well as remote) are accessed via the secure shell (ssh). The user must establish passwordless login using the [[wiki:/doc/install/passwordless|private/public-key mechanism]] (HLRNIII users please see [[wiki:/doc/app/machine/hlrnIII|hints]]). '''To ensure proper function of mrun, passwordless login must be established in both directions, from the local to the remote host as well as from the remote to the local host! '''Test this by carrying out e.g. on the local host:
134 All hosts (local as well as remote) are accessed via the secure shell (ssh). The user must establish passwordless login using the [[wiki:/doc/install/passwordless|private/public-key mechanism]] (HLRNIII users please see [[wiki:/doc/app/machine/hlrnIII|hints]]). '''To ensure proper function of {{{palmbuild}}} and {{{palmrun}}}, passwordless login must be established in '''both directions''', from the local to the remote host as well as from the remote to the local host! '''Test this by carrying out e.g. on the local host:
112 135 {{{
113 136 ssh <username on remote host>@<remote IP-address>
… …
117 140 ssh <username on local host>@<local IP-address>
118 141 }}}
119 In both cases you should not be prompted for a password. '''Before continuing the further installation process, this must be absolutely guaranteed! '''It must also be guaranteed for '''all''' other remote hosts, on which PALM shall run.\\\\
120 Please note that on many remote hosts, passwordless login must also work '''within the remote host''', i.e. for ssh connections from the remote host to itself. Test this by executing on the remote host:
142 In both cases you should not be prompted for a password.
'''Before starting with the installation process, this should be absolutely guaranteed! '''It must also be guaranteed for '''all''' other remote hosts, on which PALM shall run.\\\\
143 Please note that on many remote hosts, passwordless login must also work '''within the remote host''', i.e. for {{{ssh}}} connections from the remote host to itself (e.g. for connections from compute nodes to login nodes). Test this by executing on the remote host:
121 144 {{{
122 145 ssh <username on remote host>@<remote IP-address>
… …
124 147 You should not be prompted for a password.\\\\
125 148
126 === [=#other_machines]Configuration for other machines ===
127
128 Starting from version 3.2a, beside the default hosts (HLRN, etc.), PALM can also be installed and run on other Linux-Cluster-, IBM-AIX, or NEC-SX-systems. To configure PALM for a non-default host only requires to add some lines to the configuration file {{{.mrun.config}}}.\\\\
129 First, you have to define the host identifier (a string of arbitrary length) under which your local host shall be identified by adding a line
130 {{{
131 %host_identifier <hostname> <host identifier>
132 }}}
133 to the configuration file (best to do this in the section where the other default host identifiers are defined). Here {{{<hostname>}}} must be the name of your local host as provided by the unix-command "{{{hostname}}}". The first characters of {{{<host identifier>}}} have to be "{{{lc}}}", if your system is (part of) a linux-cluster, "{{{ibm}}}", or "{{{nec}}}" in case of an IBM-AIX- or NEC-SX-system, respectively.
For example, if you want to install on a linux-cluster, the line may read as
134 {{{
135 %host_identifier foo lc_bar
136 }}}
137 In the second step, you have to give all information necessary to compile and run PALM on your local host by adding an additional section to the configuration file:
138 {{{
139 %remote_username <1> <host identifier> parallel
140 %tmp_user_catalog <2> <host identifier> parallel
141 %compiler_name <3> <host identifier> parallel
142 %compiler_name_ser <4> <host identifier> parallel
143 %cpp_options <5> <host identifier> parallel
144 %netcdf_inc <6> <host identifier> parallel
145 %netcdf_lib <7> <host identifier> parallel
146 %fopts <8> <host identifier> parallel
147 %lopts <9> <host identifier> parallel
148 }}}
149 The section consists of four columns each separated by one or more blanks. The first column gives the name of the respective environment variable used by '''mrun''' and '''mbuild''', while the second column defines its value. The third column has to be the host identifier as defined above, and the last column in each line must contain the string "{{{parallel}}}". Otherwise, the respective line(s) will be interpreted as belonging to the setup for compiling and running a serial (non-parallel) version of PALM.\\\\
150 All brackets have to be replaced by the appropriate settings for your local host:
151
152 * {{{<1>}}} is the username on your LOCAL host
153 * {{{<2>}}} is the temporary directory in which PALM runs will be carried out
154 * {{{<3>}}} is the compiler name which generates parallel code
155 * {{{<4>}}} is the compiler name for generating serial code
156 * {{{<5>}}} are the preprocessor options to be invoked. In most of the cases, it will be necessary to adjust the MPI data types to double precision by giving {{{-DMPI_REAL=MPI_DOUBLE_PRECISION -DMPI_2REAL=MPI_2DOUBLE_PRECISION}}}.
To switch on the netCDF support, you also have to give {{{-D__netcdf}}} and {{{-D__netcdf4}}} (if you like to have netCDF4/HDF5 data format; this requires a netCDF4 library!).
157 * {{{<6>}}} is the compiler option for specifying the include path to search for the netCDF module/include files
158 * {{{<7>}}} are the linker options to search for the netCDF library
159 * {{{<8>}}} are the general compiler options to be used. You should always switch on double precision (e.g. {{{-r8}}}) and code optimization (e.g. {{{-O2}}}).
160 * {{{<9>}}} are the linker options
161 * {{{<host identifier>}}} is the host identifier as defined before
162
163 A typical example may be:
164 {{{
165 %remote_username raasch lc_bar parallel
166 %tmp_user_catalog /tmp lc_bar parallel
167 %compiler_name mpif90 lc_bar parallel
168 %compiler_name_ser ifort lc_bar parallel
169 %cpp_options -DMPI_REAL=MPI_DOUBLE_PRECISION:-DMPI_2REAL=MPI_2DOUBLE_PRECISION:-D__netcdf lc_bar parallel
170 %netcdf_inc -I:/usr/local/netcdf/include lc_bar parallel
171 %netcdf_lib -L/usr/local/netcdf/lib:-lnetcdf lc_bar parallel
172 %fopts -axW:-cpp:-openmp:-r8:-nbs lc_bar parallel
173 %lopts -axW:-cpp:-openmp:-r8:-nbs:-Vaxlib lc_bar parallel
174 }}}
175 Currently (version 3.7a), depending on the MPI version which is running on your local host, the options for the execution command (which may be {{{mpirun}}} or {{{mpiexec}}}) may have to be adjusted manually in the '''mrun'''-script. A future version will allow to give the respective settings in the configuration file.\\\\
176 If you have any problems with the PALM installation, the members of the PALM working group are pleased to help you.\\\\\\
177 149
178 150
179 = [=#update]Installation of new / other versions, version update =
151 == [=#update]Installation of new / other revisions, revision update ==
180 152
181 The PALM group announces code revisions by emails sent to the PALM mailing list. If you like to be put on this list, just send an email to raasch@muk.uni-hannover.de.
Details about new releases can be found in the [../tec/changelog PALM change log].\\\\
182 Generally, there are two ways of installing new / other versions. You can install a version from the list of available PALM releases or you can update your current installation with the newest developer version of PALM.\\\\
183 If you have previously checked out the most recent (at that time) PALM developer version by using
153 All code revisions are documented under [wiki:doc/tec/changelog]. The PALM group announces major code revisions via the PALM mailing list. You are added to the list automatically when you create an account using the [[//trac/register|register form]].\\\\
154 Generally, there are two ways of installing new / other versions. You can install a version from the [wiki:doc/tec/releasenotes list of available PALM releases] or you can update your current installation with the newest developer revision of PALM.\\\\
155 If you have previously checked out the PALM developer revision by using
184 156 {{{
185 157 svn checkout ...../palm/trunk trunk
186 158 }}}
187 you can easily make an update to the newest version by changing into the working directory {{{~/palm/current_version}}} and executing
159 you can easily make an update to the newest revision by
188 160 {{{
161 cd ~/palm/current_version
189 162 svn update trunk
190 163 }}}
191 This updates all files in the PALM working copy in subdirectory {{{trunk}}}. The update may fail due to the '''subversion''' rules, if you have modified the contents of trunk. In case of any conflicts with the repository, please refer to the '''subversion''' documentation on how to remove them. In order to avoid such conflicts, modifications of the default PALM code should be omitted and be restricted to the user-interface only (see [../app/userint here]).\\\\
164 This updates all files in the working copy in folder {{{trunk}}} (which is your working copy of the PALM repository).
The update may fail due to the '''subversion''' rules, if you have modified the contents of trunk. In case of any conflicts with the repository, please refer to the '''subversion''' documentation on how to remove them. In order to avoid such conflicts, modifications of the default PALM code should be avoided and restricted to the user-interface only (see [../app/userint here]), unless you are a PALM developer.\\\\
192 165 Alternatively, you can install new or other releases in a different directory, e.g.
193 166 {{{
194 mkdir ~/palm/release-3.1c
195 cd ~/palm/release-3.1c
196 svn checkout --username <your username> https://palm.muk.uni-hannover.de/svn/palm/tags/release-3.1c trunk
167 mkdir ~/palm/release-4.0
168 cd ~/palm/release-4.0
169 svn checkout --username <your username> https://palm.muk.uni-hannover.de/svn/palm/tags/release-4.0 trunk
197 170 }}}
198 However, this would require to carry out again the complete installation process described above. So far, different versions of PALM cannot be used at the same time. The PALM releases from {{{palm/tags}}} never have to be updated with "{{{svn update}}}", since these releases are frozen! \\\\
199 After updating the working copy, please check for any differences between your current configuration file ({{{.mrun.config}}}) and the default configuration files under {{{trunk/SCRIPTS/.mrun.config.<compiler>}}} and adjust your current file, if necessary.\\\\
200 The scripts and the pre-compiled code must then be updated via
171 However, this requires carrying out the complete installation process described above again. So far, different versions of PALM cannot be used at the same time. The PALM releases from {{{palm/tags}}} never have to be updated with "{{{svn update}}}", since these versions are frozen!
\\\\
172
173 The compiled PALM code and helper routines must then be updated via
201 174 {{{
202 mbuild -u -h lcmuk
203 mbuild -u -h ibmh
204 mbuild -h ibmh
175 palmbuild -h default
205 176 }}}
206 or via
177 or for any other configuration files that you are using.\\\\
178 You can use '''subversion''' for code comparison between the different revisions. Also, modified code can be committed to the repository, but this is restricted to PALM developers.\\\\
179
180 If you want to recompile PALM via {{{palmbuild}}} after you have modified the configuration file (e.g. if you changed compiler options or switched to other libraries), you need to apply the {{{touch}}} command on all source files in advance:
207 181 {{{
208 mbuild -u
209 mbuild
182 touch trunk/SOURCE/*
210 183 }}}
211 on all remote hosts listed in the configuration file {{{.mrun.config}}}.\\\\
212 You can use '''subversion''' for code comparison between the different versions. Also, modified code can be committed to the repository, but this is restricted to PALM developers.\\\\
184 because otherwise the {{{make}}} mechanism will not detect any source file that needs to be compiled. As an alternative, instead of ''touching'' the files, you may delete the {{{MAKE_DEPOSITORY}}} folder before calling {{{palmbuild}}}, but then the complete code will be re-compiled.
213 185
214 If you want to recompile PALM via {{{mbuild}}} after you have modified the configuration file {{{.mrun.config}}} (e.g. if you switch to a newer compiler or NetCDF version), you will have to perform the touch command on all source files:
215 {{{
216 touch trunk/SOURCE/* .
217 }}}
218 because otherwise the {{{make}}} mechanism will not be able to recompile the code.
219
220 As a last step, a suitable test run should be carried out. It should be carefully examined whether and how the results created by the new version differ from those of the old version.
Possible discrepancies which go beyond the ones announced in the [../tec/changelog PALM change log] should be communicated as soon as possible to the PALM group.
186 As a last step, a suitable test run should be carried out. It should be carefully examined whether and how the results created by the new revision differ from those of the old version. Possible discrepancies which go beyond the ones announced in the [wiki:doc/tec/changelog PALM change log] should be communicated as soon as possible via our [/newticket ticket system].
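The comparison of result files described above can be sketched as a small shell function that ignores the leading header lines, which contain the run date and time. This is only an illustration under stated assumptions: the function name {{{compare_rc}}} is hypothetical, and the number of header lines to skip is not given in this page, so it must be chosen by inspecting the file.

```shell
# Hypothetical helper, not part of PALM: compare two run-control files while
# skipping the first SKIP lines of each (assumed to hold the run date and time).
# Usage: compare_rc NEW_FILE REFERENCE_FILE [SKIP]
compare_rc() {
    new_file=$1
    ref_file=$2
    skip=${3:-0}
    tmp_new=$(mktemp) || return 2
    tmp_ref=$(mktemp) || { rm -f "$tmp_new"; return 2; }
    # tail -n +K prints a file starting at line K
    tail -n +"$((skip + 1))" "$new_file" > "$tmp_new"
    tail -n +"$((skip + 1))" "$ref_file" > "$tmp_ref"
    diff "$tmp_new" "$tmp_ref"
    status=$?
    rm -f "$tmp_new" "$tmp_ref"
    # 0: no differences beyond the skipped header, 1: differences found
    return "$status"
}
```

For the example run, a call from {{{~/palm/current_version}}} might look like {{{compare_rc JOBS/example_cbl/MONITORING/example_cbl_rc trunk/INSTALL/example_cbl_rc 5}}}, where the value 5 is an assumed header length. Any remaining output lists the lines that differ between the two files.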