Version 6 (modified by raasch, 7 years ago)
This page is under construction!
Configuring and running PALM with palmbuild and palmrun
Changes compared to mrun/build
- The new scripts will run on any kind of Linux / Unix system without requiring any adjustments. All settings are controlled via configuration files.
- mbuild is replaced by palmbuild, and mrun is replaced by palmrun. The old script subjob is not used any more (submitting jobs is now part of palmrun).
- The old configuration file .mrun.config has been split into two files, .palm.config.<configuration_identifier> and .palm.iofiles, where <configuration_identifier> is an arbitrary string that you can define. "Configuration" means the settings for a specific computer with a specific compiler, compiler options, libraries, etc. If you want to run PALM with different configurations, e.g. one with debug options switched on and one with high optimization, you need to create a separate configuration file for each of them, e.g. .palm.config.optimized and .palm.config.debug. This replaces the old block structure in .mrun.config. The configuration file to be used is selected with the palmrun or palmbuild option -h, e.g. palmrun ... -h optimized will use .palm.config.optimized.
You will need only one file .palm.iofiles which contains the file connection statements to be used for all configurations.
The utility program interpret_config has been removed. The configuration files are now interpreted directly by the shell scripts.
- Only one call of palmbuild is required to compile both the utilities and the PALM source code (there is no option -u any more). The compiled routines (object files and executables) are put into the folder MAKE_DEPOSITORY_<configuration_identifier>, where <configuration_identifier> is replaced by the string given with palmbuild-option -h.
- palmrun does not compile any more at the beginning of a batch job. The PALM executable for the batch job (or for the interactive session) is created as part of the palmrun call that you enter manually at your terminal, before the batch job is submitted. The executable is put into the folder SOURCES_FOR_RUN_<run_identifier>, where <run_identifier> is the string provided with palmrun-option -d. This folder is now placed in the folder set with the variable fast_io_catalog (see below for fast_io_catalog). If you do not use a user-interface, palmrun will not compile at all and will take the executable from the folder MAKE_DEPOSITORY_<configuration_identifier> that was generated by your last call of palmbuild. If palmrun cannot find the folder MAKE_DEPOSITORY_<configuration_identifier>, it will internally call palmbuild in order to generate it. If palmrun finds a folder SOURCES_FOR_RUN_<run_identifier> that has been generated by a previous call of palmbuild, it will ask you whether the executables from that folder shall be used. This way, you can avoid re-compiling your user-interface with each call of palmrun. Automatically generated restart runs will always use executables from SOURCES_FOR_RUN_<run_identifier>.
You may have to remove the folders SOURCES_FOR_RUN_... manually from time to time, because they are not deleted automatically at the end of a job (or of the last job of a restart job chain).
- The .palm.config.<ci> file does not contain blocks any more. Several variable names have been changed (e.g. compiler_options instead of fopts) and new variables have been introduced (e.g. execute_command, which gives the command for starting the executable). Colons (:) must no longer be used as separators, e.g. between compiler options. Here is an example (some lines are truncated, as indicated by ....):
    #$Id$
    #column 1          column 2
    #name of variable  value of variable (~ must not be used, except for base_data)
    #------------------------------------------------------------------------------
    %base_data         ~/palm/current_version/JOBS
    %base_directory    $HOME/palm/current_version
    %source_path       $HOME/palm/current_version/trunk/SOURCE
    %user_source_path  $base_directory/JOBS/$fname/USER_CODE
    %fast_io_catalog   /localdata/your_linux_username
    #
    %local_ip          111.11.111.111
    %local_username    your_linux_username
    #
    %compiler_name     mpif90
    %compiler_name_ser ifort
    %cpp_options       -cpp -D__parallel -DMPI_REAL=MPI_DOUBLE_PRECISION -DMPI_2REAL=MPI_2DOUBLE_PRECISION -D__fftw -D__netcdf
    %make_options      -j 4
    %compiler_options  -openmp -fpe0 -O3 -xHost -fp-model source -ftz -fno-alias -ip -nbs -I /muksoft/packages/fftw/3.3.4/include -L/muksoft/....
    %linker_options    -openmp -fpe0 -O3 -xHost -fp-model source -ftz -fno-alias -ip -nbs -I /muksoft/packages/fftw/3.3.4/include -L/muksoft/....
    %hostfile          auto
    %execute_command   mpiexec -machinefile hostfile -n {{MPI_TASKS}} ./palm
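A configuration file of this kind is a plain list of lines of the form %name value, with # starting a comment line. As a rough illustration of how a shell script can read such a file directly, the sketch below builds the file name that belongs to a configuration identifier and prints the settings found in a file. The function names config_file_for and read_config are hypothetical, and this is not the parser used by palmbuild or palmrun.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only, not part of PALM.

# Print the configuration file name that belongs to identifier $1,
# following the naming rule .palm.config.<configuration_identifier>.
config_file_for() (
    printf '.palm.config.%s\n' "$1"
)

# Print "name=value" for every line of file $1 that starts with "%".
# Comment lines (starting with "#") and all other lines are skipped.
read_config() (
    while IFS= read -r line; do
        case "$line" in
            %*) ;;            # a variable line: handled below
            *)  continue ;;   # comment, directive or empty line
        esac
        line=${line#?}                            # drop the leading "%"
        name=${line%%[[:space:]]*}                # first word is the variable name
        value=${line#"$name"}                     # remainder of the line
        value=${value#"${value%%[![:space:]]*}"}  # trim leading blanks
        printf '%s=%s\n' "$name" "$value"
    done < "$1"
)
```

For example, calling read_config with the file name returned by config_file_for optimized would list the settings stored in .palm.config.optimized in the current directory.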
- Some further comments concerning specific variables:
- fast_io_catalog replaces the old variables tmp_user_catalog and tmp_data_catalog. It should be a folder on a file system with fast discs, as typically provided on large computer systems for temporary I/O, e.g. something like /work/.... The temporary working catalog created by palmrun will be in this folder, and your restart data should be put in this folder too. The default .palm.iofiles is using fast_io_catalog for the restart files.
- For cpp_options, you now have to give ALL switches required, especially -D__parallel to use the parallel version of PALM, which was implicitly set with mrun-option -K parallel before. The -K option has been removed.
- The compiler and linker options must now contain ALL include and library paths for the libraries that you intend to use (e.g. MPI, NetCDF, FFTW), unless they are set automatically by a module environment (as e.g. on Cray systems). Old variables like netcdf_inc or netcdf_lib have been removed from the configuration file.
- execute_command is required to define the command that executes PALM. It depends on the MPI library that you are using. The wildcard {{MPI_TASKS}} will be replaced by the value provided with palmrun-option -X. A further wildcard that can be used is {{TASKS_PER_NODE}}, which will be replaced by the value provided with palmrun-option -T.
- The variable write_binary (formerly used to switch on the output of restart data) has been removed from the configuration file. Output of restart data is now switched on with the activation string "restart", i.e. palmrun ..... -a "... restart".
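The wildcard replacement in execute_command can be pictured as a plain text substitution. The following sketch is an illustration only, not the code used by palmrun. The function name expand_command is hypothetical, and its two numeric arguments stand for the values given with the palmrun options -X and -T.

```shell
#!/bin/sh
# Hypothetical helper for illustration only, not part of PALM.
# $1: execute_command as written in the configuration file
# $2: number of MPI tasks (value of palmrun option -X)
# $3: tasks per node (value of palmrun option -T)
expand_command() (
    printf '%s\n' "$1" |
        sed -e "s/{{MPI_TASKS}}/$2/g" \
            -e "s/{{TASKS_PER_NODE}}/$3/g"
)

expand_command 'mpiexec -machinefile hostfile -n {{MPI_TASKS}} ./palm' 8 4
# prints: mpiexec -machinefile hostfile -n 8 ./palm
```

A wildcard that does not occur in the command is simply ignored, so the same function works for commands that use only {{MPI_TASKS}}.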
- For running PALM on a remote host in batch, additional settings are required in the configuration file. The following is an example for using the Cray-XC40 of HLRN as a remote host:
    #column 1          column 2
    #name of variable  value of variable (~ must not be used)
    #----------------------------------------------------------------------------
    %base_data         ~/palm/current_version/JOBS
    %base_directory    $HOME/palm/current_version
    %source_path       $HOME/palm/current_version/trunk/SOURCE
    %user_source_path  $base_directory/JOBS/$fname/USER_CODE
    %fast_io_catalog   /gfs2/work/niksiraa
    %local_jobcatalog  /home/raasch/job_queue
    %remote_jobcatalog /home/h/niksiraa/job_queue
    #
    %local_ip          130.75.105.103
    %local_username    raasch
    %remote_ip         130.75.4.1
    %remote_username   niksiraa
    %remote_loginnode  hlogin1
    %ssh_key           id_rsa_hlrn
    %defaultqueue      mpp2testq
    %submit_command    /opt/moab/default/bin/msub -E
    #
    %compiler_name     ftn
    %compiler_name_ser ftn
    %cpp_options       -e Z -DMPI_REAL=MPI_DOUBLE_PRECISION -DMPI_2REAL=MPI_2DOUBLE_PRECISION -D__parallel -D__netcdf -D__netcdf4 -D__netcdf4_parallel -D__fftw
    %make_options      -j 4
    %compiler_options  -em -O3 -hnoomp -hnoacc -hfp3 -hdynamic
    %linker_options    -em -O3 -hnoomp -hnoacc -hfp3 -hdynamic -dynamic
    %execute_command   aprun -n {{MPI_TASKS}} -N {{TASKS_PER_NODE}} palm
    %memory            2300
    %module_commands   module load fftw cray-hdf5-parallel cray-netcdf-hdf5parallel
    %login_init_cmd    module switch craype-ivybridge craype-haswell
    #
    # BATCH-directives to be used for batch jobs. If $-characters are required, hide them with \\\
    BD:#!/bin/bash
    BD:#PBS -A {{PROJECT_ACCOUNT}}
    BD:#PBS -N {{JOB_ID}}
    BD:#PBS -l walltime={{CPU_HOURS}}:{{CPU_MINUTES}}:{{CPU_SECONDS}}
    BD:#PBS -l nodes={{NODES}}:ppn={{TASKS_PER_NODE}}
    BD:#PBS -o {{JOBFILE}}
    BD:#PBS -j oe
    BD:#PBS -q {{QUEUE}}
    #
    # BATCH-directives for batch jobs used to send back the jobfile from a remote to a local host
    BDT:#!/bin/bash
    BDT:#PBS -A {{PROJECT_ACCOUNT}}
    BDT:#PBS -N job_protocol_transfer
    BDT:#PBS -l walltime=00:30:00
    BDT:#PBS -l nodes=1:ppn=1
    BDT:#PBS -o {{JOB_TRANSFER_PROTOCOL_FILE}}
    BDT:#PBS -j oe
    BDT:#PBS -q dataq
    #
    #----------------------------------------------------------------------------
    # INPUT-commands, executed before running PALM - lines must start with "IC:"
    #----------------------------------------------------------------------------
    IC:export ATP_ENABLED=1
    IC:export MPICH_GNI_BTE_MULTI_CHANNEL=disabled
    IC:ulimit -s unlimited
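In this example, the lines starting with BD: hold the batch directives, and their wildcards need to be filled in when a job is created. The sketch below illustrates that idea for two of the wildcards. It is a simplified assumption about the mechanism, not the code used by palmrun, and the function name batch_header is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for illustration only, not part of PALM.
# $1: configuration file, $2: job name, $3: queue name
# Prints the BD: lines of the file without their prefix and with the
# wildcards {{JOB_ID}} and {{QUEUE}} replaced. Lines with other prefixes
# (e.g. BDT: or IC:) are dropped, other wildcards are left untouched.
batch_header() (
    sed -e '/^BD:/!d' \
        -e 's/^BD://' \
        -e "s/{{JOB_ID}}/$2/g" \
        -e "s/{{QUEUE}}/$3/g" "$1"
)
```

Called with a configuration file, a job name and a queue name, the function prints a script header that starts with #!/bin/bash followed by the #PBS lines.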
- The remote-host example above requires some additional settings:
- fast_io_catalog is the one to be used on the remote host.
- IP-addresses and user names have to be given for the local AND the remote host. Usually, the remote host IP-address is the one for the login-node.
- remote_loginnode: on many of the large computer systems, the compute nodes do not allow ssh or scp commands, which are needed to transfer data to the local host or to start restart jobs. If remote_loginnode is set, palmrun tries to start these commands via the login-node. Attention: In most cases, the systems do not accept an IP-address here. You have to give the mnemonic name of the login-node.
- ssh_key: here you can give the filename of a special ssh-key for using ssh / scp without password. The key must be in folder ~/.ssh. This is a special setting for the HLRN-system and should not be required on other systems.
- defaultqueue: if you do not set the queue via palmrun-option -q, this queue will be taken as the default queue. Unlike mrun, palmrun does not check for valid queue names any more.
- To
Package installation
The