Changes between Version 1 and Version 2 of doc/app/machine/nec_aurora_imuk


Timestamp:
Jan 20, 2020 12:45:20 PM (6 years ago)
Author:
raasch
Comment:

--

  • doc/app/machine/nec_aurora_imuk

    v1 v2  
    1111{{{qsub -I -l nodes=1:ppn=1,mem=20gb -W x=PARTITION:muk}}}
    1212
    13 By default, the interactive job sessions runs for 24h. Your $HOME-filesystem of the LUIS cluster is mounted on the NEC-System. For storing large output files, use the NEC-local filesystem {{{/scratch/<your LUIS username>}}}
     13By default, an interactive job session runs for 24h. Your $HOME filesystem of the LUIS cluster is mounted on the NEC system. For storing large output files, use the NEC-local filesystem {{{/scratch/<your LUIS username>}}}.
    1414
    1515
     16== How to install PALM on the NEC SX-Aurora
    1617
    17 On the NEC-machine only specific software to compile and run PALM is available (NEC Fortran compiler, NetCDF-,
     18Only the specific software needed to compile and run PALM is available on the NEC machine (NEC Fortran compiler and the NetCDF, FFTW, and MPI libraries). Software modules provided by LUIS are not accessible. The following steps describe the PALM installation:
    1819
    19 Furthermore, edit file {{{$HOME/.bash_profile}}} and add the following lines:
     20 1. Log in to the LUIS cluster system, create {{{~/palm/current_version}}}, and check out PALM from the svn repository. You need to do this on the LUIS login node because svn is not installed on the NEC-Aurora.
     21 2. Add the line
    2022{{{
    21 # User specific environment and startup programs
    22 
    23 PALM_BIN=$HOME/palm/current_version/trunk/SCRIPTS
    24 export PALM_BIN
    25 
    26 PATH=$PALM_BIN:$PATH
    27 export PATH
    28 
    29 LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/netcdf/4.1.1/lib
    30 export LD_LIBRARY_PATH
     23export PATH=/opt/nec/ve/bin:$PATH
     24}}}
     25to the file {{{.profile}}} in {{{$HOME}}}.
     26 3. Copy the NEC-Aurora configuration file to your working directory, e.g.
     27{{{
     28cd ~/palm/current_version
     29cp trunk/SCRIPTS/.palm.config.aurora .
     30}}}
     31 4. Edit this file and replace the strings in angle brackets ({{{<>}}}) with your specific settings.
     32 5. Start an interactive job on the NEC-system (see above) and compile PALM:
     33{{{
     34qsub .....
     35palmbuild -c aurora
    3136}}}
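Step 2 of the list above can be made safe to repeat with a small shell helper. This is only a sketch: the function name {{{add_line_once}}} is hypothetical and not part of PALM or the NEC software. It appends a line to a file only if an identical line is not already present, so running it twice does not duplicate the {{{PATH}}} entry.

```shell
#!/bin/sh
# Hypothetical helper (not part of PALM): append a line to a file
# unless an identical line already exists in that file.
add_line_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

A possible call for step 2 would be {{{add_line_once "$HOME/.profile" 'export PATH=/opt/nec/ve/bin:$PATH'}}}. The single quotes keep {{{$PATH}}} unexpanded, so it is evaluated at login rather than when the line is written.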
    3237
    33 == tatara-system [http://www2.cc.kyushu-u.ac.jp/scp/system/general/CX/how_to_use] ==
    3438
    35 The system is supported starting from revision r1097. You can find a configuration file adjusted for this system in the svn-repository under [source:/palm/trunk/SCRIPTS/.mrun.config.tatara /palm/trunk/SCRIPTS/.mrun.config.tatara]. Just copy this file to your working directory
     39== Running PALM on the NEC-Aurora
     40
     41The NEC machine has 8 so-called vector engines (numbered 0-7). Each engine is designed to efficiently execute 8 MPI tasks. You can start more than 8 MPI tasks on a vector engine, but this drastically reduces performance. Each vector engine provides 48 GByte of main memory. The {{{/scratch}}} file system, used for storing restart files and other large output files, has a capacity of about 21 TByte.
     42
     43So far, you can run PALM in interactive mode only. Before calling {{{palmrun}}}, you need to specify the vector engines that you want to use in the execute-command line of your configuration file. For example,
    3644{{{
    37 cd ~/palm/current_version
    38 cp trunk/SCRIPTS/.mrun.config.tatara .mrun.config
     45%execute_command     mpirun -v -ve 0-3  -np {{mpi_tasks}}  ./palm
    3946}}}
    40 Don't forget to edit the file in order to replace the string {{{<replace by your tatara username>}}} with your respective username.
     47will start the MPI tasks on vector engines {{{0-3}}}. With this setting (4 engines with 8 tasks each), you can use up to 32 MPI tasks ({{{-X32}}} in the {{{palmrun}}} command). For {{{-X64}}}, you need to change the execute-command line to {{{-ve 0-7}}}.
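
The relation between the number of MPI tasks and the {{{-ve}}} range can be sketched as a small shell helper. The function name {{{ve_range}}} and its optional second argument (the first engine to use) are hypothetical and not part of PALM. The sketch assumes 8 tasks per engine and engines numbered 0-7, as described above, and it assumes that a single engine may be given as a bare number.

```shell
#!/bin/sh
# Hypothetical helper (not part of PALM): print the engine range for
# "mpirun -ve" that keeps at most 8 MPI tasks on each vector engine.
# Usage: ve_range <mpi_tasks> [first_engine]
ve_range() {
    tasks=$1
    first=${2:-0}        # first engine to use, defaults to 0
    per_engine=8         # tasks one engine executes efficiently
    last_engine=7        # engines are numbered 0-7
    if [ "$tasks" -lt 1 ]; then
        echo "ve_range: need at least one MPI task" >&2
        return 1
    fi
    # Ceiling division: number of engines needed for the given tasks
    engines=$(( (tasks + per_engine - 1) / per_engine ))
    last=$(( first + engines - 1 ))
    if [ "$last" -gt "$last_engine" ]; then
        echo "ve_range: not enough vector engines for $tasks tasks" >&2
        return 1
    fi
    if [ "$first" -eq "$last" ]; then
        echo "$first"
    else
        echo "${first}-${last}"
    fi
}

ve_range 32   # prints 0-3
ve_range 64   # prints 0-7
```

The optional first-engine argument is meant for shared use: for instance, {{{ve_range 16 4}}} yields {{{4-5}}}, which leaves engines 0-3 to another user.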
    4148
    42 For the rest of the installation procedure just follow the instructions given on the [wiki:doc/install installation page].
     49If several people want to use the machine simultaneously, you should agree with the other users on how to divide the vector engines among you.
    4350
    44 == hayaka-system [http://www2.cc.kyushu-u.ac.jp/scp/system/general/FX10/how_to_use] ==
    4551
    46 This system is supported starting from r1103 The configuration file adjusted for this system can be found under [source:/palm/trunk/SCRIPTS/.mrun.config.hayaka /palm/trunk/SCRIPTS/.mrun.config.hayaka]. Copy this file to your working directory
    47 {{{
    48 cd ~/palm/current_version
    49 cp trunk/SCRIPTS/.mrun.config.hayaka .mrun.config
    50 }}}
    51 Don't forget to edit the file in order to replace the string {{{<replace by your hayaka username>}}} with your respective username.
     52== Miscellaneous
    5253
    53 === Restrictions on hayaka ===
    54 
    55 The hayaka login node only provides a cross compiler ({{{frtpx}}}) for the compute nodes. Therefore, the utility programs like {{{interpret_config}}} and {{{check_parameter_files}}} can be compiled, but they cannot be run. This requires to set appropriate {{{mrun}}} options
    56 {{{
    57 mrun ... -z -S ...
    58 }}}
    59 in order to switch off the parameter check ({{{-z}}}) and to interpret the configuration file directly from the script ({{{-S}}}).
    60 
    61 For compiling some parts of the PALM code, the Fujitsu compiler (frtpx) requires more memory than given by default. Please increase the maximum virtual memory size by command
    62 {{{
    63 ulimit -v 8097152
    64 }}}
    65 For convenience, you should add this command to the file {{{$HOME/.bash_profile}}}.
    66 
    67 No automatic restarts are possible on the hayaka system. You have to submit restart runs manually after the previous job has finished. In the job protocol file (see {{{~/job_queue}}}) of the previous run information is given about which '''mrun''' options are to be used for the restart.
     54* Please don't litter the {{{/scratch}}} file system. Remove your output data as soon as possible.
     55* If two or more people want to use the machine for longer production runs, we need to discuss with the LUIS administrators how to include the Aurora-specific resources (vector engines) in the SLURM directives, which is required for submitting batch jobs on the machine.
     56* {{{trunk/SCRIPTS}}} also contains a configuration file for debugging ({{{.palm.config.aurora_debug}}}).