General hints for using the NEC SX-Aurora TSUBASA A300-8 system at IMUK

The system is part of the LUIS-cluster at LUH.

You need a user account on the LUIS cluster system to access the NEC system. The login procedure is as follows:

  1. Login to the LUIS cluster-system from a workstation at IMUK:
    ssh -X <your LUIS username>@login.cluster.uni-hannover.de
    
    Ask Siggi if you don't have an account for the LUIS-cluster.
  1. From the LUIS system, you need to start an interactive job on the NEC-machine:
    qsub -I -l nodes=1:ppn=1,mem=20gb -W x=PARTITION:muk
    

By default, an interactive job session runs for 24h. Your $HOME filesystem of the LUIS cluster is mounted on the NEC system. For storing large output files, use the NEC-local filesystem /scratch/<your LUIS username>.

How to install PALM on the NEC SX-Aurora

Only the software needed to compile and run PALM is available on the NEC machine (NEC Fortran compiler, NetCDF, FFTW, and MPI libraries). Software modules provided by LUIS are not accessible. The following steps describe the PALM installation:

  1. Log in to the LUIS cluster system, create ~/palm/current_version and check out PALM from the svn repository. You need to do that on the LUIS login node, because svn is not installed on the NEC-Aurora. Please install r4370 or later, because earlier versions are not well vectorized for the NEC and will perform very poorly.
  1. Add the line
    export PATH=/opt/nec/ve/bin:$PATH
    
    to the file .profile in $HOME.
  1. Copy the NEC-Aurora configuration file to your working directory, e.g.
    cd ~/palm/current_version
    cp trunk/SCRIPTS/.palm.config.aurora .
    
  1. Edit this file and replace the strings in angle brackets (<>) with your specific settings.
  1. Start an interactive job on the NEC-system (see above) and compile PALM:
    qsub .....
    palmbuild -c aurora
    
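Step 2 above can also be done from the command line. The following is a minimal sketch, not part of the PALM scripts: the function name ensure_ve_path is hypothetical, and only the PATH line itself is taken from the instructions. It appends the line only if it is not already present, so running it twice does no harm.

```shell
# Sketch: append the NEC tools PATH line to a profile file if it is missing.
# ensure_ve_path is a hypothetical helper name, not part of PALM or the NEC tools.
ensure_ve_path() {
    profile="${1:-$HOME/.profile}"
    # Single quotes keep $PATH literal, so it is expanded at login time.
    line='export PATH=/opt/nec/ve/bin:$PATH'

    # Create the file if it does not exist yet.
    touch "$profile"

    # Nothing to do if the exact line is already there
    # (-q quiet, -x whole-line match, -F fixed string instead of regex).
    if grep -qxF "$line" "$profile"; then
        return 0
    fi

    # If the file does not end with a newline, add one first so the
    # new line is not glued to the previous one.
    if [ -n "$(tail -c 1 "$profile")" ]; then
        printf '\n' >> "$profile"
    fi

    printf '%s\n' "$line" >> "$profile"
}
```

Calling ensure_ve_path without an argument edits $HOME/.profile. The change takes effect at the next login.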

Running PALM on the NEC-Aurora

The NEC machine has 8 so-called vector engines (numbered 0-7). Each engine is designed to efficiently execute 8 MPI tasks. You can start more than 8 MPI tasks on a vector engine, but this will degrade performance drastically. Each vector engine provides 48 GByte of main memory. The /scratch file system for storing restart files and other large output files has a capacity of about 21 TByte.

So far, you can run PALM in interactive mode only. Before calling palmrun, you need to specify the vector engines that you want to use in the execute-command line of your configuration file. For example,

%execute_command     mpirun -v -ve 0-3  -np {{mpi_tasks}}  ./palm

will start the MPI tasks on vector engines 0-3. With this setting, you can use up to 32 MPI tasks (-X32 in the palmrun command). For -X64, you need to change the execute-command line to -ve 0-7.
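The relation between the number of MPI tasks and the -ve range can be sketched as a small shell function. This is only an illustration of the arithmetic (8 tasks per engine, engines 0-7): the function name ve_range and its optional second argument (the first engine to use) are hypothetical, and it is assumed that mpirun accepts a single engine number as well as a range.

```shell
# Sketch: compute the "-ve" value for a given number of MPI tasks.
# ve_range is a hypothetical helper; the numbers 8 tasks/engine and
# engines 0-7 are taken from the description above.
ve_range() {
    tasks="$1"
    first="${2:-0}"      # first vector engine to use (hypothetical option)
    per_ve=8             # MPI tasks per vector engine
    last_ve=7            # highest vector engine number

    # Accept only positive integers for tasks, non-negative for first.
    case "$tasks" in
        ''|*[!0-9]*|0*) echo "ve_range: invalid task count" >&2; return 1 ;;
    esac
    case "$first" in
        ''|*[!0-9]*|0?*) echo "ve_range: invalid first engine" >&2; return 1 ;;
    esac

    # Round up: number of engines needed for the requested tasks.
    needed=$(( (tasks + per_ve - 1) / per_ve ))
    last=$(( first + needed - 1 ))

    if [ "$last" -gt "$last_ve" ]; then
        echo "ve_range: $tasks tasks do not fit on engines $first-$last_ve" >&2
        return 1
    fi

    if [ "$needed" -eq 1 ]; then
        # Assumption: a single engine number is a valid -ve argument.
        echo "$first"
    else
        echo "$first-$last"
    fi
}
```

For example, ve_range 32 prints 0-3 and ve_range 64 prints 0-7, matching the settings above. ve_range 16 4 prints 4-5, which may help when the engines are shared among several users.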

If several people want to use the machine simultaneously, you should agree with the other users on how to split the vector engines among each other.

Important: The namelist parameter netcdf_data_format must be set to a value >= 5! Otherwise, PALM execution will abort.
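A quick check of this setting before starting a run could look like the sketch below. The function name check_netcdf_format is hypothetical, and the parsing is deliberately simple: it assumes the parameter is written as netcdf_data_format = <integer> in the namelist file and ignores anything after a Fortran comment character (!).

```shell
# Sketch: verify that netcdf_data_format >= 5 in a namelist file.
# check_netcdf_format is a hypothetical helper, not part of PALM.
check_netcdf_format() {
    file="$1"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi

    # Drop Fortran comments, find the assignment (case-insensitive),
    # and keep the integer value of the last occurrence.
    value=$(sed 's/!.*//' "$file" \
        | grep -Eio 'netcdf_data_format[[:space:]]*=[[:space:]]*[0-9]+' \
        | tail -n 1 \
        | grep -Eo '[0-9]+$')

    if [ -z "$value" ]; then
        echo "netcdf_data_format is not set in $file" >&2
        return 1
    fi
    if [ "$value" -lt 5 ]; then
        echo "netcdf_data_format = $value, but >= 5 is required" >&2
        return 1
    fi
    return 0
}
```

The function returns 0 only if the parameter is present with a value of 5 or larger, so it can be used as a guard before calling palmrun.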

Miscellaneous

  • Please don't litter the /scratch file system. Remove your output data as soon as possible.
  • If two or more people want to use the machine for longer production runs, we need to discuss with the LUIS administrators how to include the specific Aurora resources (vector engines) in the SLURM directives, which would be a requirement for submitting batch jobs on the machine.
  • trunk/SCRIPTS also contains a configuration file for debugging (.palm.config.aurora_debug).
  • Currently, only the dynamical core of PALM is well vectorized for the NEC-Aurora. Further optimization for the advection routines and other PALM modules will follow soon (last update Jan 2020).
Last modified on Jan 27, 2020 7:32:18 AM