Changes between Version 3 and Version 4 of doc/tec/parallel


Timestamp: Jul 4, 2016 8:06:51 PM
Author: Giersch
Comment: --

  • doc/tec/parallel

    v3 v4  
    14 14   The scaling behavior of PALM 4.0 is presented in Fig. 15a for a test case with ''2160^3^'' grid points and the FFT Poisson solver. Tests have been performed on the Cray XC40 of the North-German Computing Alliance (HLRN). The machine has ''1128'' compute nodes, each equipped with two 12-core Intel-Haswell CPUs, plus 744 compute nodes equipped with two 12-core Intel-Ivy Bridge CPUs, and an Aries interconnect. Additionally, runs with ''4320^3^'' grid points have been carried out with up to ''43200'' cores, starting with a minimum of ''11520'' cores (see Fig. 15b). Runs with fewer cores could not be carried out because the data would not have fit into memory.
    15 15
    16      removed: [[Image(11.png, 600px, border=1)]]
       16   added:   [[Image(12.png, 600px, border=1)]]
    17 17
    18 18   Figure 15: Scalability of PALM 4.0 on the Cray XC30 supercomputer of HLRN. Simulations were performed with a computational grid of '''(a)''' ''2160^3^'' and '''(b)''' ''4320^3^'' grid points (Intel-Ivy Bridge CPUs). '''(a)''' shows data for up to ''11520'' PEs with cache optimization (red lines), vector optimization (blue lines), and with overlapping during the computation (FFT and tri-diagonal equation solver, see this section) enabled (dashed green lines). Measurement data are shown for the total CPU time (crosses), the prognostic equations (circles), and the pressure solver (boxes). '''(b)''' shows data for up to ''43200'' PEs with both cache optimization and overlapping enabled. Measurement data are shown for the total CPU time (gray line), the pressure solver (blue line), the prognostic equations (red line), as well as the MPI calls '''''MPI_ALLTOALL''''' (brown line) and '''''MPI_SENDRCV''''' (purple line).
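
For context on the '''''MPI_ALLTOALL''''' curve in Fig. 15b: in a transpose-based FFT Poisson solver, the grid data must be redistributed among the PEs before each set of 1-D FFTs, and such a transposition is typically realized with an all-to-all exchange. The following minimal C/MPI sketch is not taken from PALM; the block size and payload are hypothetical placeholders. It only illustrates the communication pattern whose cost is plotted in the figure.

{{{
/* Minimal sketch (not PALM code): the transposition step of a
 * distributed FFT Poisson solver is typically an all-to-all exchange,
 * which is why MPI_ALLTOALL appears as a separate curve in Fig. 15b.
 * The block size below is a hypothetical placeholder. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int nprocs, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Number of grid values each PE exchanges with every other PE
     * during one transposition (hypothetical value). */
    const int block = 1024;

    double *sendbuf = malloc((size_t)nprocs * block * sizeof(*sendbuf));
    double *recvbuf = malloc((size_t)nprocs * block * sizeof(*recvbuf));
    for (int i = 0; i < nprocs * block; ++i)
        sendbuf[i] = (double)rank;   /* dummy payload */

    /* Every PE sends one block to and receives one block from every
     * other PE: the communication pattern behind a grid transposition. */
    MPI_Alltoall(sendbuf, block, MPI_DOUBLE,
                 recvbuf, block, MPI_DOUBLE, MPI_COMM_WORLD);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
}}}

The overlapping option mentioned in the caption of Fig. 15a presumably interleaves communication of this kind with the FFT and tri-diagonal solver computations; see the section referenced there for the actual scheme used in PALM.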