Changes between Version 22 and Version 23 of doc/app/runs


Timestamp:
Apr 20, 2021 3:24:23 PM (4 years ago)
Author:
raasch
Comment:

--

  • doc/app/runs

    v22 v23  
    6363= Handling of large (restart) files =
    6464
    65 In case of very large files, the copy of data from and to '''palmrun's''' temporary working directory may need a long time. The CPU cores requested for the job run idle during that time and may consume significant amount of the job time without doing anything. The time required for copying can be spared by using a file link instead of copying the data.
     65Copying very large files, such as restart data files, to and from '''palmrun's''' temporary working directory may take a long time. During that time, the cores requested for the job sit idle and may consume a significant amount of the job time without doing anything. The copy time can be saved by using a file link instead of copying the data.
    6666{{{
    6767   cp large_local_file  large_permanent_file                                 # may take a long time
    6868   ln existing_large_local_TARGET_file  LINK_NAME_to_large_local_file        # is done immediately, i.e. requires almost no time
    6969}}}
    70 You can tell '''palmrun''' to use {{{ln}}} instead of {{{cp}}} by giving the file attribute {{{ln}}} in the respective file connection statement, e.g.:
     70You can tell '''palmrun''' to use {{{ln}}} instead of {{{cp}}} by setting the file attribute {{{ln}}} in the respective file connection statement, e.g.:
    7171{{{
    72 BININ   in:loc:lnpe  d3r       $base_data/$run_identifier/RESTART  _d3d
    73 BINOUT  out:loc:lnpe restart   $base_data/$run_identifier/RESTART _d3d
     72BININ   in:lnpe   d3r      $restart_data_path/$run_identifier/RESTART _d3d*
     73BINOUT* out:lnpe  restart  $restart_data_path/$run_identifier/RESTART _d3d
    7474}}}
    75 However, performing a link requires that the link to a TARGET file with the name LINK_NAME must be located on the same physical file system as the TARGET file. If TARGET file and LINK_NAME are on different file systems, the TARGET file will be copied instead (and the advantage of using the {{{ln}}} command is lost).
     75However, generating a link requires that both the target file and the link are located on the same physical file system. Otherwise, a normal copy is made instead, and the advantage of using the {{{ln}}} command is lost.
    7676
    77 Most computing centers provide a file systems for fast I/O and this should be used as '''palmrun's''' temporary working directory, which can be set in the configuration file by environment variable {{{tmp_user_catalog}}}. Since the LINK_NAME should be on the same file system, the user should provide a directory on that file system for storing the large files. Respective settings in the configuration file could be (example for Cray-XC40 at HLRN):
    78 {{{
    79 #
    80 # folder in which palmrun's temporary working catalog is created (will be deleted after end of job)
    81 %tmp_user_catalog    /gfs2/work/<replace by username>     lccrayh parallel
    82 #
    83 # folder in which large binary files shall be stored
    84 %tmp_data_catalog    /gfs2/work/<replace by username>     lccrayh parallel
    85 #
    86 # file connection statements for restart files
    87 BININ   in:loc:lnpe  d3r       $tmp_data_catalog/$run_identifier/RESTART  _d3d
    88 BINOUT  out:loc:lnpe restart   $tmp_data_catalog/$run_identifier/RESTART  _d3d
    89 }}}
    90 Such fast file systems are generally not allowed to store files for a longer time, so the user has to take care for archiving himself.
     77Most computing centers provide a file system for fast I/O, and this should be used as '''palmrun's''' temporary working directory, which you can set via the environment variable {{{restart_data_path}}} in the configuration file. Since the LINK_NAME should be on the same file system, the user should provide a directory on that file system for storing the large files.
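
The link-or-copy behaviour described in this section can be illustrated with a small shell sketch. It is not taken from '''palmrun''' itself: the function name `link_or_copy` and the file names are hypothetical, and the sketch only assumes the standard `ln`, `cp`, `mktemp` and `ls` utilities. It tries a hard link first and falls back to a regular copy when the link cannot be created, for example because target and link name are on different file systems.

```shell
#!/bin/sh
# Hypothetical helper, not part of palmrun: try a hard link first and
# fall back to a regular copy if the link cannot be created
# (e.g. because source and destination are on different file systems).
link_or_copy() {
    src=$1
    dst=$2
    if ln "$src" "$dst" 2>/dev/null; then
        echo "linked"
    else
        cp "$src" "$dst" && echo "copied"
    fi
}

# Demonstration in a scratch directory (both files on the same file system).
workdir=$(mktemp -d)
printf 'restart data\n' > "$workdir/target_file"
link_or_copy "$workdir/target_file" "$workdir/link_name"

# A hard link shares the inode of its target, so when the link succeeded
# both lines show the same inode number.
ls -i "$workdir/target_file" "$workdir/link_name"

# Remove the scratch directory again.
rm -rf "$workdir"
```

If the function reports `copied` for your restart files, the directory for the large files is probably not on the same file system as the temporary working directory, and the time saving described above is lost.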