Changes between Version 11 and Version 12 of doc/app/runs
- Timestamp: Sep 19, 2018 2:34:15 PM (6 years ago)
doc/app/runs
v11 v12
  1   1
  2     - = Initialization and restart runs =
      2 + = Restart runs / Handling of large (restart) files =
      3 +
      4 + Batch systems generally limit the CPU time that is allowed to be requested by a job, e.g. to a maximum of 12 hours or 24 hours. If a simulation needs more time to run, it has to be split into several parts/jobs. The first job is called the ''initial'' job, the others ''restart'' jobs. Together they form a so-called ''job chain''. Restart jobs require as input the values of all flow variables after the final time step of the previous job. They need to be output by the previous job in a so-called ''restart-file''.
      5 +
      6 + {{{palmrun}}} allows you to automatically generate job chains and to handle the restart files. Of course, automatic generation does not work if you run PALM in interactive mode. The following chapter describes
  3   7
  4   8   A job started by '''[../../app/jobcontrol mrun]''' will - according to its requested computing time, its memory size requirement and the number of necessary processing elements (on parallel computers) - be queued by the queuing-system of the local or remote computer into a suitable job class which fulfills these requirements. Each job class permits only jobs with certain maximum requirements (e.g. the allowed CPU time or the maximum number of cores that can be used by the job). The job classes are important for the scheduling process of the computer. Jobs with small requirements usually come to execution very fast, jobs with higher requirements must wait longer (sometimes several days).\\\\
  …   …
 41  45   Therefore restart jobs can not only be started automatically through '''mrun''', but also manually by the user. This is necessary e.g. whenever after the end of a job chain it is decided that the simulation must be continued further, because the phenomenon which should be examined did not reach the desired state yet. In such cases the '''mrun''' options completely correspond to those of the initial call; simply the {{{"#"}}} characters in the arguments of options {{{-r}}}, {{{-i}}} and {{{-o}}} must be replaced by {{{"f"}}}.\\\\
 42  46
 43     - = Handling of large binary restart- or output-files =
     47 + = Handling of large (restart) files =
 44  48
 45  49   In case of very large files, the copy of data from and to '''mrun's''' temporary working directory may need a long time. The CPU cores requested for the job run idle during that time and may consume significant amount of the job time without doing anything. The time required for copying can be spared by using a file link instead of copying the data.
  …   …
 69  73   }}}
 70  74   Such fast file systems are generally not allowed to store files for a longer time, so the user has to take care for archiving himself.
 71     -
 72     - '''Attention:'''\\
 73     - The {{{ln}}} file attribute and the above described method for storing large binary files has been introduced with revision number r2262. {{{mrun}}} does not create empty files/directories for restart files in the folder given with the file connection statement any more, as it was done with feature "fl" before.
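
The unmodified paragraph at v11 line 41 / v12 line 45 says that a manual restart call equals the initial call, except that the {{{"#"}}} characters in the arguments of {{{-r}}}, {{{-i}}} and {{{-o}}} become {{{"f"}}}. A minimal sketch of that rewrite, assuming the call is available as a single command-line string, might look as follows. The helper name `to_restart_call` and the argument values used with it are hypothetical and are not taken from the PALM documentation; only the three option names and the "#" to "f" rule come from the text.

```python
import shlex

# Options whose arguments carry the "#" marker, as named in the wiki text.
MARKED_OPTIONS = {"-r", "-i", "-o"}


def to_restart_call(initial_call: str) -> str:
    """Derive a manual restart call from an initial call (hypothetical helper).

    Only the token that directly follows -r, -i or -o is changed: every "#"
    in it becomes "f". All other tokens are passed through unchanged.
    """
    tokens = shlex.split(initial_call)
    result = []
    previous = None
    for token in tokens:
        if previous in MARKED_OPTIONS:
            token = token.replace("#", "f")
        result.append(token)
        previous = token
    # shlex.join re-quotes tokens that contain spaces or shell metacharacters.
    return shlex.join(result)
```

Calling `to_restart_call('mrun -d example -r "d3# restart"')` with this illustrative (not documented) argument string rewrites only the argument of `-r`; a "#" appearing in the argument of any other option is left alone.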