PALM coding rules


(0) Why to follow some standards?

Because everyone has her/his own programming style, sort of a dialect, making it difficult for other developers to understand, extend, debug, or optimize the code. So what do we do? We learn and apply the coding standard to make PALM more easily readable for all current and future developers. Let's all work on that together. We are aware that the PALM core doesn't completely comply with the following rules yet, but we are working on that.

Formulated rules are based on the specifications given for the ocean dynamics model NEMO. Other work that influenced the development of this standard are Community Climate System Model Software Developer's Guide, and European Standards For Writing and Documenting Exchangeable Fortran 90 Code.


(1) General rules

In all code files, independent of the programming language used:

  • Use English language only
  • Use ASCII characters only
  • Use SI units for physical quantities
  • NO use of tab characters (only exception: Makefile), because depending on the text editor, the code indentations might be corrupted.

(2) Python

For Python code, we follow PEP 8.

(3) Fortran

(3.1) General hints & requirements

(3.1.1) Language

  • Fortran-2003 standard (all Fortran compilers can handle this, Fortran-2008 standard not yet supported by all compilers)
  • Line-length limit
    • 100 characters (soft limit) for optimal readability
    • 132 characters (hard limit) absolute limit for most compilers!
AVOID USE INSTEAD
COMMON blocks ...

(3.1.2) Implementing new features to PALM

(will follow)

  • module structure (attach a template or link to template in repo)
  • available interfaces in the PALM core

(3.2) Coding

(3.2.1) Code structure

  • One MODULE per file (only exception: modules.f90)
  • Clarify program entities, i.e. use SUBROUTINE <name> ... END SUBROUTINE <name>, same holds for INTERFACE, MODULE, PROGRAM statements

(3.2.2) Variable & parameter declarations

  • Clear structure in declaration part, in this order: USE --> IMPLICIT NONE --> declarations --> SAVE --> PRIVATE --> PUBLIC list_of_public_variables
  • IMPLICIT NONE | All subroutines and functions must include this statement, i.e all variables must be explicitly typed. It also allows the compiler to detect typographical errors in variable names. For MODULEs, one IMPLICIT NONE statement in the modules definition section is sufficient.
  • PARAMETER declaration | Variables used as constants should be declared with attribute PARAMETER and used always without copying to local variables. This prevents from using different values for the same constant or changing them accidentally.
  • DIMENSION statement | or attribute is required in declaration statements.
  • :: notation | is quite useful to show that this program unit declaration part is written in standard Fortran syntax, even if there are no attributes to clarify the declaration section. Always use the notation <blank>::<two blanks> to improve readability.
  • LEN specifier | Declare the length of a character variable using the CHARACTER (LEN=xxx) syntax - the LEN specifier is important because it is possible to have several kinds for characters (e.g. Unicode using two bytes per character, or there might be a different kind for Japanese e.g. NEC).
  • Precision | Parameters and variables should not rely on vendor-supplied flags to supply a default floating point precision or integer size. The F2003 KIND feature should be used instead. Always use the default REAL- (wp) and INTEGER- (iwp) working precision in the respective declaration statements:
    INTEGER(iwp) ::  counter
    
    REAL(wp)     ::  vara
    
    The working precisions are set in module file mod_kinds. The defaults are 8 Byte REAL (double precision) and 4 Byte INTEGER (single precision). Only set distinct precisions if they are absolutely required, e.g. for INTEGER counters which may become larger than 232-1 or for REAL quantities which are required in single precision:
    INTEGER(idp) ::  particle_id   !< particle counter
    
    REAL(sp), DIMENSION(:,:,:), ALLOCATABLE ::  lpf  !< array for NetCDF output
    
  • INTENT clause | All dummy arguments must include the INTENT clause in their declaration. This is extremely valuable to someone reading the code, and can be checked by compilers. A common mistake is to put the wrong type of variable in a routine call. So, develop the habit of checking types of arguments in parameter lists. Many modern compilers, especially for Fortran 2003, check for consistent use within a file or across files using inter-procedural analysis. Compilers for Fortran 2003 will also flag up errors at link time if there are explicit or implicit interfaces.
  • PRIVATE attribute | Modules variables and routines should be encapsulated by using the PRIVATE attribute. What shall be used outside the module can be declared PUBLIC instead. Use USE with the ONLY attribute to specify which of the variables, type definitions etc. defined in a module are to be made available to the using routine. Of course you do not need to add the ONLY attribute if you include all or nearly all public declarations of a module.
  • Data initialization | Improper data initialization is another common source of errors. A variable could contain an initial value you did not expect. This can happen for several reasons, e.g. the variable has never been assigned a value, its value is outdated, memory has been allocated for a pointer but you have forgotten to initialize the variable pointed to. Some compilers initialize variables to zero but when you port your code to another computer that does not do this previously working code will no longer work this can take some time to diagnose and longer to resolve. To avoid such mishaps, initialize variables as close as possible to where they are first used. If possible, give a default initial value in the declaration statement.
  • Constants and magic numbers | Magic numbers should be avoided. Physical constants (e.g. pi, gas constants) must never be hardwired into the executable portion of a code. Instead, a mnemonically named variable or parameter should be set to the appropriate value, probably in the setup routine for the package. We realize that many parametrizations rely on empirically derived constants or fudge factors, which are not easy to name. In these cases it is not forbidden to leave such factors coded as magic numbers buried in executable code, but comments should be included referring to the source of the empirical formula. Hard-coded numbers should never be passed through argument lists. One good reason for this rule is that a compiler flag, which defines a default precision for constants, cannot be guaranteed. Fortran 2003 allows specification of the precision of constants through the "_" compile-time operator (e.g. 3.14_dp or 365_i8). So if you insist on passing a constant through an argument list, you must also include a precision specification in the calling routine. If this is not done, a called routine that declares the resulting dummy argument as, say, real(dp) or 8 bytes, will produce erroneous results if the default floating point precision is 4 byte.
  • INTERFACE blocks | Explicit interface blocks are required between routines if optional or keyword arguments are to be used. They also allow the compiler to check that the type, shape and number of arguments specified in the CALL are the same as those specified in the subprogram itself. Fortran 2003 compilers can automatically provide explicit INTERFACE blocks for routines contained in a MODULE.

(3.2.3) Pre-processor directives

PALM works with the C Pre-Processor (CPP), available on any UNIX platform, and covered by most Fortran compilers. The use of pre-processor directives inside the code allow to specifically exclude or include parts of the code for compilation. Only few pre-processor directives are used in PALM (list of directives), and activated by the %cpp_options variable in the .palm.config.<configuration_identifier> file.

In the code, use this syntax (starting at first character of a line):

#if defined( __parallel )
    some code
#endif

together with the standard logical operators ! (instead of .NOT.), || (instead of .OR.), && (instead of .AND.), e.g.

#if ! defined( __parallel ) && ( __netcdf )

(3.2.4) Allowed operators

  • Use /=, <, <=, ==, >, >=, etc. as relational operators instead of .GE., .LT., etc.
  • Use .AND., .OR., .NOT. as logical operators

(3.2.5) Error messages

(will follow)

  • Use message routine (explain parameters here...)
  • I/O error conditions via IOSTAT (is this fail-safe for different compilers?)

(3.2.6) Code optimization

(will follow)


(3.3) Documenting & commenting


Fig. 1 PALM source-code header example. Click to enlarge.

Documentation consists of putting information both internally and externally of the source code. Comments should give a good idea of what the code does and where to look for any special activity. PALM supports the use of Doxygen, a tool for generating documentation and flow charts from annotated source code. This requires some special formatting, as described below.

(3.3.1) External documentation

You are in the middle of it. We have an extensive online documentation wiki embedded into a trac project management system, directly connected to the svn repository, allowing to [browser browse the PALM code] @ all its former and of course current revision in a web-based environment.

Your carefully developed code can only be used, if there is a documentation that tells the PALM user how to use and steer the feature. All pages can be accessed from the main Documentation page (see index on the left). There are pages for model/code description, and user-manual pages that explain about model steering, data analysis and debugging. Have a look at and discuss with us where your documentation material fits in best.

(3.3.2) Internal documentation

File header section (see Fig. 1)

  1. Doxygen command for Fortran file name (!< starts a doxygen command line)
  2. PALM license section (you may have to add other licenses if allowedly implementing code from other models)
  3. Revision comments (see working with svn branches and committing to trunk)
  4. Doxygen command for listing authors of the module
  5. Module description (give a brief description of about the modules purpose)
  6. Another set of doxygen commands for listing todos, notes and know bugs

Description of variables (one variable per declaration statement!)

  • Short description for each declared variable, comments should be aligned per declaration block
  • At minimum, 2 whitespace before !<
  • !< character introduces doxygen comment
    LOGICAL ::  conserve_water_content = .TRUE.  !< open or closed bottom surface for the soil model
    LOGICAL ::  constant_roughness = .FALSE.     !< use fixed/dynamic roughness lengths for water surfaces
    
    REAL(wp) ::  c_surface = 20000.0_wp                !< Surface (skin) heat capacity (J m-2 K-1)
    REAL(wp) ::  deep_soil_temperature = 9999999.9_wp  !< Deep soil temperature (K)
    
  • If comment reaches over 132 characters (hard limit), break comment into (max.) 2 lines
    LOGICAL ::  conserve_water_content = .TRUE.  !< open or closed bottom surface for the soil model
                                                 !< in the land-surface model
    

Commenting of code blocks

  • Carefully comment each block of code
  • Any violations of good programming style should be justified with a comment. This will ensure that any well-intentioned programmer does not break your code by changing the source to implement a “better” style.
  • Comments are especially helpful at the end of long or nested loops, e.g. comment end of each loop to mark which loop has ended and thus ease the clarity of the code.
  • Lines before a control structure, e.g. IF, CASE, DO, etc., are a natural spot to comment and explain what these constructs are about to do.
  • Comments always start with
    !
    !--
    
  • Comment text always starts at same position as the indented code to be described, and break text into several lines if > 100 characters are reached
           DO  j = nys, nyn
    !
    !--       Tendency terms for u-velocity component. Please note, in case of 
    !--       non-cyclic boundary conditions the grid point i=0 is excluded from
    !--       the prognostic equations for the u-component.    
              IF ( i >= nxlu )  THEN
    
                 tend(:,j,i) = 0.0_wp
                 IF ( timestep_scheme(1:5) == 'runge' )  THEN
    
  • Avoid superfluous comments like
    !
    !-- Loop over all grid indices
        DO  i = nxl, nxr
           DO  j = nys, nyn
    


(3.4) Naming conventions

(3.4.1) Use of lower & upper case letters

  • Upper case: Fortran keywords and intrinsic functions or routines, e.g.
    • SUBROUTINE, MODULE, etc.
    • INTEGER, REAL, PARAMETER, etc.
    • DO, ENDDO, IF, ELSE, etc.
    • READ, WRITE, CALL, etc.
    • MPI_ALLTOALL (MPI functions), NF90_CREATE (NetCDF functions), etc.

  • Lower case: Everything else!

(3.4.2) Names for routines and variables

Use clear, unambiguous naming in lower-case letters, with individual words separated by underscore.

  • MODULE/SUBROUTINE: name is constructed (if applicable) as verb followed by object, e.g.
    land_surface_model or read_restart_data
  • MODULE/SUBROUTINE files: <filename>.f90 --> MODULE <filename> ... END MODULE <filename> (simplification of Make process)
  • Functions: names provide information about the value it is expected to return, e.g.
    solar_zenith_angle( )
  • Variables & constants: names are readable, memorable and descriptive
  • LOGICAL (boolean) variables: give names that imply
    TRUE or FALSE

PROHIBITED:

  • Names that may clash with the operating system or language intrinsic, e.g.
    read( ) or access( )
  • One-letter variable names (only allowed for basic flow variables like velocities (u, v, w), humidity (q) or turbulent kinetic energy (e), as well as loop or other counters (e.g. i, j, k))


(3.5) Formatting & sorting

Line length limit: 100 characters (soft limit), 132 characters (hard limit)

(3.5.1) Indentation, spaces & line breaks

General module/subroutine structure (see Fig. 2)


Fig. 2 Indentation example with highlighted whitespaces. Click to enlarge.

  • 0 whitespace in front of pre-processor directives
  • 1 whitespace before MODULE, CONTAINS, SUBROUTINE (= first indentation level)
  • +3 whitespaces for all following indentation levels
    (only exception: ONLY list in USE statements +4 whitespaces)

  • 1 whitespace between individual strings, and between strings and symbols/numbers
  • 1 whitespace before and after all operators (+, -, *, =, /)
    (exception: 0 whitespace for ** (exponent operator, e.g. u**2) and operators within expressions for array indices, e.g. u(i+1))
  • 1 whitespace after ,
    (only exception: 0 whitespace between array dimensions, e.g. (i,j,k))
  • 1 whitespace before ::
    (at minimum, see Sect. Alignment)
  • 2 whitespace after : and ::

  • 1 blank line between enclosed/unrelated blocks of instructions, assignments, clauses, statements, etc.
  • 2 blank lines in front of each SUBROUTINE within a MODULE

Whitespaces between brackets (see Fig. 2)

  • 0 whitespace between variable/function/declaration and ( , e.g.
    • INTEGER, REAL, PARAMETER, etc.
    • READ, WRITE, ALLOCATE, etc.
  • 1 whitespace after ( and before )
    (only exception: 0 whitespace for brackets containing array indices)













Whitespaces in DO loops, IF blocks, CASE structures (see Fig. 3)


Fig. 3 Indentation & whitespaces in loops. Click to enlarge.

  • In general: 1 whitespace everywhere and 3 whitespace for each indentation level
  • But: 2 whitespace
    • between DO and loop index
    • after IF ( ) --> see CALL, THEN
    • in front of and after logical operators (e.g. .OR., see list of allowed operators)






(3.5.2) Alignment

(see Fig. 4)


Fig. 4 Alignment example. Click to enlarge.

  • Block-wise alignment of continuation line mark &
  • Alignment of ONLY lists
  • At least block-wise alignment for same type/group of declaration statement
  • Alignment of message_string values
  • Alignment of expressions between brackets (e.g. in IF ( ) THEN or in argument list of subroutine calls)
  • Alignment of related code, e.g. in complex equations or setting of initial values for variables

(3.5.3) Alphabetical sorting

  • Members in ONLY lists of USE statements (see e.g. Fig. 4)
  • Parameters in Fortran NAMELIST lists (see e.g. initialization_parameters namelist in parin.f90)
  • Declaration types (first CHARACTER, then INTEGER, etc., see Fig. 4)
  • Variables in declaration statements (see Fig. 4)

(3.5.4) Handling of lists of variables

  • In namelists, one variable per line:
    NAMELIST /abcde/  var_a,  &
                      var_b,  &
                      ...
    
  • In argument lists, consecutive up to the allowed line width:
    SUBROUTINE sub_a( var_a, var_b, ... )
    


















(3.6) Final steps

Good practice

  • No warning/error message should remain during compile (also try debug options)

Clean up

  • Remove PRINT/WRITE statements for used for debugging
  • Check that all parameters are used
  • Check that you have applied all the above listed rules
Last modified 3 years ago Last modified on Jan 8, 2021 3:44:15 PM

Attachments (5)

Download all attachments as: .zip