= PALM coding rules =

\\
= (0) Why to follow some standards? =
Because everyone has her/his own programming style, sort of a dialect, making it difficult for other developers to understand, extend, debug, or optimize the code. So what do we do? We learn and apply the coding standard to make PALM more easily readable for all current and future developers. Let's all work on that together. We are aware that the PALM core doesn't completely comply with the following rules yet, but we are working on that. 
\\\\
Formulated rules are based on the specifications given for the ocean dynamics model [https://www.nemo-ocean.eu/ NEMO].
Other work that influenced the development of this standard are [http://www.cesm.ucar.edu/models/ccsm4.0/ Community Climate System Model] [http://www.cesm.ucar.edu//working_groups/Software/dev_guide/dev_guide/ Software Developer's Guide], and [attachment:european_standards-1.pdf European Standards For Writing and Documenting Exchangeable Fortran 90 Code].

\\
= (1) General rules =

In all code files, independent of the programming language used:

* Use '''English''' language only
* Use '''ASCII''' characters only
* Use '''SI units''' for physical quantities
* '''NO''' use of tab characters (only exception: Makefile), because depending on the text editor, the code indentations might be corrupted.


= (2) Python =

For Python code, we follow [https://www.python.org/dev/peps/pep-0008 PEP 8].


= (3) Fortran =

== (3.1) General hints & requirements ==

=== (3.1.1) Language ===

* '''Fortran-2003''' standard (all Fortran compilers can handle this, Fortran-2008 standard not yet supported by all compilers)
* Line-length limit 
  * '''100 characters (soft limit)''' for optimal readability
  * '''132 characters (hard limit)''' absolute limit for most compilers!
|| '''AVOID''' || '''USE INSTEAD''' ||
|| COMMON blocks || ... ||
{{{#!comment
Examples for Fortran 95 are:
        ◦ COMMON blocks - use the declaration part of MODULEs instead.
        ◦ EQUIVALENCE - use POINTERs or derived data types instead to form data structures. Please try to avoid this anyway as it is usually a source of bugs.
        ◦ Assigned and computed GOTOs - use the CASE construct instead.
        ◦ Arithmetic IF statements - use the block IF, ELSE, ELSEIF, ENDIF or SELECT CASE construct instead.
        ◦ Labeled DO constructs - use unlabeled ENDDO instead. Nevertheless non-number label can be used for big iterative loop of recurcive algorithm.
        ◦ I/O routines END and ERR - use IOSTAT instead (the use is somehow restricted due to compiler implementation dependent error numbering). (This is not yet done throughout the PALM code  action!)
        ◦ FORMAT statements: use character parameters or explicit format- specifiers inside the READ or WRITE statement instead. (This is not yet done throughout the PALM code  action!)
        ◦ GOTO and CONTINUE statement - use IF, CASE, DO WHILE, EXIT or CYCLE statements or a contained SUBROUTINE instead. If you feel you cannot avoid a GOTO and or CONTINUE statement, then add a clear comment to explain what is going on and why you need to use GOTO.
        ◦ PAUSE – just never use it.
        ◦ ENTRY statements: a subprogram must only have one entry point.
        ◦ RETURN – it is obsolete and so not necessary at the end of program units.
        ◦ Fixed source form – use free form instead.
        ◦ Avoid functions with side effects. There are good reasons to avoid this. First, the code is easier to understand, if you can rely on the rule that functions don't change their arguments, second, some compilers generate more efficient code for PURE (in Fortran 2003 there are the attributes PURE and ELEMENTAL) functions, because they can store the arguments in different places. This is especially important on massive parallel and on vector machines as well.
        ◦ DATA and BLOCK DATA - initialisers in Fortran 2003 give this functionality.
}}}

=== (3.1.2) Implementing new features to PALM ===
(will follow)
* module structure (attach a template or link to template in repo)
* available interfaces in the PALM core


== (3.2) Coding ==

=== (3.2.1) Code structure ===
* One MODULE per file (only exception: modules.f90)
* Clarify program entities, i.e. use {{{SUBROUTINE <name> ... END SUBROUTINE <name>}}}, same holds for INTERFACE, MODULE, PROGRAM statements

=== (3.2.2) Variable & parameter declarations ===
* Clear structure in declaration part, in this order: USE --> IMPLICIT NONE --> declarations --> SAVE --> PRIVATE --> PUBLIC list_of_public_variables
* '''IMPLICIT NONE''' | All subroutines and functions must include this statement, i.e all variables must be explicitly typed. It also allows the compiler to detect typographical errors in variable names. For MODULEs, one IMPLICIT NONE statement in the modules definition section is sufficient.
* '''PARAMETER declaration''' | Variables used as constants should be declared with attribute PARAMETER and used always without copying to local variables. This prevents from using different values for the same constant or changing them accidentally.
* '''DIMENSION statement''' | or attribute is required in declaration statements.
* ''' :: notation''' | is quite useful to show that this program unit declaration part is written in standard Fortran syntax, even if there are no attributes to clarify the declaration section. Always use the notation <blank>::<two blanks> to improve readability.
* '''LEN specifier''' | Declare the length of a character variable using the CHARACTER (LEN=xxx) syntax - the LEN specifier is important because it is possible to have several kinds for characters (e.g. Unicode using two bytes per character, or there might be a different kind for Japanese e.g. NEC).
* '''Precision''' | Parameters and variables should not rely on vendor-supplied flags to supply a default floating point precision or integer size. The F2003 KIND feature should be used instead. Always use the default REAL- (wp) and INTEGER- (iwp) working precision in the respective declaration statements:
{{{
INTEGER(iwp) ::  counter

REAL(wp)     ::  vara
}}}
  The working precisions are set in module file mod_kinds. The defaults are 8 Byte REAL (double precision) and 4 Byte INTEGER (single precision). Only   set distinct precisions if they are absolutely required, e.g. for INTEGER counters which may become larger than 232-1 or for REAL quantities which are required in single precision:
  {{{
  INTEGER(idp) ::  particle_id   !< particle counter

  REAL(sp), DIMENSION(:,:,:), ALLOCATABLE ::  lpf  !< array for NetCDF output
  }}}
* '''INTENT clause''' | All dummy arguments must include the INTENT clause in their declaration. This is extremely valuable to someone reading the code, and can be checked by compilers. A common mistake is to put the wrong type of variable in a routine call. So, develop the habit of checking types of arguments in parameter lists. Many modern compilers, especially for Fortran 2003, check for consistent use within a file or across files using inter-procedural analysis. Compilers for Fortran 2003 will also flag up errors at link time if there are explicit or implicit interfaces.
* '''PRIVATE attribute''' | Modules variables and routines should be encapsulated by using the PRIVATE attribute. What shall be used outside the module can be declared PUBLIC instead. Use USE with the ONLY attribute to specify which of the variables, type definitions etc. defined in a module are to be made available to the using routine. Of course you do not need to add the ONLY attribute if you include all or nearly all public declarations of a module.
* '''Data initialization''' | Improper data initialization is another common source of errors. A variable could contain an initial value you did not expect. This can happen for several reasons, e.g. the variable has never been assigned a value, its value is outdated, memory has been allocated for a pointer but you have forgotten to initialize the variable pointed to. Some compilers initialize variables to zero but when you port your code to another computer that does not do this previously working code will no longer work this can take some time to diagnose and longer to resolve. To avoid such mishaps, '''initialize''' variables as close as possible to where they are first used. If possible, give a default initial value in the declaration statement.
* '''Constants and magic numbers''' | Magic numbers should be avoided. Physical constants (e.g. pi, gas constants) must never be hardwired into the executable portion of a code. Instead, a mnemonically named variable or parameter should be set to the appropriate value, probably in the setup routine for the package. We realize that many parametrizations rely on empirically derived constants or fudge factors, which are not easy to name. In these cases it is not forbidden to leave such factors coded as magic numbers buried in executable code, but comments should be included referring to the source of the empirical formula. Hard-coded numbers should never be passed through argument lists. One good reason for this rule is that a compiler flag, which defines a default precision for constants, cannot be guaranteed. Fortran 2003 allows specification of the precision of constants through the "_" compile-time operator (e.g. 3.14_dp or 365_i8). So if you insist on passing a constant through an argument list, you must also include a precision specification in the calling routine. If this is not done, a called routine that declares the resulting dummy argument as, say, real(dp) or 8 bytes, will produce erroneous results if the default floating point precision is 4 byte.
* '''INTERFACE blocks''' | Explicit interface blocks are required between routines if optional or keyword arguments are to be used. They also allow the compiler to check that the type, shape and number of arguments specified in the CALL are the same as those specified in the subprogram itself. Fortran 2003 compilers can automatically provide explicit INTERFACE blocks for routines contained in a MODULE.

=== (3.2.3) Pre-processor directives ===
PALM works with the C Pre-Processor (CPP), available on any UNIX platform, and covered by most Fortran compilers. The use of pre-processor directives inside the code allow to specifically exclude or include parts of the code for compilation.
Only few pre-processor directives are used in PALM ([wiki:doc/app/cpp_options list of directives]), and activated by the {{{%cpp_options}}} variable in the .palm.config.<configuration_identifier> file.

In the code, use this syntax (starting at first character of a line):\\
{{{
#if defined( __parallel )
    some code
#endif
}}}
together with the standard logical operators {{{!}}} (instead of .NOT.), {{{||}}} (instead of .OR.), {{{&&}}} (instead of .AND.), e.g.
{{{
#if ! defined( __parallel ) && ( __netcdf )
}}}

=== (3.2.4) Allowed operators ===
* Use /=, <, <=, ==, >, >=, etc. as relational operators instead of .GE., .LT., etc.
* Use .AND., .OR., .NOT. as logical operators

=== (3.2.5) Error messages ===
(will follow)
* Use message routine (explain parameters here...)
* I/O error conditions via IOSTAT (is this fail-safe for different compilers?)

=== (3.2.6) Code optimization ===
(will follow)

\\
== (3.3) Documenting & commenting ==
{{{
#!div style="align:'left'; width: 450px; border: 0px solid; float:right"
[[Image(rules_header.png, 450px, right, margin-right=2, margin-top=0, border=2)]]\\
'''Fig. 1''' PALM source-code header example. Click to enlarge.
}}}
Documentation consists of putting information both internally and externally of the source code. Comments should give a good idea of what the code does and where to look for any special activity. PALM supports the use of [wiki:doc/tec/doxygen Doxygen], a tool for generating documentation and flow charts from annotated source code. This requires some special formatting, as described below.

=== (3.3.1) External documentation ===
You are in the middle of it. We have an extensive online documentation wiki embedded into a [https://trac.edgewall.org/ trac] project management system, directly connected to the svn repository, allowing to [browser browse the PALM code] @ all its former and of course current revision in a web-based environment.\\\\
Your carefully developed code can only be used, if there is a documentation that tells the PALM user how to use and steer the feature. All pages can be accessed from the main [wiki:doc Documentation] page (see index on the left). There are pages for [wiki:doc#Modeldescription model/code description], and [wiki:doc#Usermanual user-manual] pages that explain about model steering, data analysis and debugging. Have a look at and discuss with us where your documentation material fits in best.

=== (3.3.2) Internal documentation ===
'''File header section'''
(see Fig. 1)
a. Doxygen command for Fortran file name ('''!<''' starts a doxygen command line)
b. PALM license section (you may have to add other licenses if allowedly implementing code from other models)
c. Revision comments (see [wiki:doc/tec/developerrules/svnbranch working with svn branches] and [wiki:doc/tec/developerrules committing to trunk])
d. Doxygen command for listing authors of the module
e. Module description (give a brief description of about the modules purpose)
f. Another set of doxygen commands for listing todos, notes and know bugs

'''Description of variables'''
(one variable per declaration statement!)
* Short description for each declared variable, comments should be aligned per declaration block
* At minimum, 2 whitespace before '''!<'''
* '''!<''' character introduces doxygen comment
{{{
LOGICAL ::  conserve_water_content = .TRUE.  !< open or closed bottom surface for the soil model
LOGICAL ::  constant_roughness = .FALSE.     !< use fixed/dynamic roughness lengths for water surfaces

REAL(wp) ::  c_surface = 20000.0_wp                !< Surface (skin) heat capacity (J m-2 K-1)
REAL(wp) ::  deep_soil_temperature = 9999999.9_wp  !< Deep soil temperature (K)
}}}
* If comment reaches over 132 characters (hard limit), break comment into (max.) 2 lines
{{{
LOGICAL ::  conserve_water_content = .TRUE.  !< open or closed bottom surface for the soil model
                                             !< in the land-surface model
}}}

'''Commenting of code blocks'''\\
* Carefully comment each block of code
* Any violations of good programming style should be justified with a comment. This will ensure that any well-intentioned programmer does not break your code by changing the source to implement a “better” style. 
* Comments are especially helpful at the end of long or nested loops, e.g. comment end of each loop to mark which loop has ended and thus ease the clarity of the code.
* Lines before a control structure, e.g. IF, CASE, DO, etc., are a natural spot to comment and explain what these constructs are about to do.
* Comments '''always''' start with
{{{
!
!--
}}}
* Comment text always starts at same position as the indented code to be described, and break text into several lines if '''> 100 characters''' are reached
{{{
       DO  j = nys, nyn
!
!--       Tendency terms for u-velocity component. Please note, in case of 
!--       non-cyclic boundary conditions the grid point i=0 is excluded from
!--       the prognostic equations for the u-component.    
          IF ( i >= nxlu )  THEN

             tend(:,j,i) = 0.0_wp
             IF ( timestep_scheme(1:5) == 'runge' )  THEN
}}}
* Avoid superfluous comments like
{{{
!
!-- Loop over all grid indices
    DO  i = nxl, nxr
       DO  j = nys, nyn
}}}

\\
== (3.4) Naming conventions ==
=== (3.4.1) Use of lower & upper case letters ===
* '''Upper case:''' Fortran keywords and intrinsic functions or routines, e.g.
  * SUBROUTINE, MODULE, etc.
  * INTEGER, REAL, PARAMETER, etc.
  * DO, ENDDO, IF, ELSE, etc.
  * READ, WRITE, CALL, etc.
  * MPI_ALLTOALL (MPI functions), NF90_CREATE (NetCDF functions), etc.\\\\
* '''Lower case:''' Everything else!

=== (3.4.2) Names for routines and variables ===
Use clear, unambiguous naming in '''lower-case letters''', with  individual words separated by underscore.
* MODULE/SUBROUTINE: name is constructed (if applicable) as verb followed by object, e.g.\\
  {{{land_surface_model}}} or {{{read_restart_data}}}
* MODULE/SUBROUTINE files: <filename>.f90 --> MODULE <filename> ... END MODULE <filename> (simplification of Make process)
* Functions: names provide information about the value it is expected to return, e.g.\\
  {{{solar_zenith_angle( )}}}
* Variables & constants: names are readable, memorable and descriptive
* LOGICAL (boolean) variables: give names that imply\\
  {{{TRUE}}} or {{{FALSE}}}

'''PROHIBITED:'''
* Names that may clash with the operating system or language intrinsic, e.g.\\
  {{{read( )}}} or {{{access( )}}}
* One-letter variable names (only allowed for basic flow variables like velocities (u, v, w), humidity (q) or turbulent kinetic energy (e), as well as loop or other counters (e.g. i, j, k))

\\
== (3.5) Formatting & sorting ==
[[NoteBox(warn, Line length limit: '''100''' characters ('''soft limit''')\, '''132''' characters ('''hard limit'''))]]

=== (3.5.1) Indentation, spaces & line breaks ===
'''__General module/subroutine structure'''
(see Fig. 2)
{{{
#!div style="align:'left'; width: 450px; border: 0px solid; float:right"
[[Image(rules_indent_general.png, 450px, right, margin-right=2, margin-top=0, border=2)]]\\
'''Fig. 2''' Indentation example with highlighted whitespaces. Click to enlarge.
}}}
* '''0 whitespace''' in front of pre-processor directives
* '''1 whitespace''' before {{{MODULE}}}, {{{CONTAINS}}}, {{{SUBROUTINE}}} (= first indentation level)
* '''+3 whitespaces''' for all following indentation levels \\(only exception: {{{ONLY}}} list in {{{USE}}} statements '''+4 whitespaces''')\\\\
* '''1 whitespace''' between individual strings, and between strings and symbols/numbers
* '''1 whitespace''' before and after all operators (+, -, *, =, /)\\(exception: '''0 whitespace''' for {{{**}}} (exponent operator, e.g. {{{u**2}}}) and operators within expressions for array indices, e.g. {{{u(i+1)}}})
* '''1 whitespace''' after {{{,}}}\\(only exception: '''0 whitespace''' between array dimensions, e.g. {{{(i,j,k)}}})
* '''1 whitespace''' before {{{::}}} \\(at minimum, see Sect. [#align Alignment])
* '''2 whitespace''' after {{{:}}} and {{{::}}}\\\\
* '''1 blank line''' between enclosed/unrelated blocks of instructions, assignments, clauses, statements, etc.
* '''2 blank lines''' in front of each {{{SUBROUTINE}}} within a MODULE

'''__Whitespaces between brackets'''
(see Fig. 2)
* '''0 whitespace''' between variable/function/declaration and {{{(}}} , e.g.
  * INTEGER, REAL, PARAMETER, etc.
  * READ, WRITE, ALLOCATE, etc.
* '''1 whitespace''' after {{{(}}} and before {{{)}}}\\(only exception: '''0 whitespace''' for brackets containing array indices)

\\\\\\\\\\\\\\\\\\\\\\\\
'''__Whitespaces in DO loops, IF blocks, CASE structures'''
(see Fig. 3)
{{{
#!div style="align:'left'; width: 450px; border: 0px solid; float:right"
[[Image(rules_indent_loops.png, 450px, right, margin-right=2, margin-top=0, border=2)]]\\
'''Fig. 3''' Indentation & whitespaces in loops. Click to enlarge.
}}}
* In general: '''1 whitespace''' everywhere and '''3 whitespace''' for each indentation level
* But: '''2 whitespace'''\\
  * between {{{DO}}} and '''loop index'''
  * after {{{IF (  )}}} --> see {{{CALL}}}, {{{THEN}}}
  * in front of and after logical operators (e.g. {{{.OR.}}}, see [#operators list of allowed operators])


\\\\\\\\\\
=== (3.5.2) Alignment ===
(see Fig. 4)
{{{
#!div style="align:'left'; width: 450px; border: 0px solid; float:right"
[[Image(rules_align.png, 450px, right, margin-right=2, margin-top=0, border=2)]]\\
'''Fig. 4''' Alignment example. Click to enlarge.
}}}
* Block-wise alignment of continuation line mark {{{&}}}
* Alignment of {{{ONLY}}} lists
* At least block-wise alignment for same type/group of declaration statement
* Alignment of {{{message_string}}} values
* Alignment of expressions between brackets (e.g. in {{{IF (  )  THEN}}} or in argument list of subroutine calls)
* Alignment of related code, e.g. in complex equations or setting of initial values for variables

=== (3.5.3) Alphabetical sorting ===
* Members in {{{ONLY}}} lists of USE statements (see e.g. Fig. 4)
* Parameters in Fortran {{{NAMELIST}}} lists (see e.g. initialization_parameters namelist in parin.f90)
* Declaration types (first {{{CHARACTER}}}, then {{{INTEGER}}}, etc., see Fig. 4)
* Variables in declaration statements (see Fig. 4)

=== (3.5.4) Handling of lists of variables ===
* In namelists, one variable per line:
{{{
NAMELIST /abcde/  var_a,  &
                  var_b,  &
                  ...
}}}
* In argument lists, consecutive up to the allowed line width:
{{{
SUBROUTINE sub_a( var_a, var_b, ... )
}}}



\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
== (3.6) Final steps ==
=== Good practice ===
* No warning/error message should remain during compile (also try debug options)

=== Clean up ===
* Remove PRINT/WRITE statements for used for debugging
* Check that all parameters are used
* Check that you have applied all the above listed rules