Developers:Coding standards

From OctopusWiki
Jump to: navigation, search

In all contributions to the code, please follow the spirit of these guidelines as much as possible, and try to rectify violations you come across. If you think there is a good reason to have an exception to these rules in some specific part of the code, please discuss it with the other developers first.


Guidelines for Committing

  1. Do not commit a line longer than 132 characters, which may cause trouble for some compilers. This constraint is automatically enforced by an svn pre-commit hook. Also no line after pre-processing may exceed that length. SAFE_ALLOCATE is the typical problem in this regard. Use shorter variables for the dimensions if the line is too long.
  2. Always check over your changes with svn status and svn diff to be sure that you really mean to change what is there. Avoid accidentally committing changes. Remember that you can commit (or diff) files or directories individually, in case you have changes in your working directory which you do not wish to commit.
  3. Be descriptive and verbose in your commit messages. Ideally, for all but the most trivial changes, there should be a paragraph explaining the goal of the changes and how they accomplish it. Follow the style of other commits. This allows others to follow along with what you are up to and help find any problems, and indeed will be helpful for you looking back at your changes in a few months. Make comments covering all of the changes in your commit. Use formatting that will help your message be read in Trac, such as {{{ }}} for pre-formatted code blocks, rXXX or [XXX] to refer to svn revision #XXX, etc. (http://trac.edgewall.org/wiki/WikiFormatting).
  4. Correct spelling is required!
  5. Make many small independent commits rather than a smaller number of large ones, which makes it easier for others to see what you are doing and what the origin of any problem that arises might be (especially with the buildbot and occasionally manually searching for the revision where a certain problem arose). If you find yourself making extensive changes about different issues that will result in a large and complicated commit, stop and make changes about just one of the issues in a separate checked version, and commit that, before continuing with the other issues.
  6. Retain the history of files as much as possible in the svn repository: to rename a file, use svn mv rather than svn add and svn rm; to split a file or use one as a template for a new one, use svn cp.
  7. Copy or repeat code as little as possible as it makes the code harder to understand and maintain. You can generally reuse the same code with if conditions or pulling out the part you want to use again as a subroutine.
  8. Don't change indentation at the same time that you make meaningful changes, or else it difficult to see what meaningful changes were actually made. Use two separate commits for clarity.
  9. If you find your local version to be in conflict, be very careful about resolving it to be sure that you are not undoing other recent work in the repository. If you have trouble or are unsure how to proceed, ask for help.
  10. If you introduce verbatim code from some external source, document carefully in the code itself and in your commit message where it came from and what its licensing provisions are (and how they are compatible with our GPL v2 license -- https://www.gnu.org/philosophy/license-list.html#GPLCompatibleLicenses). If you import a whole directory, put it in external_libs, and document the origin and license in a README.octopus. Remove any svn 'Id:' lines in the imported code that refer to its repository not ours.
  11. If you make a significant change to a file, make sure the copyright notice at the beginning of the file is up to date (i.e., it includes your name and the name of any other person that contributed to the changes you are commiting).
  12. If you find that your commit has caused a problem in compilation or test failures that you cannot solve promptly, revert it, via svn merge -c -REV . (where REV is the offending revision number). Find the solution and then commit the corrected version of your change later.
  13. Files that are generated by compilation and installation should be listed in svn:ignore so that they do not show up with svn status. Set this on the containing directory via svn propset svn:ignore DIRNAME.
  14. When you commit a bugfix, clearly identify it by beginning the commit message with "Bugfix:". State when the bug was introduced (if you are able to determine that) and what situations it would affect, and how your change rectifies the situation. Whenever possible, make sure that there is a test or test match that ensures the correct behavior.
  15. Backport bugfixes to the current release branch (e.g. 5.0.x). You can use scripts/backport.sh for this purpose, from a checked out copy of the release branch. Wait for the buildslaves to test the code before you backport to make sure you are not introducing problems into the release branch, which should compile and pass the tests on every revision. If the backport cannot apply to a file due to its having been renamed, you can download the changeset as a diff from Trac and try manually editing and applying with patch.
  16. Remember to add new files to the relevant Makefile.am, including testsuite .test and .inp files.
  17. Pay attention to compiler warnings, at compile time and run time, on your own machine and from the build slaves, and try to fix them when possible: they often reveal a latent problem, even if tests are not failing.
  18. When you implement a method or algorithm, be sure to provide a citation in a comment (to papers, websites, etc.) and in the variable documentation when possible.
  19. You don't necessary have to cover every possible situation when you write some new code, but you must note and enforce whatever assumptions you make, so it is clear to others (and yourself next year!) what cases are not implemented.

Special Octopus infrastructure

When you add new input variables, be sure to write a variable description.

  • You can use HTML formatting such as <i>, <tt>, <a href=...>.
  • Enclose LaTeX math in <math> .. </math> tags, which will be parsed by MathJax (http://docs.mathjax.org/en/latest/tex.html).
  • Specify a Section (and subsection(s) if appropriate) to organize the variable list.
  • Types: float, integer, flag (means binary bits summed), string, logical, block.
  • Listing Option x automatically creates an integer constant OPTION_X usable in Fortran with the corresponding value.
  • For a flag type, the option values can be specified as bit(0), bit(1), etc. to generate the series of powers of 2.
  • List the default in the Default field if it can be stated simply; otherwise, describe it in the Description block.
  • Synonyms: an option without a corresponding description will not appear in the variable reference, but will still be understood by the parser. This can be used to leave an obsolete option for backward compatibility.

General form:

   !%Variable NAME
   !%Type TYPE
   !%Default VALUE
   !%Section SECTION[::SUBSECTION]
   !%Description
   !% DESCRIPTION TEXT
   !%Option OPTION_NAME [OPTION_VALUE]
   !% OPTION_DESCRIPTION
   !%Option OPTION_NAME2 [OPTION_VALUE2]
   !% OPTION_DESCRIPTION2
   !%End

Example, with following call to parser:

   !%Variable ExcessCharge
   !%Type float
   !%Default 0.0
   !%Section States
   !%Description
   !% The net charge of the system. A negative value means that we are adding
   !% electrons, while a positive value means we are taking electrons
   !% from the system.
   !%End
   call parse_variable('ExcessCharge', M_ZERO, excess_charge)
  1. References should be in APS style, including formatting tags: Author1, Author2, et al., J. Short Title Vol, page (year).
  2. Parts of the code that are new and not fully tested, or old and known to have problems, should be marked with calls to messages_experimental, and there should also be a corresponding comment in the variable documentation to identify what is experimental for the users. Users will be pointed to the page Experimental Features by resulting warnings.
  3. If you remove a feature or any significant block of meaningful code, list that fact here on the wiki: Developers:Removed Features
  4. If you make a big change regarding backwards compatibility, list that here on the wiki: Developers:Big changes
  5. Character variables representing a filename should be declared with length MAX_PATH_LEN.
  6. If you rename, remove, or replace a variable, add a call to messages_obsolete_variable to help users with migration.
  7. Use constant MAX_DIM in static allocation rather than assuming a maximum dimensionality. However SAFE_ALLOCATE should always use to the actual dimension.
  8. All routines, functions, interfaces, constants should be in a module.

Unless otherwise specified, the rules below apply to Fortran source. Some can be applied to C source as well.

Required Elements

  1. Every subroutine, function, module, and program must begin with implicit none and private (for default scope). Module routines should not use any modules or contain implicit none, which should be at module scope instead. Modules containing only public parameters, types, and/or interfaces may have default public scope. Modules should list what ought to be publicly accessible via public ::, but should not list anything explicitly as private since this should be the default. All interfaces should contain implicit none.
  2. All source files must have the line #include "global.h", and include the module global_m.
  3. Every subroutine and function must have PUSH_SUB(routine_name) as the first executable line (i.e. the earliest position that will compile, after all the variable declarations) and have POP_SUB(routine_name) as the last line before any return statements. A subroutine contained within another should use this style: PUSH_SUB(routine_name.contained_name). If there are multiple possible exit points from a routine, each must have a POP_SUB. Exceptions:
    1. the push_sub and pop_sub routines themselves and any routines they use. Add the comment
       ! this is called by push/pop so there cannot be a push/pop in this routine
    2. any routine called before the debug level has been set;
    3. certain routines that are called so many times as to swamp the debugging output. Add the comment
       ! no push/pop since called too frequently
    4. elemental or pure routines, where addition of PUSH_SUB/POP_SUB will not compile;
    5. routines that are supposed to be threadsafe -- add the comment
       ! no push_sub, threadsafe
  4. Do not use return for subroutines that just end at the end of the code.
  5. Each file should begin with a license header of this form:
     !! Copyright (C) 2004-2012 Xavier Andrade, Eugene S. Kadantsev (ekadants@mjs1.phy.queensu.ca), David Strubbe
     !!
     !! This program is free software; you can redistribute it and/or modify
     !! it under the terms of the GNU General Public License as published by
     !! the Free Software Foundation; either version 2, or (at your option)
     !! any later version.
     !!
     !! This program is distributed in the hope that it will be useful,
     !! but WITHOUT ANY WARRANTY; without even the implied warranty of
     !! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     !! GNU General Public License for more details.
     !!
     !! You should have received a copy of the GNU General Public License
     !! along with this program; if not, write to the Free Software
     !! Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
     !! 02110-1301, USA.
     !!
     !! $Id$
    

    When you add the file, set the 'Id' SVN property via svn propset svn:keywords 'Id' <filename>, which will make the last line get resolved by SVN to something like:

     !! $Id: sternheimer.F90 12874 2015-02-06 22:50:29Z xavier $
  6. Every .F90 file should end with these comments that set the emacs mode:
     !! Local Variables:
     !! mode: f90
     !! coding: utf-8
     !! End:
    

Program Structure

  1. Octopus is written using the Fortran 95 standard. Fortran 2003/2008 features are allowed, provided that they are supported by all the compilers being used in the test farm.
  2. We use exactly two spaces for indentation, for the body of blocks such as if and do, and for continuation lines. Do not use tabs anywhere in source code. The largest block in the file (typically module, program, or subroutine) should begin in column zero. Preprocessor directives (e.g. #ifdef, #include) should be in column zero. The keyword contains should be at the same indentation as the beginning and end of the program unit. (See "Editor Configuration" below for help with this.)
  3. All named program blocks (e.g. program, subroutine, function, module, interface) should repeat the name in the end statement. e.g. module basic_oct_m ... end module basic_oct_m.
  4. Intrinsics and keywords should be written in lower case. Preprocessor macros and parameters should be written in all upper case.
  5. Write end do and end if not enddo and endif.
  6. Single if clauses are preferred, but only if they fit in a single line. If more than one line is required, use the <it>if</tt>, then, end if form. Using a continuation mark & between the condition and the statement is not allowed as this is misleading and can lead to errors when somebody assumes that the indentation keeps a statement inside the if, as it is done in some scripting languages.

Names

  1. No variable name may be a single letter, since this makes it very hard to search for where it is used or modified. See https://github.com/jornada/DOENDO if you need a quick tool to rename single-letter variables.
  2. Names of variables, arguments, functions, routines, modules, and programs should be descriptive and comprehensible, and if appropriate, comments should explain what they are.
  3. Do not use the same name for different entities.
  4. Files that are included must end with _inc.

Data Structures

  1. Types are named with the suffix _t.
  2. If it is possible and doesn't degrade performance, declare types as private and define access functions.
  3. Make _init and _end routines to allocate and initialize, and deallocate and clean up types.
  4. Do not write default initialization in the declaration of variables part of a module or type.

Declarations

  1. All subroutines and functions must list the argument types in the same order as in the call, using one line per argument, and declare an intent for each (in whenever possible). A blank line should separate the arguments from any subsequent declarations of local variables.
  2. Variable and argument declarations should use the separator ::, as in integer :: ii; they should all be aligned.
  3. Subroutines that follow a contains statement in a module or program should not specify the upper bound of array arguments, e.g. write gvec(:,:) !< (3, ng) in preference to gvec(3, ng). This avoids possible creation of array temporaries (wasting time and memory copying), and allows the array passed to the routine to have a larger leading dimension. However, if the lower bound is not 1, it must be specified explicitly.
  4. Calls that pass optional arguments should call them by keyword, i.e. only_root_writes = .true. which helps identify them as optional. A subroutine that has optional arguments must check for their presence before using them. Use of the functions optional_default is encouraged.
  5. Use Doxygen tags so that variable names and subroutine descriptions are automatically documented. See Developers:Doxygen_documentation.
  6. Use heap not stack for variables of non-fixed size, i.e. dynamic allocation with SAFE_ALLOCATE.
  7. Declare variables in double precision generally. Use single precision only in special circumstances.
  8. Variables must be initialized to zero after declaration or allocation if required; do not rely on the compiler for this.
  9. Use parameter variables in Fortran (or const in C) rather than preprocessor symbols wherever possible.
  10. Do not declare variables as pointer unless it is necessary.

Modules

  1. All subroutines must be in a module.
  2. An F90 file should have the same name as the module it contains.
  3. Every module must be named with the suffix _oct_m, e.g. global_oct_m, and be in a file without that suffix, e.g. global.f90. This naming scheme helps to keep the namespaces for modules, types, routines, etc. clearly separated.
  4. Use of only list for module usage is highly discouraged. Use of the functionality to change the names in an only list is prohibited, except for when handling a module from an external library in which case it is merely highly discouraged: change the name in our code instead.
  5. Module use statements should be in alphabetical order.
  6. Routines from external libraries should have interfaces defined in a module (ideally one provided by the library itself). This module must be used in any routines making the library calls.
  7. MPI calls from Fortran should use the global integer mpierr from module mpi_m for the (basically useless) last argument, rather than declaring its own local integer.
  8. All modules must have private as default scope. The only allowed exception are modules that provide an interface to an external library, in this case the module cannot use other modules that are already used somewhere else in Octopus.
  9. Modules can only declare as public objects that are defined or declared in that module.

Data types

  1. Use the constants defined in src/basic/global.F90 such as M_ONE and M_zI, or the macro CNST (e.g. CNST(0.1)), to make sure that constants are of the correct type.
  2. Use the variable types and other macros defined in src/include/real.F90 and complex.F90 for templated routines, and in preference to adding #ifdef if possible.
  3. Handle units of measurement through the unit_m module, which provides routines for conversion and output, and can be used with the parse_variable routine too for conversion of input quantities.

Output

  1. Correct spelling is required -- one would hope this went without saying!
  2. Warnings should be handled with call messages_warning.
  3. Errors should be handled with call messages_fatal. Do not use stop or exit.
  4. Use write rather than print, and supply a format rather than using *. In write statements, do not break up string literals with continuation lines, so that the text can be grep'd for in the code.

Comments

  1. Comments should be written in English.
  2. Correct spelling is required!
  3. Be sure to write explicit comments explaining the purpose and function of any unusual constructs introduced to the code, especially if they could look wrong or useless to others.
  4. If you comment out a piece of code, put a clear explanation about why it is commented out.
  5. To avoid preprocessor warnings (and XLF compilation failures), an odd number of single (i.e. apostrophes) or double quotes may not be used in comments (use ` instead).
  6. Do not write ???: it generates a warning from cpp about "trigraphs".
  7. Do not mark blank lines with !.

Required Macros and Routines

  1. The intrinsics allocate and deallocate should never be used. Instead use the macros SAFE_ALLOCATE (including an explicit lower dimension so that our profiling is able to estimate the memory usage correctly) and SAFE_DEALLOCATE_A (for arrays) and SAFE_DEALLOCATE_P (for pointers). Danger: DO NOT use these macros in a one-line if statement; they must be used with if(...) then or else some of the generated lines are outside the condition.
  2. Use routine io_open instead of the intrinsic open when opening file units, and io_close instead of close, which will check that the operation occurs properly and write an error if not.

Forbidden

  1. Use of the goto and continue keywords, as well as the related err and end parameters of close, open, and read intrinsic functions, is forbidden. Use the cycle and exit keywords and the iostat parameter instead. Keywords sequence and common are also forbidden.
  2. Do not declare anything external. Use modules for functions and subroutines defined in this package, and interfaces for intrinsics and libraries.
  3. Do not assume any value for a loop counter outside its loop. This works in C, but not necessarily in Fortran. You must save the value in another variable.
  4. Continuation lines shall not begin with &.

Portability

  1. All code must work correctly at -O3 with gcc, PGI, and Intel compilers. If it does not, it likely contains an error even if it seems to work without optimization. Making sure it works at -O3 with other compilers too is strongly encouraged.
  2. Do not make explicit reference in source files to the compiler being used (except, possibly, for disabling optimization with a pragma).

Performance

  1. Always use BLAS if there is a suitable routine. But don't use BLAS if a vectorial expression or a single loop has to be replaced by several blas calls.

Scripts

  1. All scripts shall contain a "shebang" in the first line, e.g. /usr/bin/env bash, /usr/bin/env python, etc.

C code

  1. C code must be C99 compliant.
  2. You can use the inline and restrict modifiers; autoconf handles this in case they are not supported by the compiler.
  3. Declare non-modified arguments as const.

C++ code

  1. Only subset of C++ 11 is allowed, as long as it compiles with the IBM compiler. So avoid C++ 11 features, if possible.
  2. The use of header only code is encouraged.

Makefiles

  1. Lists of source files should be sorted alphabetically.
  2. An object file (.o.) should have dependencies on the main F90 source file it comes from, any F90 source files included by that one, and any object files in the same directory whose modules it uses. We recomend these dependencies be generated using make depend, which will call the makedepf90 tool.
  3. All parts of the code must build correctly with parallel make (make -j) with any number of threads. If that fails, there is a mistake, usually missing dependencies.
  4. Every target that is not the name of a file should be in the .PHONY list.

Other

  1. Use spaces around operators like =, +, etc.
  2. Do not declare variables as pointer unless really necessary.
  3. Test files (.test) should start with this line to enable proper syntax highlighting:
-*- coding: utf-8 mode: shell-script -*-
  1. Vector expressions should specify explicit bounds.
  2. (put my scripting stuff on wiki) -- perl.pl etc.
  3. rules for m4
  4. Do not put profiling calls inside OpenMP regions.

Editor Configuration

To configure your editor to use our indentation convention:

  • emacs : in menus, select Options->Advanced(Customize)->Emacs->Programming->Language. Click on F90, then on Indent. Now set all the numeric values here to 2, and Save and Done. Equivalently, paste the following lines to your ~/.emacs file:
(custom-set-variables
 '(f90-continuation-indent 2)
 '(f90-do-indent 2)
 '(f90-if-indent 2)
 '(f90-type-indent 2)
 '(vc-follow-symlinks nil))
(setq c-default-style "linux"
      c-basic-offset 2)

The last line also suppresses the warning emacs will give you when you open a symbolically linked file.

  • vi : paste the following lines to your ~/.vimrc file:
:let fortran_free_source=1
au BufNewFile,BufRead *.f90p,*.fp set filetype=fortran
au filetype fortran set expandtab softtabstop=2 shiftwidth=2 autoindent
set backspace=2

References