Manual:Appendix:Porting Octopus and Platform Specific Instructions
This page contains information about Octopus portability, with specific instructions for compiling and running Octopus on many architectures. If you managed to compile Octopus for a different system, please contribute. Warning: this information is quite out of date and may no longer be valid.
- 1 General information and tips about compilers
- 2 SSE2 support
- 3 Operating systems
- 4 Compilers
- 4.1 Intel Compiler for x86/x86_64
- 4.2 Intel Compiler for Itanium
- 4.3 Open64
- 4.4 Pathscale Fortran Compiler
- 4.5 NAG compiler
- 4.6 GNU C Compiler (gcc)
- 4.7 GNU Fortran (gfortran)
- 4.8 g95
- 4.9 Portland 6
- 4.10 Portland 7, 8, 9
- 4.11 Portland 10
- 4.12 Portland 11
- 4.13 Portland 12
- 4.14 Absoft
- 4.15 Compaq compiler
- 4.16 Xlf
- 4.17 SGI MIPS
- 4.18 Sun Studio
- 5 MPI Implementations
- 6 NetCDF
- 7 BLAS and LAPACK
General information and tips about compilers
- Octopus is developed in the GNU environment and sometimes we depend on system features that are GNU extensions without knowing it. These are bugs and we will try to fix them; please report any problem that you find.
- If you have problems with the C compiler, try to use gcc. It normally works better than vendor compilers and it's available on most systems. However, be careful with locally installed versions of gcc: sometimes they don't work.
- The Fortran // concatenation operator is sometimes recognized as a C++-style comment and the preprocessor gives erroneous results: sometimes it doesn't expand macros after it or simply eliminates what comes after. To solve this problem, use the preprocessor with the -C (keep comments) and -ansi or equivalent options (in ANSI C, // is not a comment).
- If you are compiling on dual 32/64-bit architectures like PowerPC, UltraSparc or AMD64 systems, here are some tips:
- A 64-bit version of Octopus is only needed if you are going to use more than 2-3 GB of physical RAM.
- Some operating systems have 64-bit kernels and 32-bit userspace (Solaris, OS X); if you want a 64-bit Octopus there, you have to compile all required libraries in 64-bit (normally a 64-bit libc is available).
- Typically Linux distributions for AMD64 have a 64-bit userspace, so you will get a 64-bit executable there.
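The preprocessor tip in the list above can be tried on a single Fortran statement. This is a minimal sketch, assuming a GNU-compatible cpp is on the PATH; the sample statement and the variable names are made up for illustration, and -P (suppress linemarkers) is an extra assumption added only to keep the output readable.

```shell
# Hypothetical Fortran statement that uses the // concatenation operator.
src='name = "octo" // "pus"'

# Options from the tip above: -ansi (// is not a comment) and -C (keep comments).
# -P is an extra assumption that drops linemarkers from the output.
FCCPP="cpp -ansi -C -P"

if command -v cpp >/dev/null 2>&1; then
  # With these options the text after // should survive preprocessing.
  printf '%s\n' "$src" | $FCCPP 2>/dev/null || echo "cpp rejected these options"
else
  echo "cpp not found; skipping the check"
fi
```

Running the same pipe without -ansi is a quick way to see whether your preprocessor strips the text after //.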
SSE2 support
- We have some SSE2 code written using compiler primitives that can give an important increase in performance. This requires hardware support (AMD Opteron/Athlon 64/Sempron/Turion or Intel Pentium 4 or newer) and compiler support; the supported compilers are GCC and pathcc. For gcc you need to pass the correct -march flags for your processor.
- Besides this, for x86 (32 bits) you have to link dynamically because we have to use a tweaked malloc function that doesn't work with static linking. For x86_64 (64 bits) this is not needed.
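Whether the hardware requirement above is met can be checked before choosing compiler flags. The following is a rough sketch, assuming a Linux system that exposes CPU flags in /proc/cpuinfo; the function name has_sse2 is hypothetical and not part of Octopus.

```shell
# Return success if the given cpuinfo-style file lists the sse2 flag.
# The default path /proc/cpuinfo is Linux-specific.
has_sse2() {
  cpuinfo=${1:-/proc/cpuinfo}
  [ -r "$cpuinfo" ] && grep -qw sse2 "$cpuinfo"
}

if has_sse2; then
  echo "SSE2 reported by the CPU"
else
  echo "SSE2 not reported (or /proc/cpuinfo unavailable)"
fi
```

On other operating systems the CPU flags are exposed differently, so a negative answer here only means the check could not confirm SSE2.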
Operating systems
Linux
The main development operating system for Octopus.
Solaris
Octopus compiles correctly with either the Sun compilers or gcc/gfortran. By default Solaris doesn't have GNU coreutils, so some tests won't run.
Mac OS X
It works. Don't try to compile static binaries, they are not supported by the OS.
Toy operating systems are not supported for the moment, sorry.
Compilers
Intel Compiler for x86/x86_64
- status: ok
- Version 9 and version 7.1 Build 20040901Z are ok. Versions 8 and 8.1 can be problematic.
- Recommended flags: FCFLAGS="-u -zero -fpp1 -nbs -pc80 -pad -align -unroll -O3 -ip -tpp7 -xW"
- Intel artificially blocks their compilers from using certain optimizations on non-Intel processors.
- With Octopus 3.2.0, use of the flags -check all -traceback with ifort 10.1.018 will cause an internal compiler segmentation fault while compiling src/grid/mesh_init.F90.
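The recommended flags above are meant to reach the configure script through the environment. Here is a small sketch of that pattern, assuming the compiler command is ifort and that configure is run from the Octopus source directory; it only prints the command so that it is safe to paste anywhere.

```shell
# Compiler and flags taken from the recommendation above.
FC=ifort
FCFLAGS="-u -zero -fpp1 -nbs -pc80 -pad -align -unroll -O3 -ip -tpp7 -xW"
export FC FCFLAGS

# Print the command instead of running it.
echo "FC=$FC FCFLAGS=\"$FCFLAGS\" ./configure"
```

The same pattern should apply to the FCFLAGS and CFLAGS lines listed for the other compilers on this page.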
Intel Compiler for Itanium
- status: ok
- Version: 8.1.033 (older 8 releases and version 9 are reported to cause problems). Version 10 works but is much slower than 8.1.
- Recommended flags:
- FCFLAGS="-O3 -tpp2 -ip -IPF_fp_relaxed -ftz -align all -pad"
- CFLAGS="-O3 -tpp2 -ip -IPF_fp_relaxed -ftz"
Open64
This is an open source compiler based on the liberated code of the SGI MIPSpro compiler. It is available for x86, x86_64 and Itanium architectures.
Pathscale Fortran Compiler
- Versions tested: 2.2, 2.3 and 2.5
- Architecture: x86, AMD64
- Recommended flags:
FCFLAGS="-Wall -O3 -march=auto -mcpu=auto -OPT:Ofast -fno-math-errno"
- Everything works.
- It's necessary to compile BLAS/LAPACK with the same compiler.
NAG compiler
FCFLAGS="-colour -kind=byte -mismatch_all -abi=64 -ieee=full -O4 -Ounroll=4"
GNU Fortran (gfortran)
- Status: ok.
- Version: gcc version 4.1.1 or newer (4.1.0 does not work). For the parallel version you need at least gfortran 4.3.
- You may also need to compile BLAS, LAPACK and FFTW3 using that specific gfortran version.
- Some recommended flags: -march=athlon64 -msse2 -mfpmath=sse -malign-double -funroll-loops -O3
g95
- Status: works
- Tested architectures: x86/Linux, PowerPC/Darwin
- Version: version 4.0.3 (g95 0.91!) May 24 2007
- G95 doesn't recognize the linemarkers created by the preprocessor, so it's necessary to pass the -P flag to cpp.
FC=g95 FCFLAGS="-O3 -funroll-loops -ffast-math" FCCPP="cpp -ansi -P"
There may be problems with versions 0.92 or 0.93, depending on the underlying version of gcc. See G95 for info on building version 0.94 with gcc 4.2.4.
Portland 6
FCFLAGS="-fast -mcmodel=medium -O4"
The following problem with the PGI compiler version 6.0 and MPICH version 1.2.6 on x86_64 has been reported:
The MPI detection during the configure step does not work properly. This may lead to compilation failures, e.g. on the file par_vec.F90. This problem is considered a bug in either the PGI compiler or the MPICH implementation. Please apply the following change by hand after running configure:
In the file config.h, replace the line
/* #undef MPI_H */
with
#define MPI_H 1
and remove the line
#define MPI_MOD 1
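The two manual config.h changes described above can also be scripted. This is a sketch only, assuming config.h contains the two lines exactly as quoted; the function name patch_config_h is hypothetical.

```shell
# Apply the two manual config.h changes described above.
# Usage: patch_config_h path/to/config.h
patch_config_h() {
  file=$1
  sed -e 's|^/\* #undef MPI_H \*/$|#define MPI_H 1|' \
      -e '/^#define MPI_MOD 1$/d' \
      "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Only touch config.h if configure has already produced it here.
if [ -f config.h ]; then
  patch_config_h config.h
fi
```

Because the patterns are anchored to whole lines, a config.h that does not contain them is left unchanged.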
Portland 7, 8, 9
Flags (tested on Cray XT4):
FCFLAGS="-O2 -Munroll=c:1 -Mnoframe -Mlre -Mscalarsse -Mcache_align -Mflushz"
The configure script may fail in the part checking for Fortran libraries of mpif90 for autoconf version 2.59 or earlier. The solution is to update autoconf to 2.60 or later, or manually set FCLIBS in the configure command line to remove a spurious apostrophe.
Portland 10
For Octopus 3.2.0, the file src/basic/lookup.F90 is incorrectly optimized, yielding many segmentation faults in the testsuite. With PGI 10.5 the optimization flag should be -O2 or less; with PGI 10.8 the optimization flag should be -O1 or less. Note that -fast and -fastsse are between -O2 and -O3. For later versions of Octopus, a PGI pragma compels this file to be -O0 regardless of what is specified in FCFLAGS, so you may safely set FCFLAGS to -fast.
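The version-dependent limits above for Octopus 3.2.0 can be written down as a small lookup. This is an illustrative sketch covering only the two PGI releases named in the text; the function name pgi10_safe_opt is hypothetical.

```shell
# Print the highest optimization level the text above reports as safe
# for Octopus 3.2.0 with a given PGI 10.x release.
# Returns 1 for releases the text does not mention.
pgi10_safe_opt() {
  case "$1" in
    10.5) echo "-O2" ;;
    10.8) echo "-O1" ;;
    *)    return 1 ;;
  esac
}

# Hypothetical usage: pick FCFLAGS for PGI 10.8.
if opt=$(pgi10_safe_opt 10.8); then
  FCFLAGS="$opt"
  echo "FCFLAGS=$FCFLAGS"
fi
```

For releases not listed, the function deliberately gives no answer instead of guessing.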
Portland 11
11.4 does not work and will crash with glibc memory corruption errors. 11.7 is fine.
Portland 12
12.5 and 12.6 cannot compile due to internal compiler errors of this form:
PGF90-S-0000-Internal compiler error. sym_of_ast: unexpected ast 6034 (simul_box.F90: 1103)
12.4 and 12.9 are ok.
Absoft
FCFLAGS="-O3 -YEXT_NAMES=LCS -YEXT_SFX=_"
FCFLAGS="-O3 -mcmodel=medium -m64 -cpu:host -YEXT_NAMES=LCS -YEXT_SFX=_"
Compaq compiler
FCFLAGS="-align dcommons -fast -tune host -arch host -noautomatic"
Xlf
- Status: works
-bmaxdata:0x80000000 -qmaxmem=-1 -qsuffix=f=f90 -Q -O5 -qstrict -qtune=auto -qarch=auto -qhot -qipa
- Because of the exotic mixture of Mac OS and BSD, this system is not very standard. Compiling Octopus can be problematic.
- OS X doesn't support static linking of binaries, so don't try.
SGI MIPS
-O3 -INLINE -n32 -LANG:recursive=on
Sun Studio
You can download this compiler for free; it supports Linux and Solaris on x86, amd64 and sparc. It is a very fast compiler but quite buggy.
- CFLAGS="-fast -xprefetch -xvector=simd -D__SSE2__"
NetCDF
Octopus uses the Fortran 90 interface of netCDF, which means that you will likely have to compile netCDF with the same compiler you will use to compile Octopus. You can get the sources and follow the installation instructions from the NetCDF site.
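One way to check the same-compiler requirement above is to ask an existing netCDF installation which Fortran compiler built it. This is a sketch under two assumptions that do not come from this page: that the nf-config utility shipped with netCDF-Fortran is installed, and that comparing executable names is good enough. The helper same_compiler is hypothetical.

```shell
# Compiler you intend to use for Octopus (gfortran is only a placeholder default).
FC=${FC:-gfortran}

# Compare two compiler commands by the name of the executable only.
same_compiler() {
  [ "$(basename "$1")" = "$(basename "$2")" ]
}

if command -v nf-config >/dev/null 2>&1; then
  netcdf_fc=$(nf-config --fc)
  if same_compiler "$FC" "$netcdf_fc"; then
    echo "netCDF Fortran was built with $netcdf_fc"
  else
    echo "netCDF Fortran was built with $netcdf_fc, not $FC; consider rebuilding it"
  fi
else
  echo "nf-config not found; cannot check the netCDF Fortran compiler"
fi
```

The comparison ignores compiler versions, so a match on the name is a necessary check and not a guarantee of compatibility.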
BLAS and LAPACK
These are standard libraries that provide a series of common vector and matrix operations. Octopus uses these libraries as much as possible. There are several versions available depending on your hardware. Around 40% of Octopus execution time is spent in BLAS level 1 routines, so getting a fast implementation for your hardware might be important. On the other hand, LAPACK performance is not very important.
This is the AMD Mathematical Library optimized to run in Athlon and Opteron processors. You can get a free copy from http://developer.amd.com/acml.jsp .
Probably the fastest implementation of blas, source code is available and it can be compiled in many architectures.
See https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor for MKL's advice on the proper way to link. Here is an example, in which --with-lapack is left blank because MKL already provides LAPACK:
MKL_DIR=/opt/intel/mkl/lib/lintel64
--with-blas="-L$MKL_DIR -Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group -lpthread"
--with-blacs="$MKL_DIR/libmkl_blacs_intelmpi_lp64.a"
--with-scalapack="$MKL_DIR/libmkl_scalapack_lp64.a"