Running Octopus on Graphics Processing Units (GPUs)
Recent versions of Octopus support execution on graphics processing units (GPUs). In this tutorial we explain how the GPU support works and how it can be used to speed up calculations.
Octopus GPU support is based on CUDA for Nvidia GPUs and HIP for AMD GPUs.
Note that the code might fall back to CPU operation for unsupported features.
Considerations before using a GPU
Calculations that will be accelerated by GPUs
A GPU is a massively parallel processor; to work efficiently, it needs large amounts of data to process. This means that using GPUs to simulate small systems, such as molecules with fewer than 20 atoms or low-dimensionality systems, will probably be slower than using the CPU.
Not all calculations you can do with Octopus are effectively accelerated by GPUs. Essentially, ground-state calculations with the Chebyshev eigensolver (and calculations based on ground-state calculations, such as geometry optimization) and time propagation with the etrs and aetrs propagators are the simulations that use the GPU most effectively. For other types of calculations you might see some improvement in performance. Moreover, some Octopus features do not support GPUs at all. Currently these are:
- HGH or relativistic pseudopotentials.
- Non-collinear spin.
- Curvilinear coordinates.
This list might not be up to date.
- Periodic systems are partially supported, but there are limitations which might force the code to run on the CPU.
In these cases Octopus will be silently executed on the CPU.
The block size
To obtain good performance on GPUs (and CPUs), Octopus uses blocks of states as the basic data object. The size of these blocks is controlled by the StatesBlockSize variable. For GPUs the value must be a power of 2. The default is the warp size, which is usually 32; on some AMD GPUs it is 64.
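To illustrate the power-of-2 constraint, here is a minimal shell sketch that checks a candidate block size and prints the corresponding input-file line. The helper name is_power_of_two is hypothetical (it is not part of Octopus), and the value 32 is only an example taken from the usual warp size.

```shell
#!/bin/sh
# Hypothetical helper (not part of Octopus): succeeds if $1 is a positive power of 2.
is_power_of_two() {
    n=$1
    # A power of 2 has exactly one bit set, so n & (n - 1) is zero.
    [ "$n" -gt 0 ] && [ $(( n & (n - 1) )) -eq 0 ]
}

# Example value only; 32 is the usual warp size.
block_size=32

if is_power_of_two "$block_size"; then
    # Print the line that would go into the Octopus input file.
    echo "StatesBlockSize = $block_size"
else
    echo "StatesBlockSize must be a power of 2 for GPU runs, got $block_size" >&2
fi
```

The printed line can be copied into the input file. The check only guards against typos such as 48; it does not say which block size performs best for a given system.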
Building with GPU support
Please check the cmake documentation for more details. In short, specify -DOCTOPUS_CUDA=ON for CUDA support or -DOCTOPUS_HIP=ON for HIP support on the cmake command line when configuring. In addition, make sure that cmake finds CUDA or HIP, respectively.
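As a rough sketch of what the configure step might look like, the following shell function prints a cmake command for the chosen backend instead of running it. The function name configure_command and the build directory name build are assumptions made for this example; -S and -B are the standard cmake options for the source and build directories, and the two -D flags are the ones named above.

```shell
#!/bin/sh
# Hypothetical helper: print a cmake configure command for the given GPU backend.
# Argument: "cuda" (Nvidia GPUs) or "hip" (AMD GPUs).
configure_command() {
    case "$1" in
        cuda) gpu_flag="-DOCTOPUS_CUDA=ON" ;;
        hip)  gpu_flag="-DOCTOPUS_HIP=ON" ;;
        *)    echo "unknown backend: $1" >&2; return 1 ;;
    esac
    # Assumes the command is run from the top of the Octopus source tree
    # and that "build" is an acceptable build directory name.
    echo "cmake -S . -B build $gpu_flag"
}

# Show the command for a CUDA build so it can be reviewed before running it.
configure_command cuda
```

If cmake does not locate the toolkit on its own, one common option is to point the standard CMAKE_PREFIX_PATH variable at the installation prefix; whether that is sufficient depends on how CUDA or HIP is installed on the machine.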