Batches

In many situations, we need to perform the same operations over many mesh functions, such as the electronic wave functions. It is therefore advantageous to group those functions into one object. This can ensure that different mesh functions are contiguous in memory.

Stencil operations constitute a large part of the low-level operations on mesh functions. For these, it is often more efficient to apply the same stencil to many mesh functions at once (i.e. using the state index as the fastest index) than to loop first over the mesh index, which would, in general, require a different stencil for each mesh point. This is particularly true for calculations on GPUs.
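The loop ordering described above can be illustrated with a toy example. The following is a minimal sketch, not Octopus code: it assumes a one-dimensional mesh, a fixed three-point second-difference stencil, and a plain array ff_pack(nst, np) without padding. All names and sizes are chosen for illustration only.

```fortran
program packed_stencil_sketch
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none

  integer, parameter :: np = 6        ! number of mesh points (toy value)
  integer, parameter :: nst = 4       ! number of functions in the batch (toy value)
  real(real64) :: ff_pack(nst, np)    ! state index is the fastest index
  real(real64) :: lapl_pack(nst, np)  ! result, stored in the same layout
  integer :: ip, ist

  ! Function number ist is ist * x**2, sampled at x = 1, 2, ..., np.
  do ip = 1, np
    do ist = 1, nst
      ff_pack(ist, ip) = real(ist, real64) * real(ip, real64)**2
    end do
  end do

  lapl_pack = 0.0_real64

  ! Outer loop over mesh points: the stencil (neighbours and weights) is
  ! determined once per point. Inner loop over states: unit stride in
  ! memory, and the same weights are reused for every function.
  do ip = 2, np - 1
    do ist = 1, nst
      lapl_pack(ist, ip) = ff_pack(ist, ip - 1) &
        - 2.0_real64 * ff_pack(ist, ip) + ff_pack(ist, ip + 1)
    end do
  end do

  ! The second difference of ist * x**2 equals 2 * ist at interior points.
  do ist = 1, nst
    if (any(abs(lapl_pack(ist, 2:np-1) - 2.0_real64 * ist) > 1.0e-12_real64)) then
      error stop "unexpected stencil result"
    end if
  end do

  print '(a, 4f6.1)', 'second difference at ip = 3: ', lapl_pack(:, 3)
end program packed_stencil_sketch
```

On a real mesh the neighbour indices and weights would come from the stencil description of each point, which is why it pays to look them up once and then sweep over all states.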

Therefore, we store mesh functions either in linear or in so-called packed form. The former refers to the 'natural' ordering, in which the mesh index is the fastest-moving index, while the latter is its transpose, in which the state index is the fastest-moving index.
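As a minimal sketch of the two orderings, the program below stores the same toy data once as ff_linear(np, nst_linear) and once as ff_pack(nst_linear, np). It assumes that packing amounts to a plain transpose and ignores the padding of the first packed dimension; the array names mirror the dff_linear and dff_pack components only for readability, and the sizes are illustrative.

```fortran
program batch_layout_sketch
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none

  integer, parameter :: np = 5                ! mesh points (toy value)
  integer, parameter :: nst_linear = 3        ! functions in the batch (toy value)
  real(real64) :: ff_linear(np, nst_linear)   ! linear form: mesh index fastest
  real(real64) :: ff_pack(nst_linear, np)     ! packed form: state index fastest
  integer :: ip, ist

  ! Encode (ist, ip) in the value so that the layouts can be compared.
  do ist = 1, nst_linear
    do ip = 1, np
      ff_linear(ip, ist) = 10.0_real64 * ist + ip
    end do
  end do

  ! In this sketch, packing is a plain transpose of the linear array.
  ff_pack = transpose(ff_linear)

  ! The same element is addressed as (ip, ist) or as (ist, ip).
  do ip = 1, np
    do ist = 1, nst_linear
      if (ff_pack(ist, ip) /= ff_linear(ip, ist)) error stop "layout mismatch"
    end do
  end do

  ! Fortran arrays are column-major, so printing a whole array shows the
  ! order in memory and hence which index is contiguous in each form.
  print '(a, 15f5.0)', 'linear: ', ff_linear
  print '(a, 15f5.0)', 'packed: ', ff_pack
end program batch_layout_sketch
```

In the linear output all points of one function appear together, whereas in the packed output all functions at one point appear together.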

The class batch_t is the parent class for batches, such as those holding electronic wave functions.

Definition of "batch_t"

  type batch_t
    private
    integer,                     public :: nst   !< number of functions in the batch
    integer,                     public :: dim   !< Spinor dimension of the state (one, or two for spinors)
    integer                             :: np    !< number of points in each function (this can be np or np_part)
    integer                             :: ndims !< The second dimension of ist_idim_index(:,:). Currently always set to 2.
    integer,        allocatable         :: ist_idim_index(:, :)
    !<                                                  @brief index mapping from global (ist,idim) to local ist.
    !!
    !!                                                  This maps ist and idim into one linear array.
    !!                                                  This index is constructed in batch_oct_m::batch_build_indices
    integer,        allocatable, public :: ist(:)    !< @brief map from a global to local index
    !!
    !!                                                   The global index does not need to start at 1, while
    !!                                                   the local index is always in the range 1:nst.
    !!
    !!                                                   This index is constructed in batch_oct_m::batch_build_indices

    logical                             :: is_allocated  !< indicate allocation status
    logical                             :: own_memory    !< does the batch own the memory or is it foreign memory?
    !  We also need a linear array with the states in order to calculate derivatives, etc.
    integer,                     public :: nst_linear    !< nst_linear = nst * st%d%dim

    integer                             :: status_of     !< @brief packing status of the batch
    !!
    !!                                                   possible values are:
    !!                                                   BATCH_NOT_PACKED, BATCH_PACKED, BATCH_DEVICE_PACKED
    integer                             :: status_host   !< @brief packing status in CPU memory
    !!
    !!                                                      If Octopus runs on GPU, this indicates the status on the CPU.
    !!                                                      It can only be BATCH_NOT_PACKED and BATCH_PACKED.
    !!                                                      This makes transfers more efficient: usually we allocate a
    !!                                                      batch as packed on the CPU, then call do_pack to copy it to the GPU.
    !!                                                      In this case, it is really a copy.
    !!                                                      If the batch is unpacked on the CPU, we need to transpose in
    !!                                                      addition which makes it much slower.
    type(type_t)                        :: type_of             !< either TYPE_FLOAT or TYPE_CMPLX
    integer                             :: device_buffer_count !< keep track of pack operations performed on the device
    integer                             :: host_buffer_count   !< keep track of pack operations performed on the host
    logical                             :: special_memory      !< are we using hardware-aware memory?
    logical                             :: needs_finish_unpack !< if .true., async unpacking has started and needs to be finished


    ! unpacked variables; linear variables are pointers with different shapes
    real(real64),    pointer, contiguous,  public :: dff(:, :, :)     !< pointer to real mesh functions:
    !                                                                 !! indices are (1:np, 1:dim, 1:nst)
    complex(real64), pointer, contiguous,  public :: zff(:, :, :)     !< pointer to complex mesh functions:
    !                                                                 !! indices are (1:np, 1:dim, 1:nst)
    real(real64),    pointer, contiguous,  public :: dff_linear(:, :) !< pointer to real mesh functions:
    !                                                                 !! indices are (1:np, 1:nst_linear)
    complex(real64), pointer, contiguous,  public :: zff_linear(:, :) !< pointer to complex mesh functions:
    !                                                                 !! indices are (1:np, 1:nst_linear)

    ! packed variables; only rank-2 arrays due to padding to powers of 2
    real(real64), pointer, contiguous,  public :: dff_pack(:, :)      !< pointer to real mesh functions:
    !                                                                 !! indices are (1:nst_linear, 1:np)
    complex(real64), pointer, contiguous,  public :: zff_pack(:, :)   !< pointer to complex mesh functions:
    !                                                                 !! indices are (1:nst_linear, 1:np)

    integer(int64),                 public :: pack_size(1:2)      !< pack_size = [pad_pow2(nst_linear), np]
    !!                                                            (see math_oct_m::pad_pow2)
    integer(int64),                 public :: pack_size_real(1:2) !< pack_size_real = pack_size;
    !!                                                            if batch type is complex, then
    !!                                                            pack_size_real(1) = 2*pack_size(1)

    type(accel_mem_t),           public :: ff_device           !< pointer to device memory

  contains
    procedure :: check_compatibility_with => batch_check_compatibility_with !< @copydoc batch_oct_m::batch_check_compatibility_with
    procedure :: clone_to => batch_clone_to                                 !< @copydoc batch_oct_m::batch_clone_to
    procedure :: clone_to_array => batch_clone_to_array                     !< @copydoc batch_oct_m::batch_clone_to_array
    procedure :: copy_to => batch_copy_to                                   !< @copydoc batch_oct_m::batch_copy_to
    procedure :: copy_data_to => batch_copy_data_to                         !< @copydoc batch_oct_m::batch_copy_data_to
    procedure :: do_pack_generic => batch_do_pack_generic
    procedure :: do_pack_target => batch_do_pack_target
    generic :: do_pack => do_pack_generic, do_pack_target                   !< @copydoc batch_oct_m::batch_do_pack_generic
    procedure :: do_unpack => batch_do_unpack                               !< @copydoc batch_oct_m::batch_do_unpack
    procedure :: finish_unpack => batch_finish_unpack                       !< @copydoc batch_oct_m::batch_finish_unpack
    procedure :: end => batch_end                                           !< @copydoc batch_oct_m::batch_end
    procedure :: inv_index => batch_inv_index                               !< @copydoc batch_oct_m::batch_inv_index
    procedure :: is_packed => batch_is_packed                               !< @copydoc batch_oct_m::batch_is_packed
    procedure :: ist_idim_to_linear => batch_ist_idim_to_linear             !< @copydoc batch_oct_m::batch_ist_idim_to_linear
    procedure :: linear_to_idim => batch_linear_to_idim                     !< @copydoc batch_oct_m::batch_linear_to_idim
    procedure :: linear_to_ist => batch_linear_to_ist                       !< @copydoc batch_oct_m::batch_linear_to_ist
    procedure :: pack_total_size => batch_pack_total_size                   !< @copydoc batch_oct_m::batch_pack_total_size
    procedure :: remote_access_start => batch_remote_access_start           !< @copydoc batch_oct_m::batch_remote_access_start
    procedure :: remote_access_stop => batch_remote_access_stop             !< @copydoc batch_oct_m::batch_remote_access_stop
    procedure :: status => batch_status                                     !< @copydoc batch_oct_m::batch_status
    procedure :: type => batch_type                                         !< @copydoc batch_oct_m::batch_type
    procedure :: type_as_int => batch_type_as_integer                       !< @copydoc batch_oct_m::batch_type_as_integer
    procedure, private :: dallocate_unpacked_host => dbatch_allocate_unpacked_host
    !< @copydoc batch_oct_m::dbatch_allocate_unpacked_host
    procedure, private :: zallocate_unpacked_host => zbatch_allocate_unpacked_host
    !< @copydoc batch_oct_m::zbatch_allocate_unpacked_host
    procedure, private :: allocate_unpacked_host => batch_allocate_unpacked_host
    !< @copydoc batch_oct_m::batch_allocate_unpacked_host
    procedure, private :: dallocate_packed_host => dbatch_allocate_packed_host
    !< @copydoc batch_oct_m::dbatch_allocate_packed_host
    procedure, private :: zallocate_packed_host => zbatch_allocate_packed_host
    !< @copydoc batch_oct_m::zbatch_allocate_packed_host
    procedure, private :: allocate_packed_host => batch_allocate_packed_host
    !< @copydoc batch_oct_m::batch_allocate_packed_host
    procedure, private :: allocate_packed_device => batch_allocate_packed_device
    !< @copydoc batch_oct_m::batch_allocate_packed_device
    procedure, private :: deallocate_unpacked_host => batch_deallocate_unpacked_host
    !< @copydoc batch_oct_m::batch_deallocate_unpacked_host
    procedure, private :: deallocate_packed_host => batch_deallocate_packed_host
    !< @copydoc batch_oct_m::batch_deallocate_packed_host
    procedure, private :: deallocate_packed_device => batch_deallocate_packed_device
    !< @copydoc batch_oct_m::batch_deallocate_packed_device
  end type batch_t

This class includes information about the dimensions of the functions (number of states, spinor dimension and number of mesh points), as well as internal book-keeping variables that keep track of the status of the batch. Furthermore, the batch_t data type contains pointers to the actual data arrays, and defines the methods for interacting with a batch.
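The relation between these dimensions can be sketched as follows. This is an illustration under stated assumptions, not the Octopus implementation: the linear index is assumed to run over idim fastest within each state, the global-to-local mapping of ist is ignored, and next_pow2 is a hypothetical stand-in for math_oct_m::pad_pow2 that rounds up to the next power of two. The pack_size formulas follow the comments in the type definition.

```fortran
program batch_sizes_sketch
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none

  integer, parameter :: nst = 3, dim = 2, np = 1000  ! toy values
  logical, parameter :: is_complex = .true.          ! pretend the batch is complex
  integer :: nst_linear, ist, idim, lin
  integer(int64) :: pack_size(2), pack_size_real(2)

  nst_linear = nst * dim

  ! Map every (ist, idim) pair to a linear index and back again.
  do ist = 1, nst
    do idim = 1, dim
      lin = ist_idim_to_linear(ist, idim)
      if ((lin - 1) / dim + 1 /= ist) error stop "ist round trip failed"
      if (mod(lin - 1, dim) + 1 /= idim) error stop "idim round trip failed"
    end do
  end do

  ! pack_size = [pad_pow2(nst_linear), np]
  pack_size = [int(next_pow2(nst_linear), int64), int(np, int64)]

  ! For complex batches the first dimension doubles when counted in reals.
  pack_size_real = pack_size
  if (is_complex) pack_size_real(1) = 2 * pack_size(1)

  if (pack_size(1) /= 8_int64) error stop "unexpected padded size"
  if (pack_size_real(1) /= 16_int64) error stop "unexpected real size"

  print '(a, i0)', 'nst_linear     = ', nst_linear
  print '(a, 2(1x, i0))', 'pack_size      =', pack_size
  print '(a, 2(1x, i0))', 'pack_size_real =', pack_size_real

contains

  ! Assumed ordering: idim is the fastest index within each state.
  pure integer function ist_idim_to_linear(ist, idim) result(lin)
    integer, intent(in) :: ist, idim
    lin = (ist - 1) * dim + idim
  end function ist_idim_to_linear

  ! Hypothetical stand-in for pad_pow2: smallest power of two >= n.
  pure integer function next_pow2(n) result(p)
    integer, intent(in) :: n
    p = 1
    do while (p < n)
      p = 2 * p
    end do
  end function next_pow2

end program batch_sizes_sketch
```

With three states and two spinor components, the six linear functions are padded to eight in the packed arrays, which is why the packed variables are kept as rank-2 arrays.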

Empty batches can be initialized with:

Initializing empty batches

Initializing batches with memory

Definition of "wfs_elec_t"