This module implements batches of mesh functions. More...

Detailed Description

This module implements batches of mesh functions.

In many situations, we need to perform the same operations over many mesh functions, such as the electronic wave functions. It is therefore advantageous to group those functions into one object. This can ensure that different mesh functions are contiguous in memory.

Due to the nature of stencil operations, which constitute a large part of the low level operations on mesh functions, it is often more efficient to perform the same stencil operation over different mesh functions (i.e. using the state index as fast index), than looping first over the mesh index, which would, in general, require a different stencil for each mesh point. This is, in particular, the case for calculations utilizing GPUs.

Therefore, we store mesh functions in linear or in so-called packed form. The former refers to the natural ordering where the mesh index is the fastest moving, while the latter is transposed. Furthermore the arrays are padded to ensure aligned memory access.

The packed form is even more advantageous on the GPU. Therefore, only packed data is stored in the device memory. On the devices, the padding is aligned with the size of a work group and can depend on the actual device.

Data Types
interface	batch_init
	initialize a batch with existing memory More...

type	batch_t
	Class defining batches of mesh functions. More...

Functions/Subroutines
subroutine	batch_end (this, copy)
	finalize a batch and release allocated memory, if necessary More...

subroutine	batch_deallocate_unpacked_host (this)
	release unpacked host memory More...

subroutine	batch_deallocate_packed_host (this)
	release packed host memory More...

subroutine	batch_deallocate_packed_device (this)
	release packed device memory More...

subroutine	batch_allocate_unpacked_host (this)
	allocate host (CPU) memory for unpacked data More...

subroutine	batch_allocate_packed_host (this)
	allocate host (CPU) memory for packed data More...

subroutine	batch_allocate_packed_device (this)
	allocate device (GPU) memory for packed data More...

subroutine	batch_init_empty (this, dim, nst, np)
	initialize an empty batch More...

subroutine	batch_clone_to (this, dest, pack, copy_data, new_np)
	clone a batch to a new batch More...

subroutine	batch_clone_to_array (this, dest, n_batches, pack, copy_data)

subroutine	batch_copy_to (this, dest, pack, copy_data, new_np, special)
	make a copy of a batch More...

type(type_t) pure function	batch_type (this)
	return the type of a batch More...

integer pure function	batch_type_as_integer (this)
	For debuging purpose only. More...

integer pure function	batch_status (this)
	return the status of a batch More...

logical pure function	batch_is_packed (this)

integer(int64) function	batch_pack_total_size (this)

subroutine	batch_do_pack (this, copy, async)
	pack the data in a batch More...

subroutine	batch_do_unpack (this, copy, force, async)
	unpack a batch More...

subroutine	batch_finish_unpack (this)
	finish the unpacking if do_unpack() was called with async=.true. More...

subroutine	batch_write_unpacked_to_device (this)

subroutine	batch_read_device_to_unpacked (this)

subroutine	batch_write_packed_to_device (this, async)

subroutine, public	batch_read_device_to_packed (this, async)

integer function	batch_inv_index (this, cind)
	inverse index lookup More...

integer pure function	batch_ist_idim_to_linear (this, cind)
	direct index lookup More...

integer pure function	batch_linear_to_ist (this, linear_index)
	get state index ist from linear (combined dim and nst) index More...

integer pure function	batch_linear_to_idim (this, linear_index)
	extract idim from linear index More...

subroutine	batch_remote_access_start (this, mpi_grp, rma_win)
	start remote access to a batch on another node More...

subroutine	batch_remote_access_stop (this, rma_win)
	stop the remote access to the batch More...

subroutine	batch_copy_data_to (this, np, dest, async)
	copy data to another batch. More...

subroutine	batch_check_compatibility_with (this, target, only_check_dim)
	check whether two batches have compatible dimensions (and type) More...

subroutine	batch_build_indices (this, st_start, st_end)
	build the index ist(:) and ist_idim_index(:,:) and set pack_size More...

subroutine	dbatch_init_with_memory_3 (this, dim, st_start, st_end, psi)
	initialize a batch with an rank-3 array of TYPE_FLOAT valued mesh functions psi. More...

subroutine	dbatch_init_with_memory_2 (this, dim, st_start, st_end, psi)
	initialize a batch with an rank-2 array of TYPE_FLOAT valued mesh functions psi. More...

subroutine	dbatch_init_with_memory_1 (this, psi)
	initialize a batch with an rank-1 array of TYPE_FLOAT valued mesh functions psi. More...

subroutine	dbatch_allocate_unpacked_host (this)
	allocate host (CPU) memory for unpacked data of type TYPE_FLOAT More...

subroutine	dbatch_allocate_packed_host (this)
	allocate host (CPU) memory for packed data of type TYPE_FLOAT More...

subroutine, public	dbatch_init (this, dim, st_start, st_end, np, special, packed)
	initialize a TYPE_FLOAT valued batch to given size without providing external memory More...

subroutine	dbatch_pack_copy (this)
	copy data from the unpacked to the packed arrays More...

subroutine	dbatch_unpack_copy (this)
	copy data from the packed to the unpacked arrays More...

subroutine	zbatch_init_with_memory_3 (this, dim, st_start, st_end, psi)
	initialize a batch with an rank-3 array of TYPE_CMPLX valued mesh functions psi. More...

subroutine	zbatch_init_with_memory_2 (this, dim, st_start, st_end, psi)
	initialize a batch with an rank-2 array of TYPE_CMPLX valued mesh functions psi. More...

subroutine	zbatch_init_with_memory_1 (this, psi)
	initialize a batch with an rank-1 array of TYPE_CMPLX valued mesh functions psi. More...

subroutine	zbatch_allocate_unpacked_host (this)
	allocate host (CPU) memory for unpacked data of type TYPE_CMPLX More...

subroutine	zbatch_allocate_packed_host (this)
	allocate host (CPU) memory for packed data of type TYPE_CMPLX More...

subroutine, public	zbatch_init (this, dim, st_start, st_end, np, special, packed)
	initialize a TYPE_CMPLX valued batch to given size without providing external memory More...

subroutine	zbatch_pack_copy (this)
	copy data from the unpacked to the packed arrays More...

subroutine	zbatch_unpack_copy (this)
	copy data from the packed to the unpacked arrays More...

Variables
integer, parameter, public	batch_not_packed = 0
	functions are stored in CPU memory, unpacked order More...

integer, parameter, public	batch_packed = 1
	functions are stored in CPU memory, in transposed (packed) order More...

integer, parameter, public	batch_device_packed = 2
	functions are stored in device memory in packed order More...

integer, parameter	cl_pack_max_buffer_size = 4
	this value controls the size (in number of wave-functions) of the buffer used to copy states to the opencl device. More...

Function/Subroutine Documentation

◆ batch_end()

subroutine batch_oct_m::batch_end	(	class(batch_t), intent(inout)	this,
		logical, intent(in), optional	copy
	)

private

finalize a batch and release allocated memory, if necessary

If the batch was initialized with 'external' memory, this routine ensures that this memory is up-to-date, when the batch is finalized. This means, that the data is copied from the device (If requested) and unpacked.

Parameters

[in] copy do we need to copy data from the device? Default = .true. (from batch_oct_m::batch_do_uppack)

Definition at line 294 of file batch.F90.

◆ batch_deallocate_unpacked_host()

subroutine batch_oct_m::batch_deallocate_unpacked_host ( class(batch_t), intent(inout) this )

private

release unpacked host memory

This routine takes care of special (i.e. pinned memory)

Definition at line 332 of file batch.F90.

◆ batch_deallocate_packed_host()

subroutine batch_oct_m::batch_deallocate_packed_host ( class(batch_t), intent(inout) this )

private

release packed host memory

This routine takes care of special (i.e. pinned memory)

Definition at line 363 of file batch.F90.

◆ batch_deallocate_packed_device()

subroutine batch_oct_m::batch_deallocate_packed_device ( class(batch_t), intent(inout) this )

private

release packed device memory

Definition at line 388 of file batch.F90.

◆ batch_allocate_unpacked_host()

subroutine batch_oct_m::batch_allocate_unpacked_host ( class(batch_t), intent(inout) this )

private

allocate host (CPU) memory for unpacked data

This routine is a wrapper to the tyupe specific versions

Definition at line 403 of file batch.F90.

◆ batch_allocate_packed_host()

subroutine batch_oct_m::batch_allocate_packed_host ( class(batch_t), intent(inout) this )

private

allocate host (CPU) memory for packed data

This routine is a wrapper to the tyupe specific versions

Definition at line 422 of file batch.F90.

◆ batch_allocate_packed_device()

subroutine batch_oct_m::batch_allocate_packed_device ( class(batch_t), intent(inout) this )

private

allocate device (GPU) memory for packed data

This routine is a wrapper to the tyupe specific versions

Definition at line 441 of file batch.F90.

◆ batch_init_empty()

subroutine batch_oct_m::batch_init_empty	(	type(batch_t), intent(out)	this,
		integer, intent(in)	dim,
		integer, intent(in)	nst,
		integer, intent(in)	np
	)

private

initialize an empty batch

This auxilliary function is only called from batch_oct_m::batch_init functions. It initializes the book-keeping parameters, allocates memory for the indices and nullifies the pointers to the data arrays.

Parameters

[out]	this	the batch to be initialized
[in]	dim	The number of spin dimensions
[in]	nst	The number of states in the batch
[in]	np	The number of points in each mesh function

Definition at line 459 of file batch.F90.

◆ batch_clone_to()

subroutine batch_oct_m::batch_clone_to	(	class(batch_t), intent(in)	this,
		class(batch_t), intent(out), allocatable	dest,
		logical, intent(in), optional	pack,
		logical, intent(in), optional	copy_data,
		integer, intent(in), optional	new_np
	)

private

clone a batch to a new batch

This routine clones the metadata of a batch and, if requested copies the data.

Parameters

[in]	this	source batch
[out]	dest	destination batch
[in]	pack	If .false. the new batch will not be packed. Default: batch_is_packed(this)
[in]	copy_data	If .true. the batch data will be copied to the destination batch. Default: .false.
[in]	new_np	If present, this replaces thisnp in the initialization

Definition at line 499 of file batch.F90.

◆ batch_clone_to_array()

subroutine batch_oct_m::batch_clone_to_array	(	class(batch_t), intent(in)	this,
		class(batch_t), dimension(:), intent(out), allocatable	dest,
		integer, intent(in)	n_batches,
		logical, intent(in), optional	pack,
		logical, intent(in), optional	copy_data
	)

private

Parameters

[in]	pack	If .false. the new batch will not be packed. Default: batch_is_packed(this)
[in]	copy_data	If .true. the batch data will be copied to the destination batch. Default: .false.

Definition at line 524 of file batch.F90.

◆ batch_copy_to()

subroutine batch_oct_m::batch_copy_to	(	class(batch_t), intent(in)	this,
		class(batch_t), intent(out)	dest,
		logical, intent(in), optional	pack,
		logical, intent(in), optional	copy_data,
		integer, intent(in), optional	new_np,
		logical, intent(in), optional	special
	)

private

make a copy of a batch

This routine can perform a deep or a shallow copy of a batch

Parameters

[in]	this	The source batch
[out]	dest	The destination batch
[in]	pack	If .false. the new batch will not be packed. Default: batch_is_packed(this)
[in]	copy_data	If .true. the batch data will be copied to the destination batch. Default: .false.
[in]	new_np	If present, this replaces thisnp in the initialization
[in]	special	If present, this replace special in the locic below, i.e., we try to allocate on the GPU

Definition at line 556 of file batch.F90.

◆ batch_type()

type(type_t) pure function batch_oct_m::batch_type ( class(batch_t), intent(in) this )

private

return the type of a batch

This function is THREADSAFE

Definition at line 611 of file batch.F90.

◆ batch_type_as_integer()

integer pure function batch_oct_m::batch_type_as_integer ( class(batch_t), intent(in) this )

private

For debuging purpose only.

Definition at line 620 of file batch.F90.

◆ batch_status()

integer pure function batch_oct_m::batch_status ( class(batch_t), intent(in) this )

private

return the status of a batch

This function is THREADSAFE

Definition at line 637 of file batch.F90.

◆ batch_is_packed()

logical pure function batch_oct_m::batch_is_packed ( class(batch_t), intent(in) this )

private

Definition at line 645 of file batch.F90.

◆ batch_pack_total_size()

integer(int64) function batch_oct_m::batch_pack_total_size ( class(batch_t), intent(inout) this )

private

Definition at line 653 of file batch.F90.

◆ batch_do_pack()

subroutine batch_oct_m::batch_do_pack	(	class(batch_t), intent(inout)	this,
		logical, intent(in), optional	copy,
		logical, intent(in), optional	async
	)

private

pack the data in a batch

If accelerators are enabled, the packed data is moved to the device memory. If the batch is already packed, a counter is increased to keep track when to unpack.

Parameters

[in,out]	this	The current batch
[in]	copy	Do we copy the data to the packed memory? (default .true.)
[in]	async	We can do an asynchronous operation. (default .false.) The program flow can continue while data is being transferred to the device.

Definition at line 669 of file batch.F90.

◆ batch_do_unpack()

subroutine batch_oct_m::batch_do_unpack	(	class(batch_t), intent(inout)	this,
		logical, intent(in), optional	copy,
		logical, intent(in), optional	force,
		logical, intent(in), optional	async
	)

private

unpack a batch

We unpack the batch if the 'packing counter' is one, or the force flag is given.

Parameters

[in]	copy	indicate whether to copy the data (default .true.)
[in]	force	if force = .true., unpack independently of the counter (default .false.)
[in]	async	indicate whether the operation can by asynchronous (default .false.). In this case the operation has to be completed by calling batch_finish_unpack()

Definition at line 748 of file batch.F90.

◆ batch_finish_unpack()

subroutine batch_oct_m::batch_finish_unpack ( class(batch_t), intent(inout) this )

private

finish the unpacking if do_unpack() was called with async=.true.

Definition at line 830 of file batch.F90.

◆ batch_write_unpacked_to_device()

subroutine batch_oct_m::batch_write_unpacked_to_device ( class(batch_t), intent(inout) this )

private

Definition at line 844 of file batch.F90.

◆ batch_read_device_to_unpacked()

subroutine batch_oct_m::batch_read_device_to_unpacked ( class(batch_t), intent(inout) this )

private

Definition at line 923 of file batch.F90.

◆ batch_write_packed_to_device()

subroutine batch_oct_m::batch_write_packed_to_device	(	class(batch_t), intent(inout)	this,
		logical, intent(in), optional	async
	)

private

Definition at line 995 of file batch.F90.

◆ batch_read_device_to_packed()

subroutine, public batch_oct_m::batch_read_device_to_packed	(	class(batch_t), intent(inout)	this,
		logical, intent(in), optional	async
	)

Definition at line 1014 of file batch.F90.

◆ batch_inv_index()

integer function batch_oct_m::batch_inv_index	(	class(batch_t), intent(in)	this,
		integer, dimension(:), intent(in)	cind
	)

private

inverse index lookup

This function returns the linear index for (ist, idim), where ist ranges from 1 to stnst.

Parameters

[in]	this	the batch
[in]	cind	combined index (ist, idim)

Definition at line 1037 of file batch.F90.

◆ batch_ist_idim_to_linear()

integer pure function batch_oct_m::batch_ist_idim_to_linear	(	class(batch_t), intent(in)	this,
		integer, dimension(:), intent(in)	cind
	)

private

direct index lookup

This function returns the linear index for (ist, idim), where ist ranges from 1 to thisnst.

Parameters

[in]	this	the batch
[in]	cind	combined index (ist, idim)

Definition at line 1054 of file batch.F90.

◆ batch_linear_to_ist()

integer pure function batch_oct_m::batch_linear_to_ist	(	class(batch_t), intent(in)	this,
		integer, intent(in)	linear_index
	)

private

get state index ist from linear (combined dim and nst) index

The linear index interleaves the state index with the dimension, resulting in a one-dimensional ordering of states.

Definition at line 1072 of file batch.F90.

◆ batch_linear_to_idim()

integer pure function batch_oct_m::batch_linear_to_idim	(	class(batch_t), intent(in)	this,
		integer, intent(in)	linear_index
	)

private

extract idim from linear index

Definition at line 1083 of file batch.F90.

◆ batch_remote_access_start()

subroutine batch_oct_m::batch_remote_access_start	(	class(batch_t), intent(inout)	this,
		type(mpi_grp_t), intent(in)	mpi_grp,
		type(mpi_win), intent(out)	rma_win
	)

private

start remote access to a batch on another node

This routine creates a remote access window for a given batch and returns a handle to that window. A handle of -1 indicates that no window was created.

Note: this is currently not allowed when using GPUs; side effect: the packing of the batch is increased by one

Parameters

[in,out]	this	the current batch
[in]	mpi_grp	the MPI group
[out]	rma_win	handle of rma window

Definition at line 1101 of file batch.F90.

◆ batch_remote_access_stop()

subroutine batch_oct_m::batch_remote_access_stop	(	class(batch_t), intent(inout)	this,
		type(mpi_win), intent(inout)	rma_win
	)

private

stop the remote access to the batch

If the rma window handle is valid, the window is freed.

Note: side effect the batch pack level is decreased by one.

Definition at line 1142 of file batch.F90.

◆ batch_copy_data_to()

subroutine batch_oct_m::batch_copy_data_to	(	class(batch_t), intent(in)	this,
		integer, intent(in)	np,
		class(batch_t), intent(inout)	dest,
		logical, intent(in), optional	async
	)

private

copy data to another batch.

Parameters

[in]	this	source batch
[in]	np	number of points to copy for each mesh function
[in,out]	dest	destination batch
[in]	async	asynchronous GPU operations or not

Definition at line 1161 of file batch.F90.

◆ batch_check_compatibility_with()

subroutine batch_oct_m::batch_check_compatibility_with	(	class(batch_t), intent(in)	this,
		class(batch_t), intent(in)	target,
		logical, intent(in), optional	only_check_dim
	)

private

check whether two batches have compatible dimensions (and type)

Definition at line 1228 of file batch.F90.

◆ batch_build_indices()

subroutine batch_oct_m::batch_build_indices	(	class(batch_t), intent(inout)	this,
		integer, intent(in)	st_start,
		integer, intent(in)	st_end
	)

private

build the index ist(:) and ist_idim_index(:,:) and set pack_size

Definition at line 1249 of file batch.F90.

◆ dbatch_init_with_memory_3()

subroutine batch_oct_m::dbatch_init_with_memory_3	(	class(batch_t), intent(out)	this,
		integer, intent(in)	dim,
		integer, intent(in)	st_start,
		integer, intent(in)	st_end,
		real(real64), dimension(:, :, st_start:), intent(in), target, contiguous	psi
	)

private

initialize a batch with an rank-3 array of TYPE_FLOAT valued mesh functions psi.

The TYPE_FLOAT valued mesh functions psi are expected to be of dimensions (1:np_batch, 1:dim, st_start:st_end) where np_batch can be either np or np_part

Definition at line 1331 of file batch.F90.

◆ dbatch_init_with_memory_2()

subroutine batch_oct_m::dbatch_init_with_memory_2	(	class(batch_t), intent(out)	this,
		integer, intent(in)	dim,
		integer, intent(in)	st_start,
		integer, intent(in)	st_end,
		real(real64), dimension(:, :), intent(in), target, contiguous	psi
	)

private

initialize a batch with an rank-2 array of TYPE_FLOAT valued mesh functions psi.

The TYPE_FLOAT valued mesh functions psi are expected to be either of the following dimensions:

(1:np_batch, st_start:st_end)
(1:np_batch, 1:dim)

where np_batch can be either np or np_part.

Note: Either dim==1 or st_start==st_end has to be fulfilled.

Definition at line 1368 of file batch.F90.

◆ dbatch_init_with_memory_1()

subroutine batch_oct_m::dbatch_init_with_memory_1	(	class(batch_t), intent(out)	this,
		real(real64), dimension(:), intent(in), target, contiguous	psi
	)

private

initialize a batch with an rank-1 array of TYPE_FLOAT valued mesh functions psi.

The TYPE_FLOAT valued mesh functions psi are expected to be of dimensions (1:np_batch) where np_batch can be either np or np_part.

Note: The given mesh function is always copied into the first slot of the batch, i.e. idim=1 and s_start=st_end=1.

Definition at line 1397 of file batch.F90.

◆ dbatch_allocate_unpacked_host()

subroutine batch_oct_m::dbatch_allocate_unpacked_host ( class(batch_t), intent(inout) this )

private

allocate host (CPU) memory for unpacked data of type TYPE_FLOAT

This routine takes care of the allocating the memory on the host. If requested (special_memory), pinned memory can be allocated, which optimizes the transfer to GPUs under certain conditions.

Definition at line 1418 of file batch.F90.

◆ dbatch_allocate_packed_host()

subroutine batch_oct_m::dbatch_allocate_packed_host ( class(batch_t), intent(inout) this )

private

allocate host (CPU) memory for packed data of type TYPE_FLOAT

This routine takes care of the allocating the memory on the host. If requested (special_memory), pinned memory can be allocated, which optimizes the transfer to GPUs under certain conditions.

Definition at line 1444 of file batch.F90.

◆ dbatch_init()

subroutine, public batch_oct_m::dbatch_init	(	class(batch_t), intent(inout)	this,
		integer, intent(in)	dim,
		integer, intent(in)	st_start,
		integer, intent(in)	st_end,
		integer, intent(in)	np,
		logical, intent(in), optional	special,
		logical, intent(in), optional	packed
	)

initialize a TYPE_FLOAT valued batch to given size without providing external memory

Parameters

[in,out]	this	the batch to initialize
[in]	dim	Spinor dimension of the state (one, or two for spinors)
[in]	st_start	index of first state of the batch
[in]	st_end	index of last state of the batch
[in]	np	number of points in each function (this can be np or np_part)
[in]	special	If .true., the allocation will be handled in C (to use pinned memory for GPUs). Default = .false.
[in]	packed	If .true. the batch will be initialized in packed form. Default = .false.

Definition at line 1468 of file batch.F90.

◆ dbatch_pack_copy()

subroutine batch_oct_m::dbatch_pack_copy ( class(batch_t), intent(inout) this )

private

copy data from the unpacked to the packed arrays

Note: this routine does not affect the packing status, and does not allocate or free memory.

Definition at line 1507 of file batch.F90.

◆ dbatch_unpack_copy()

subroutine batch_oct_m::dbatch_unpack_copy ( class(batch_t), intent(inout) this )

private

copy data from the packed to the unpacked arrays

Note: this routine does not affect the packing status, and does not allocate or free memory.

Definition at line 1537 of file batch.F90.

◆ zbatch_init_with_memory_3()

subroutine batch_oct_m::zbatch_init_with_memory_3	(	class(batch_t), intent(out)	this,
		integer, intent(in)	dim,
		integer, intent(in)	st_start,
		integer, intent(in)	st_end,
		complex(real64), dimension(:, :, st_start:), intent(in), target, contiguous	psi
	)

private

initialize a batch with an rank-3 array of TYPE_CMPLX valued mesh functions psi.

The TYPE_CMPLX valued mesh functions psi are expected to be of dimensions (1:np_batch, 1:dim, st_start:st_end) where np_batch can be either np or np_part

Definition at line 1636 of file batch.F90.

◆ zbatch_init_with_memory_2()

subroutine batch_oct_m::zbatch_init_with_memory_2	(	class(batch_t), intent(out)	this,
		integer, intent(in)	dim,
		integer, intent(in)	st_start,
		integer, intent(in)	st_end,
		complex(real64), dimension(:, :), intent(in), target, contiguous	psi
	)

private

initialize a batch with an rank-2 array of TYPE_CMPLX valued mesh functions psi.

The TYPE_CMPLX valued mesh functions psi are expected to be either of the following dimensions:

(1:np_batch, st_start:st_end)
(1:np_batch, 1:dim)

where np_batch can be either np or np_part.

Note: Either dim==1 or st_start==st_end has to be fulfilled.

Definition at line 1673 of file batch.F90.

◆ zbatch_init_with_memory_1()

subroutine batch_oct_m::zbatch_init_with_memory_1	(	class(batch_t), intent(out)	this,
		complex(real64), dimension(:), intent(in), target, contiguous	psi
	)

private

initialize a batch with an rank-1 array of TYPE_CMPLX valued mesh functions psi.

The TYPE_CMPLX valued mesh functions psi are expected to be of dimensions (1:np_batch) where np_batch can be either np or np_part.

Note: The given mesh function is always copied into the first slot of the batch, i.e. idim=1 and s_start=st_end=1.

Definition at line 1702 of file batch.F90.

◆ zbatch_allocate_unpacked_host()

subroutine batch_oct_m::zbatch_allocate_unpacked_host ( class(batch_t), intent(inout) this )

private

allocate host (CPU) memory for unpacked data of type TYPE_CMPLX

This routine takes care of the allocating the memory on the host. If requested (special_memory), pinned memory can be allocated, which optimizes the transfer to GPUs under certain conditions.

Definition at line 1723 of file batch.F90.

◆ zbatch_allocate_packed_host()

subroutine batch_oct_m::zbatch_allocate_packed_host ( class(batch_t), intent(inout) this )

private

allocate host (CPU) memory for packed data of type TYPE_CMPLX

This routine takes care of the allocating the memory on the host. If requested (special_memory), pinned memory can be allocated, which optimizes the transfer to GPUs under certain conditions.

Definition at line 1749 of file batch.F90.

◆ zbatch_init()

subroutine, public batch_oct_m::zbatch_init	(	class(batch_t), intent(inout)	this,
		integer, intent(in)	dim,
		integer, intent(in)	st_start,
		integer, intent(in)	st_end,
		integer, intent(in)	np,
		logical, intent(in), optional	special,
		logical, intent(in), optional	packed
	)

initialize a TYPE_CMPLX valued batch to given size without providing external memory

Parameters

[in,out]	this	the batch to initialize
[in]	dim	Spinor dimension of the state (one, or two for spinors)
[in]	st_start	index of first state of the batch
[in]	st_end	index of last state of the batch
[in]	np	number of points in each function (this can be np or np_part)
[in]	special	If .true., the allocation will be handled in C (to use pinned memory for GPUs). Default = .false.
[in]	packed	If .true. the batch will be initialized in packed form. Default = .false.

Definition at line 1773 of file batch.F90.

◆ zbatch_pack_copy()

subroutine batch_oct_m::zbatch_pack_copy ( class(batch_t), intent(inout) this )

private

copy data from the unpacked to the packed arrays

Note: this routine does not affect the packing status, and does not allocate or free memory.

Definition at line 1812 of file batch.F90.

◆ zbatch_unpack_copy()

subroutine batch_oct_m::zbatch_unpack_copy ( class(batch_t), intent(inout) this )

private

copy data from the packed to the unpacked arrays

Note: this routine does not affect the packing status, and does not allocate or free memory.

Definition at line 1842 of file batch.F90.

Variable Documentation

◆ batch_not_packed

integer, parameter, public batch_oct_m::batch_not_packed = 0

functions are stored in CPU memory, unpacked order

Definition at line 276 of file batch.F90.

◆ batch_packed

integer, parameter, public batch_oct_m::batch_packed = 1

functions are stored in CPU memory, in transposed (packed) order

Definition at line 276 of file batch.F90.

◆ batch_device_packed

integer, parameter, public batch_oct_m::batch_device_packed = 2

functions are stored in device memory in packed order

Definition at line 276 of file batch.F90.

◆ cl_pack_max_buffer_size

integer, parameter batch_oct_m::cl_pack_max_buffer_size = 4

private

this value controls the size (in number of wave-functions) of the buffer used to copy states to the opencl device.

Definition at line 281 of file batch.F90.

Detailed Description

Data Types

Functions/Subroutines

Variables

Function/Subroutine Documentation

◆ batch_end()

◆ batch_deallocate_unpacked_host()

◆ batch_deallocate_packed_host()

◆ batch_deallocate_packed_device()

◆ batch_allocate_unpacked_host()

◆ batch_allocate_packed_host()

◆ batch_allocate_packed_device()

◆ batch_init_empty()

◆ batch_clone_to()

◆ batch_clone_to_array()

◆ batch_copy_to()

◆ batch_type()

◆ batch_type_as_integer()

◆ batch_status()

◆ batch_is_packed()

◆ batch_pack_total_size()

◆ batch_do_pack()

◆ batch_do_unpack()

◆ batch_finish_unpack()

◆ batch_write_unpacked_to_device()

◆ batch_read_device_to_unpacked()

◆ batch_write_packed_to_device()

◆ batch_read_device_to_packed()

◆ batch_inv_index()

◆ batch_ist_idim_to_linear()

◆ batch_linear_to_ist()

◆ batch_linear_to_idim()

◆ batch_remote_access_start()

◆ batch_remote_access_stop()

◆ batch_copy_data_to()

◆ batch_check_compatibility_with()

◆ batch_build_indices()

◆ dbatch_init_with_memory_3()

◆ dbatch_init_with_memory_2()

◆ dbatch_init_with_memory_1()

◆ dbatch_allocate_unpacked_host()

◆ dbatch_allocate_packed_host()

◆ dbatch_init()

◆ dbatch_pack_copy()

◆ dbatch_unpack_copy()

◆ zbatch_init_with_memory_3()

◆ zbatch_init_with_memory_2()

◆ zbatch_init_with_memory_1()

◆ zbatch_allocate_unpacked_host()

◆ zbatch_allocate_packed_host()

◆ zbatch_init()

◆ zbatch_pack_copy()

◆ zbatch_unpack_copy()

Variable Documentation

◆ batch_not_packed

◆ batch_packed

◆ batch_device_packed

◆ cl_pack_max_buffer_size