Octopus
exponential.F90
Go to the documentation of this file.
1!! Copyright (C) 2002-2006 M. Marques, A. Castro, A. Rubio, G. Bertsch
2!! Copyright (C) 2019 M. Oliveira
3!!
4!! This program is free software; you can redistribute it and/or modify
5!! it under the terms of the GNU General Public License as published by
6!! the Free Software Foundation; either version 2, or (at your option)
7!! any later version.
8!!
9!! This program is distributed in the hope that it will be useful,
10!! but WITHOUT ANY WARRANTY; without even the implied warranty of
11!! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
12!! GNU General Public License for more details.
13!!
14!! You should have received a copy of the GNU General Public License
15!! along with this program; if not, write to the Free Software
16!! Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
17!! 02110-1301, USA.
18!!
19
20#include "global.h"
21
23 use accel_oct_m
24 use batch_oct_m
26 use blas_oct_m
28 use debug_oct_m
29 use global_oct_m
30 use grid_oct_m
34 use, intrinsic :: iso_fortran_env
38 use math_oct_m
39 use mesh_oct_m
44 use parser_oct_m
48 use types_oct_m
50 use xc_oct_m
51
52 implicit none
53
54 private
55 public :: &
61
62 integer, public, parameter :: &
63 EXP_LANCZOS = 2, &
64 exp_taylor = 3, &
66
67 type exponential_t
68 private
69 integer, public :: exp_method
70 real(real64) :: lanczos_tol
71 real(real64) :: chebyshev_tol
72 integer, public :: exp_order
73 integer :: arnoldi_gs
74 logical, public :: full_batch = .false.
75 contains
76 procedure :: apply_batch => exponential_apply_batch
77 procedure :: apply_single => exponential_apply_single
78 procedure :: apply_phi_batch => exponential_apply_phi_batch
79 end type exponential_t
80
81contains
82
83 ! ---------------------------------------------------------
84 subroutine exponential_init(te, namespace, full_batch)
85 type(exponential_t), intent(out) :: te
86 type(namespace_t), intent(in) :: namespace
87 logical, optional, intent(in) :: full_batch
88
89 push_sub(exponential_init)
90
91 !%Variable TDExponentialMethod
92 !%Type integer
93 !%Default taylor
94 !%Section Time-Dependent::Propagation
95 !%Description
96 !% Method used to numerically calculate the exponential of the Hamiltonian,
97 !% a core part of the full algorithm used to approximate the evolution
98 !% operator, specified through the variable <tt>TDPropagator</tt>.
99 !% In the case of using the Magnus method, described below, the action of the exponential
100 !% of the Magnus operator is also calculated through the algorithm specified
101 !% by this variable.
102 !%Option lanczos 2
103 !% Allows for larger time-steps.
104 !% However, the larger the time-step, the longer the computational time per time-step.
105 !% In certain cases, if the time-step is too large, the code will emit a warning
106 !% whenever it considers that the evolution may not be properly proceeding --
107 !% the Lanczos process did not converge. The method consists in a Krylov
108 !% subspace approximation of the action of the exponential
109 !% (see M. Hochbruck and C. Lubich, <i>SIAM J. Numer. Anal.</i> <b>34</b>, 1911 (1997) for details).
110 !% The performance of the method is controlled by the tolerance (controlled by <tt>TDLanczosTol</tt>).
111 !% The smaller the tolerance, the more precisely the exponential
112 !% is calculated, but also the larger the dimension of the Arnoldi
113 !% subspace. If the maximum dimension (currently 200) is not enough to meet the criterion,
114 !% the above-mentioned warning is emitted.
115 !% Be aware that the larger the required dimension of the Krylov subspace, the larger
116 !% the memory required for this method. So if you run out of memory, try to reduce
117 !% the time step.
118 !%Option taylor 3
119 !% This method amounts to a straightforward application of the definition of
120 !% the exponential of an operator, in terms of its Taylor expansion.
121 !%
122 !% <math>\exp_{\rm STD} (-i\delta t H) = \sum_{i=0}^{k} {(-i\delta t)^i\over{i!}} H^i.</math>
123 !%
124 !% The order <i>k</i> is determined by variable <tt>TDExpOrder</tt>.
125 !% Some numerical considerations from <a href=http://www.phys.washington.edu/~bertsch/num3.ps>
126 !% Jeff Giansiracusa and George F. Bertsch</a>
127 !% suggest the 4th order as especially suitable and stable.
128 !%Option chebyshev 4
129 !% In principle, the Chebyshev expansion
130 !% of the exponential represents it more accurately than the canonical or standard expansion.
131 !% <tt>TDChebyshevTol</tt> determines the tolerance to which the expansion is computed.
132 !%
133 !% There exists a closed analytic form for the coefficients of the exponential in terms
134 !% of Chebyshev polynomials:
135 !%
136 !% <math>\exp_{\rm CHEB} \left( -i\delta t H \right) = \sum_{k=0}^{\infty} (2-\delta_{k0})(-i)^{k}J_k(\delta t) T_k(H),</math>
137 !%
138 !% where <math>J_k</math> are the Bessel functions of the first kind, and H has to be previously
139 !% scaled to <math>[-1,1]</math>.
140 !% See H. Tal-Ezer and R. Kosloff, <i>J. Chem. Phys.</i> <b>81</b>,
141 !% 3967 (1984); R. Kosloff, <i>Annu. Rev. Phys. Chem.</i> <b>45</b>, 145 (1994);
142 !% C. W. Clenshaw, <i>MTAC</i> <b>9</b>, 118 (1955).
143 !%End
144 call parse_variable(namespace, 'TDExponentialMethod', exp_taylor, te%exp_method)
145
146 select case (te%exp_method)
147 case (exp_taylor)
148 case (exp_chebyshev)
149 !%Variable TDChebyshevTol
150 !%Type float
151 !%Default 1e-10
152 !%Section Time-Dependent::Propagation
153 !%Description
154 !% An internal tolerance variable for the Chebyshev method. The smaller, the more
155 !% precisely the exponential is calculated and the more iterations are needed, i.e.,
156 !% it becomes more expensive. The expansion is terminated once the error estimate
157 !% is below this tolerance.
158 !%End
159 call parse_variable(namespace, 'TDChebyshevTol', 1e-10_real64, te%chebyshev_tol)
160 if (te%chebyshev_tol <= m_zero) call messages_input_error(namespace, 'TDChebyshevTol')
161 case (exp_lanczos)
162 !%Variable TDLanczosTol
163 !%Type float
164 !%Default 1e-6
165 !%Section Time-Dependent::Propagation
166 !%Description
167 !% An internal tolerance variable for the Lanczos method. The smaller, the more
168 !% precisely the exponential is calculated, and also the bigger the dimension
169 !% of the Krylov subspace needed to perform the algorithm. One should carefully
170 !% make sure that this value is not too big, or else the evolution will be
171 !% wrong.
172 !%End
173 call parse_variable(namespace, 'TDLanczosTol', 1e-6_real64, te%lanczos_tol)
174 if (te%lanczos_tol <= m_zero) call messages_input_error(namespace, 'TDLanczosTol')
175
176 case default
177 call messages_input_error(namespace, 'TDExponentialMethod')
178 end select
179 call messages_print_var_option('TDExponentialMethod', te%exp_method, namespace=namespace)
180
181 if (te%exp_method == exp_taylor) then
182 !%Variable TDExpOrder
183 !%Type integer
184 !%Default 4
185 !%Section Time-Dependent::Propagation
186 !%Description
187 !% For <tt>TDExponentialMethod</tt> = <tt>taylor</tt>,
188 !% the order to which the exponential is expanded.
189 !%End
190 call parse_variable(namespace, 'TDExpOrder', default__tdexporder, te%exp_order)
191 if (te%exp_order < 2) call messages_input_error(namespace, 'TDExpOrder')
192 else
193 if (parse_is_defined(namespace, 'TDExpOrder')) then
194 message(1) = "TDExpOrder is only relevant for TDExponentialMethod = taylor"
195 call messages_fatal(1, namespace=namespace)
196 end if
197 end if
198
199 te%arnoldi_gs = option__arnoldiorthogonalization__cgs
200 if (te%exp_method == exp_lanczos) then
201 !%Variable ArnoldiOrthogonalization
202 !%Type integer
203 !%Section Time-Dependent::Propagation
204 !%Description
205 !% The orthogonalization method used for the Arnoldi procedure.
206 !% Only for TDExponentialMethod = lanczos.
207 !%Option cgs 3
208 !% Classical Gram-Schmidt (CGS) orthogonalization.
209 !% The algorithm is defined in Giraud et al., Computers and Mathematics with Applications 50, 1069 (2005).
210 !%Option drcgs 5
211 !% Classical Gram-Schmidt orthogonalization with double-step reorthogonalization.
212 !% The algorithm is taken from Giraud et al., Computers and Mathematics with Applications 50, 1069 (2005).
213 !% According to this reference, this is much more precise than CGS or MGS algorithms.
214 !%End
215 call parse_variable(namespace, 'ArnoldiOrthogonalization', option__arnoldiorthogonalization__cgs, &
216 te%arnoldi_gs)
217 end if
218
219 ! do lanczos expansion for full batch?
220 te%full_batch = optional_default(full_batch, te%full_batch)
221
222 pop_sub(exponential_init)
223 end subroutine exponential_init
224
225 ! ---------------------------------------------------------
226 subroutine exponential_copy(teo, tei)
227 type(exponential_t), intent(inout) :: teo
228 type(exponential_t), intent(in) :: tei
229
230 push_sub(exponential_copy)
231
232 teo%exp_method = tei%exp_method
233 teo%lanczos_tol = tei%lanczos_tol
234 teo%exp_order = tei%exp_order
235 teo%arnoldi_gs = tei%arnoldi_gs
236
237 pop_sub(exponential_copy)
238 end subroutine exponential_copy
239
240 ! ---------------------------------------------------------
242 subroutine exponential_apply_single(te, namespace, mesh, hm, zpsi, ist, ik, deltat, vmagnus, imag_time)
243 class(exponential_t), intent(inout) :: te
244 type(namespace_t), intent(in) :: namespace
245 class(mesh_t), intent(in) :: mesh
246 type(hamiltonian_elec_t), intent(inout) :: hm
247 integer, intent(in) :: ist
248 integer, intent(in) :: ik
249 complex(real64), contiguous, intent(inout) :: zpsi(:, :)
250 real(real64), intent(in) :: deltat
251 real(real64), optional, intent(in) :: vmagnus(mesh%np, hm%d%nspin, 2)
252 logical, optional, intent(in) :: imag_time
253
254 type(wfs_elec_t) :: psib, inh_psib
255 complex(real64), allocatable :: zpsi_inh(:, :)
256
258
259 !We apply the phase only to np points, and the phase for the np+1 to np_part points
260 !will be treated as a phase correction in the Hamiltonian
261 if (hm%phase%is_allocated()) then
262 call hm%phase%apply_to_single(zpsi, mesh%np, hm%d%dim, ik, .false.)
263 end if
264
265 call wfs_elec_init(psib, hm%d%dim, ist, ist, zpsi, ik)
266
267 if (hamiltonian_elec_inh_term(hm)) then
268 safe_allocate(zpsi_inh(1:mesh%np_part, 1:hm%d%dim))
269 call states_elec_get_state(hm%inh_st, mesh, ist, ik, zpsi_inh(:, :))
270 call wfs_elec_init(inh_psib, hm%d%dim, ist, ist, zpsi_inh, ik)
271 call te%apply_batch(namespace, mesh, hm, psib, deltat, &
272 vmagnus=vmagnus, imag_time=imag_time, inh_psib=inh_psib)
273 call inh_psib%end()
274 safe_deallocate_a(zpsi_inh)
275 else
276 call te%apply_batch(namespace, mesh, hm, psib, deltat, &
277 vmagnus=vmagnus, imag_time=imag_time)
278 end if
279
280 call psib%end()
281
282 if (hm%phase%is_allocated()) then
283 call hm%phase%apply_to_single(zpsi, mesh%np, hm%d%dim, ik, .true.)
284 end if
285
287 end subroutine exponential_apply_single
288
289 ! ---------------------------------------------------------
306 ! ---------------------------------------------------------
307 subroutine exponential_apply_batch(te, namespace, mesh, hm, psib, deltat, psib2, deltat2, vmagnus, imag_time, inh_psib)
308 class(exponential_t), intent(inout) :: te
309 type(namespace_t), intent(in) :: namespace
310 class(mesh_t), intent(in) :: mesh
311 class(hamiltonian_abst_t), intent(inout) :: hm
312 class(batch_t), intent(inout) :: psib
313 real(real64), intent(in) :: deltat
314 class(batch_t), optional, intent(inout) :: psib2
315 real(real64), optional, intent(in) :: deltat2
316 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
317 logical, optional, intent(in) :: imag_time
318 class(batch_t), optional, intent(inout) :: inh_psib
319
320 complex(real64) :: deltat_, deltat2_
321 class(chebyshev_function_t), pointer :: chebyshev_function, chebyshev_function_dt2
322 logical :: imag_time_
323
325 call profiling_in("EXPONENTIAL_BATCH")
326
327 assert(psib%type() == type_cmplx)
328
329 assert(present(psib2) .eqv. present(deltat2))
330 if (present(inh_psib)) then
331 assert(inh_psib%nst == psib%nst)
332 end if
333
334 if (present(vmagnus)) then
335 select type(hm)
336 class is (hamiltonian_elec_t)
337 assert(size(vmagnus, 1) >= mesh%np)
338 assert(size(vmagnus, 2) == hm%d%nspin)
339 assert(size(vmagnus, 3) == 2)
340 class default
341 write(message(1), '(a)') 'Magnus operators only implemented for electrons at the moment'
342 call messages_fatal(1, namespace=namespace)
343 end select
344 end if
345
346 deltat2_ = cmplx(optional_default(deltat2, m_zero), m_zero, real64)
347
348 imag_time_ = optional_default(imag_time, .false.)
349 if (imag_time_) then
350 deltat_ = -m_zi*deltat
351 if (present(deltat2)) deltat2_ = m_zi*deltat2
352 else
353 deltat_ = cmplx(deltat, m_zero, real64)
354 if (present(deltat2)) deltat2_ = cmplx(deltat2, m_zero, real64)
355 end if
356
357 if (.not. hm%is_hermitian() .and. te%exp_method == exp_chebyshev) then
358 write(message(1), '(a)') 'The Chebyshev expansion cannot be used for non-Hermitian operators.'
359 write(message(2), '(a)') 'Please use the Lanczos exponentiation scheme ("TDExponentialMethod = lanczos")'
360 write(message(3), '(a)') 'or the Taylor expansion ("TDExponentialMethod = taylor") method.'
361 call messages_fatal(3, namespace=namespace)
362 end if
363
364 select case (te%exp_method)
365 case (exp_taylor)
366 ! Note that delttat2_ is only initialized if deltat2 is present.
367 if (present(deltat2)) then
368 call exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat_, &
369 psib2, deltat2_, vmagnus)
370 else
371 call exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat_, &
372 vmagnus=vmagnus)
373 end if
374 if (present(inh_psib)) then
375 if (present(deltat2)) then
376 call exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat_, &
377 psib2, deltat2_, vmagnus, inh_psib)
378 else
379 call exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat_, &
380 vmagnus=vmagnus, inh_psib=inh_psib)
381 end if
382 end if
383
384 case (exp_lanczos)
385 if (present(psib2)) call psib%copy_data_to(mesh%np, psib2)
386 call exponential_lanczos_batch(te, namespace, mesh, hm, psib, deltat_, vmagnus)
387 if (present(inh_psib)) then
388 call exponential_lanczos_batch(te, namespace, mesh, hm, psib, deltat_, vmagnus, inh_psib)
389 end if
390 if (present(psib2)) then
391 call exponential_lanczos_batch(te, namespace, mesh, hm, psib2, deltat2_, vmagnus)
392 if (present(inh_psib)) then
393 call exponential_lanczos_batch(te, namespace, mesh, hm, psib2, deltat2_, vmagnus, inh_psib)
394 end if
395 end if
396
397 case (exp_chebyshev)
398 if (present(inh_psib)) then
399 write(message(1), '(a)') 'Chebyshev exponential ("TDExponentialMethod = chebyshev")'
400 write(message(2), '(a)') 'with inhomogeneous term is not implemented'
401 call messages_fatal(2, namespace=namespace)
402 end if
403 ! initialize classes for computing coefficients
404 if (imag_time_) then
405 chebyshev_function => chebyshev_exp_imagtime_t(hm%spectral_half_span, hm%spectral_middle_point, deltat)
406 if (present(psib2)) then
407 chebyshev_function_dt2 => chebyshev_exp_imagtime_t(hm%spectral_half_span, hm%spectral_middle_point, deltat2)
408 end if
409 else
410 chebyshev_function => chebyshev_exp_t(hm%spectral_half_span, hm%spectral_middle_point, deltat)
411 if (present(psib2)) then
412 chebyshev_function_dt2 => chebyshev_exp_t(hm%spectral_half_span, hm%spectral_middle_point, deltat2)
413 end if
414 end if
415 if (present(psib2)) call psib%copy_data_to(mesh%np, psib2)
416 call exponential_cheby_batch(te, namespace, mesh, hm, psib, chebyshev_function, vmagnus)
417 deallocate(chebyshev_function)
418 if (present(psib2)) then
419 call exponential_cheby_batch(te, namespace, mesh, hm, psib2, chebyshev_function_dt2, vmagnus)
420 deallocate(chebyshev_function_dt2)
421 end if
422
423 end select
424
425 call profiling_out("EXPONENTIAL_BATCH")
427 end subroutine exponential_apply_batch
428
429 ! ---------------------------------------------------------
430 subroutine exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat, psib2, deltat2, vmagnus, inh_psib, phik_shift)
431 type(exponential_t), intent(inout) :: te
432 type(namespace_t), intent(in) :: namespace
433 class(mesh_t), intent(in) :: mesh
434 class(hamiltonian_abst_t), intent(inout) :: hm
435 class(batch_t), intent(inout) :: psib
436 complex(real64), intent(in) :: deltat
437 class(batch_t), optional, intent(inout) :: psib2
438 complex(real64), optional, intent(in) :: deltat2
439 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
440 class(batch_t), optional, intent(inout) :: inh_psib
441 integer, optional, intent(in) :: phik_shift
442
443 complex(real64) :: zfact, zfact2
444 integer :: iter, denom, phik_shift_
445 logical :: zfact_is_real
446 class(batch_t), allocatable :: psi1b, hpsi1b
447
449 call profiling_in("EXP_TAYLOR_BATCH")
450
451 call psib%clone_to(psi1b)
452 call psib%clone_to(hpsi1b)
453
454 zfact = m_z1
455 zfact2 = m_z1
456 zfact_is_real = abs(deltat-real(deltat)) < m_epsilon
457
458 if (present(psib2)) call psib%copy_data_to(mesh%np, psib2)
459
460 if (present(inh_psib)) then
461 zfact = zfact*deltat
462 call batch_axpy(mesh%np, real(zfact), inh_psib, psib)
463
464 if (present(psib2)) then
465 zfact2 = zfact2*deltat2
466 call batch_axpy(mesh%np, real(zfact2), inh_psib, psib2)
467 end if
468 end if
469
470 ! shift the denominator by this shift for the phi_k functions
471 phik_shift_ = optional_default(phik_shift, 0)
472
473 do iter = 1, te%exp_order
474 denom = iter+phik_shift_
475 if (present(inh_psib)) denom = denom + 1
476 zfact = zfact*(-m_zi*deltat)/denom
477 if (present(deltat2)) zfact2 = zfact2*(-m_zi*deltat2)/denom
478 zfact_is_real = .not. zfact_is_real
479 ! FIXME: need a test here for runaway exponential, e.g. for too large dt.
480 ! in runaway case the problem is really hard to trace back: the positions
481 ! go haywire on the first step of dynamics (often NaN) and with debugging options
482 ! the code stops in ZAXPY below without saying why.
483
484 if (iter /= 1) then
485 call operate_batch(hm, namespace, mesh, psi1b, hpsi1b, vmagnus)
486 else
487 if (present(inh_psib)) then
488 call operate_batch(hm, namespace, mesh, inh_psib, hpsi1b, vmagnus)
489 else
490 call operate_batch(hm, namespace, mesh, psib, hpsi1b, vmagnus)
491 end if
492 end if
493
494 if (zfact_is_real) then
495 call batch_axpy(mesh%np, real(zfact), hpsi1b, psib)
496 if (present(psib2)) call batch_axpy(mesh%np, real(zfact2), hpsi1b, psib2)
497 else
498 call batch_axpy(mesh%np, zfact, hpsi1b, psib)
499 if (present(psib2)) call batch_axpy(mesh%np, zfact2, hpsi1b, psib2)
500 end if
501
502 if (iter /= te%exp_order) call hpsi1b%copy_data_to(mesh%np, psi1b)
503
504 end do
505
506 call psi1b%end()
507 call hpsi1b%end()
508 safe_deallocate_a(psi1b)
509 safe_deallocate_a(hpsi1b)
510
511 call profiling_out("EXP_TAYLOR_BATCH")
514
515 subroutine exponential_lanczos_batch(te, namespace, mesh, hm, psib, deltat, vmagnus, inh_psib)
516 type(exponential_t), intent(inout) :: te
517 type(namespace_t), intent(in) :: namespace
518 class(mesh_t), intent(in) :: mesh
519 class(hamiltonian_abst_t), intent(inout) :: hm
520 class(batch_t), intent(inout) :: psib
521 complex(real64), intent(in) :: deltat
522 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
523 class(batch_t), optional, intent(in) :: inh_psib
524
525 class(batch_t), allocatable :: tmpb
526
528
529 if (present(inh_psib)) then
530 call inh_psib%clone_to(tmpb, copy_data=.true.)
531 ! psib = psib + deltat * phi1(-i*deltat*H) inh_psib
532 call exponential_lanczos_function_batch(te, namespace, mesh, hm, tmpb, deltat, phi1, vmagnus)
533 call batch_axpy(mesh%np, real(deltat, real64), tmpb, psib)
534 call tmpb%end()
535 else
536 call exponential_lanczos_function_batch(te, namespace, mesh, hm, psib, deltat, exponential, vmagnus)
537 end if
538
540 end subroutine exponential_lanczos_batch
541
542 ! ---------------------------------------------------------
554 subroutine exponential_lanczos_function_batch(te, namespace, mesh, hm, psib, deltat, fun, vmagnus)
555 type(exponential_t), intent(inout) :: te
556 type(namespace_t), intent(in) :: namespace
557 class(mesh_t), intent(in) :: mesh
558 class(hamiltonian_abst_t), intent(inout) :: hm
559 class(batch_t), intent(inout) :: psib
560 complex(real64), intent(in) :: deltat
561 interface
562 complex(real64) function fun(z)
563 import real64
564 complex(real64), intent(in) :: z
565 end
566 end interface
567 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
568
569 integer :: iter, l, ii, ist, max_initialized
570 complex(real64), allocatable :: hamilt(:,:,:), expo(:,:,:)
571 real(real64), allocatable :: beta(:), res(:), norm(:)
572 integer, parameter :: max_order = 200
573 type(batch_p_t), allocatable :: vb(:) ! Krylov subspace vectors
574
576 call profiling_in("EXP_LANCZOS_FUN_BATCH")
577
578 if (te%exp_method /= exp_lanczos) then
579 message(1) = "The exponential method needs to be set to Lanzcos (TDExponentialMethod=lanczos)."
580 call messages_fatal(1)
581 end if
582
583 safe_allocate(beta(1:psib%nst))
584 safe_allocate(res(1:psib%nst))
585 safe_allocate(norm(1:psib%nst))
586 safe_allocate(vb(1:max_order))
587 call psib%clone_to(vb(1)%p)
588 max_initialized = 1
589
590 call psib%copy_data_to(mesh%np, vb(1)%p, async=.true.)
591 call mesh_batch_nrm2(mesh, vb(1)%p, beta)
592
593 if (te%full_batch) beta = norm2(beta)
594
595 ! If we have a null vector, no need to compute the action of the exponential.
596 if (all(abs(beta) <= 1.0e-12_real64)) then
597 safe_deallocate_a(beta)
598 safe_deallocate_a(res)
599 safe_deallocate_a(norm)
600 call vb(1)%p%end()
601 safe_deallocate_a(vb)
602 call profiling_out("EXP_LANCZOS_FUN_BATCH")
604 return
605 end if
606
607 call batch_scal(mesh%np, m_one/beta, vb(1)%p, a_full = .false.)
608
609 safe_allocate(hamilt(1:max_order+1, 1:max_order+1, 1:psib%nst))
610 safe_allocate( expo(1:max_order+1, 1:max_order+1, 1:psib%nst))
611
612 ! This is the Lanczos loop...
613 do iter = 1, max_order-1
614 call psib%clone_to(vb(iter + 1)%p)
615 max_initialized = iter + 1
616
617 ! to apply the Hamiltonian
618 call operate_batch(hm, namespace, mesh, vb(iter)%p, vb(iter+1)%p, vmagnus)
619
620 ! We use either the Lanczos method (Hermitian case) or the Arnoldi method
621 if (hm%is_hermitian()) then
622 l = max(1, iter - 1)
623 hamilt(1:max(l-1, 1), iter, 1:psib%nst) = m_zero
624 else
625 l = 1
626 if (iter > 2) then
627 hamilt(iter, 1:iter-2, 1:psib%nst) = m_zero
628 end if
629 end if
630
631 ! Orthogonalize against previous vectors
632 call zmesh_batch_orthogonalization(mesh, iter - l + 1, vb(l:iter), vb(iter+1)%p, &
633 normalize = .false., overlap = hamilt(l:iter, iter, 1:psib%nst), norm = hamilt(iter + 1, iter, 1:psib%nst), &
634 gs_scheme = te%arnoldi_gs, full_batch = te%full_batch)
635
636 ! We now need to compute exp(Hm), where Hm is the projection of the linear transformation
637 ! of the Hamiltonian onto the Krylov subspace Km
638 ! See Eq. 4
639 !
640 ! Note that in the Hermitian case, we use the Lanczos algorithm that requires
641 ! only a tridiagonal matrix. Else we have an upper Hessenberg matrix.
642 do ii = 1, psib%nst
643 call zlalg_matrix_function(iter, -m_zi*deltat, hamilt(:,:,ii), expo(:,:,ii), fun, &
644 hm%is_hermitian(), tridiagonal=hm%is_hermitian())
645 res(ii) = abs(hamilt(iter + 1, iter, ii) * abs(expo(iter, 1, ii)))
646 end do !ii
647
648 ! We now estimate the error we made. This is given by the formula denoted Er2 in Sec. 5.2
649 if (all(abs(hamilt(iter + 1, iter, :)) < 1.0e4_real64*m_epsilon)) exit ! "Happy breakdown"
650 ! We normalize only if the norm is non-zero
651 ! see http://www.netlib.org/utk/people/JackDongarra/etemplates/node216.html#alg:arn0
652 norm = m_one
653 do ist = 1, psib%nst
654 if (abs(hamilt(iter + 1, iter, ist)) >= 1.0e4_real64 * m_epsilon) then
655 norm(ist) = m_one / abs(hamilt(iter + 1, iter, ist))
656 end if
657 end do
658
659 call batch_scal(mesh%np, norm, vb(iter+1)%p, a_full = .false.)
660
661 if (iter > 3 .and. all(res < te%lanczos_tol)) exit
662
663 end do !iter
664
665 if (iter == max_order) then ! Here one should consider the possibility of the happy breakdown.
666 write(message(1),'(a,i5,a,es9.2)') 'Lanczos exponential expansion did not converge after ', iter, &
667 ' iterations. Residual: ', maxval(res)
668 call messages_warning(1, namespace=namespace)
669 else
670 write(message(1),'(a,i5)') 'Debug: Lanczos exponential iterations: ', iter
671 call messages_info(1, namespace=namespace, debug_only=.true.)
672 end if
673
674 ! See Eq. 4 for the expression here
675 ! zpsi = nrm * V * expo(1:iter, 1) = nrm * V * expo * V^(T) * zpsi
676 call batch_scal(mesh%np, expo(1,1,1:psib%nst), psib, a_full = .false.)
677 ! TODO: We should have a routine batch_gemv for improved performance (see #1070 on gitlab)
678 do ii = 2, iter
679 call batch_axpy(mesh%np, beta(1:psib%nst)*expo(ii,1,1:psib%nst), vb(ii)%p, psib, a_full = .false.)
680 ! In order to apply the two exponentials, we must store the eigenvalues and eigenvectors given by zlalg_exp
681 ! And to recontruct here the exp(i*dt*H) for deltat2
682 end do
683
684 do ii = 1, max_initialized
685 call vb(ii)%p%end()
686 end do
687
688 safe_deallocate_a(vb)
689 safe_deallocate_a(hamilt)
690 safe_deallocate_a(expo)
691 safe_deallocate_a(beta)
692 safe_deallocate_a(res)
693 safe_deallocate_a(norm)
694
695 call accel_finish()
696
697 call profiling_out("EXP_LANCZOS_FUN_BATCH")
698
701
702
703 ! ---------------------------------------------------------
715 subroutine exponential_cheby_batch(te, namespace, mesh, hm, psib, chebyshev_function, vmagnus)
716 type(exponential_t), intent(inout) :: te
717 type(namespace_t), intent(in) :: namespace
718 class(mesh_t), intent(in) :: mesh
719 class(hamiltonian_abst_t), intent(inout) :: hm
720 class(batch_t), intent(inout) :: psib
721 class(chebyshev_function_t), intent(in) :: chebyshev_function
722 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
723
724 integer :: j, order_needed
725 complex(real64) :: coefficient
726 complex(real64), allocatable :: coefficients(:)
727 real(real64) :: error
728 class(batch_t), allocatable, target :: psi0, psi1, psi2
729 class(batch_t), pointer :: psi_n, psi_n1, psi_n2
730 integer, parameter :: max_order = 200
731
733 call profiling_in("EXP_CHEBY_BATCH")
734
735 call psib%clone_to(psi0)
736 call psib%clone_to(psi1)
737 call psib%clone_to(psi2)
738 call psib%copy_data_to(mesh%np, psi0)
739
740 order_needed = max_order
741 do j = 1, max_order
742 error = chebyshev_function%get_error(j)
743 if (error > m_zero .and. error < te%chebyshev_tol) then
744 order_needed = j
745 exit
746 end if
747 end do
748
749 call chebyshev_function%get_coefficients(j, coefficients)
750
751 ! zero-order term
752 call batch_scal(mesh%np, coefficients(0), psib)
753 ! first-order term
754 ! shifted Hamiltonian
755 call operate_batch(hm, namespace, mesh, psi0, psi1, vmagnus)
756 call batch_axpy(mesh%np, -hm%spectral_middle_point, psi0, psi1)
757 call batch_scal(mesh%np, m_one/hm%spectral_half_span, psi1)
758 ! accumulate result
759 call batch_axpy(mesh%np, coefficients(1), psi1, psib)
760
761 ! use pointers to avoid copies
762 psi_n => psi2
763 psi_n1 => psi1
764 psi_n2 => psi0
765
766 do j = 2, order_needed
767 ! compute shifted Hamiltonian and Chebyshev recurrence formula
768 call operate_batch(hm, namespace, mesh, psi_n1, psi_n, vmagnus)
769 call batch_axpy(mesh%np, -hm%spectral_middle_point, psi_n1, psi_n)
770 call batch_xpay(mesh%np, psi_n2, -m_two/hm%spectral_half_span, psi_n)
771 call batch_scal(mesh%np, -m_one, psi_n)
772
773 ! accumulate result
774 call batch_axpy(mesh%np, coefficients(j), psi_n, psib)
775
776 ! shift pointers for the three-term recurrence, this avoids copies
777 if (mod(j, 3) == 2) then
778 psi_n => psi0
779 psi_n1 => psi2
780 psi_n2 => psi1
781 else if (mod(j, 3) == 1) then
782 psi_n => psi2
783 psi_n1 => psi1
784 psi_n2 => psi0
785 else
786 psi_n => psi1
787 psi_n1 => psi0
788 psi_n2 => psi2
789 end if
790 end do
791
792 if (order_needed == max_order) then
793 write(message(1),'(a,i5,a,es9.2)') 'Chebyshev exponential expansion did not converge after ', j, &
794 ' iterations. Coefficient: ', coefficient
795 call messages_warning(1, namespace=namespace)
796 else
797 write(message(1),'(a,i5)') 'Debug: Chebyshev exponential iterations: ', j
798 call messages_info(1, namespace=namespace, debug_only=.true.)
799 end if
800
801 call psi0%end()
802 call psi1%end()
803 call psi2%end()
804 safe_deallocate_a(psi0)
805 safe_deallocate_a(psi1)
806 safe_deallocate_a(psi2)
807
808 safe_deallocate_a(coefficients)
809
810 call profiling_out("EXP_CHEBY_BATCH")
811
813 end subroutine exponential_cheby_batch
814
815
816 subroutine operate_batch(hm, namespace, mesh, psib, hpsib, vmagnus)
817 class(hamiltonian_abst_t), intent(inout) :: hm
818 type(namespace_t), intent(in) :: namespace
819 class(mesh_t), intent(in) :: mesh
820 class(batch_t), intent(inout) :: psib
821 class(batch_t), intent(inout) :: hpsib
822 real(real64), optional, intent(in) :: vmagnus(:, :, :)
823
824 push_sub(operate_batch)
825
826 if (present(vmagnus)) then
827 call hm%zmagnus_apply(namespace, mesh, psib, hpsib, vmagnus)
828 else
829 call hm%zapply(namespace, mesh, psib, hpsib)
830 end if
831
832 pop_sub(operate_batch)
833 end subroutine operate_batch
834
835 ! ---------------------------------------------------------
839 subroutine exponential_apply_all(te, namespace, gr, hm, st, deltat, order)
840 type(exponential_t), intent(inout) :: te
841 type(namespace_t), intent(in) :: namespace
842 type(grid_t), intent(inout) :: gr
843 type(hamiltonian_elec_t), intent(inout) :: hm
844 type(states_elec_t), intent(inout) :: st
845 real(real64), intent(in) :: deltat
846 integer, optional, intent(inout) :: order
847
848 integer :: ik, ib, i
849 real(real64) :: zfact
850
851 type(states_elec_t) :: st1, hst1
852
853 push_sub(exponential_apply_all)
854
855 assert(te%exp_method == exp_taylor)
856
857 call states_elec_copy(st1, st)
858 call states_elec_copy(hst1, st)
859
860 zfact = m_one
861 do i = 1, te%exp_order
862 zfact = zfact * deltat / i
863
864 if (i == 1) then
865 call zhamiltonian_elec_apply_all(hm, namespace, gr, st, hst1)
866 else
867 call zhamiltonian_elec_apply_all(hm, namespace, gr, st1, hst1)
868 end if
869
870 do ik = st%d%kpt%start, st%d%kpt%end
871 do ib = st%group%block_start, st%group%block_end
872 call batch_set_zero(st1%group%psib(ib, ik))
873 call batch_axpy(gr%np, -m_zi, hst1%group%psib(ib, ik), st1%group%psib(ib, ik))
874 call batch_axpy(gr%np, zfact, st1%group%psib(ib, ik), st%group%psib(ib, ik))
875 end do
876 end do
877
878 end do
879 ! End of Taylor expansion loop.
880
881 call states_elec_end(st1)
882 call states_elec_end(hst1)
883
884 ! We now add the inhomogeneous part, if present.
885 if (hamiltonian_elec_inh_term(hm)) then
886 !write(*, *) 'Now we apply the inhomogeneous term...'
887
888 call states_elec_copy(st1, hm%inh_st)
889 call states_elec_copy(hst1, hm%inh_st)
890
891
892 do ik = st%d%kpt%start, st%d%kpt%end
893 do ib = st%group%block_start, st%group%block_end
894 call batch_axpy(gr%np, deltat, st1%group%psib(ib, ik), st%group%psib(ib, ik))
895 end do
896 end do
897
898 zfact = m_one
899 do i = 1, te%exp_order
900 zfact = zfact * deltat / (i+1)
901
902 if (i == 1) then
903 call zhamiltonian_elec_apply_all(hm, namespace, gr, hm%inh_st, hst1)
904 else
905 call zhamiltonian_elec_apply_all(hm, namespace, gr, st1, hst1)
906 end if
907
908 do ik = st%d%kpt%start, st%d%kpt%end
909 do ib = st%group%block_start, st%group%block_end
910 call batch_set_zero(st1%group%psib(ib, ik))
911 call batch_axpy(gr%np, -m_zi, hst1%group%psib(ib, ik), st1%group%psib(ib, ik))
912 call batch_axpy(gr%np, deltat * zfact, st1%group%psib(ib, ik), st%group%psib(ib, ik))
913 end do
914 end do
915
916 end do
917
918 call states_elec_end(st1)
919 call states_elec_end(hst1)
920
921 end if
922
923 if (present(order)) order = te%exp_order*st%nik*st%nst ! This should be the correct number
924
925 pop_sub(exponential_apply_all)
926 end subroutine exponential_apply_all
927
928 subroutine exponential_apply_phi_batch(te, namespace, mesh, hm, psib, deltat, k, vmagnus)
929 class(exponential_t), intent(inout) :: te
930 type(namespace_t), intent(in) :: namespace
931 class(mesh_t), intent(in) :: mesh
932 class(hamiltonian_abst_t), intent(inout) :: hm
933 class(batch_t), intent(inout) :: psib
934 real(real64), intent(in) :: deltat
935 integer, intent(in) :: k
936 real(real64), optional, intent(in) :: vmagnus(:,:,:) !(mesh%np, hm%d%nspin, 2)
937
938 class(chebyshev_function_t), pointer :: chebyshev_function
939 complex(real64) :: deltat_
940
941 push_sub_with_profile(exponential_apply_phi_batch)
942
943 assert(psib%type() == type_cmplx)
944 if (present(vmagnus)) then
945 select type(hm)
946 class is (hamiltonian_elec_t)
947 assert(size(vmagnus, 1) >= mesh%np)
948 assert(size(vmagnus, 2) == hm%d%nspin)
949 assert(size(vmagnus, 3) == 2)
950 class default
951 write(message(1), '(a)') 'Magnus operators only implemented for electrons at the moment'
952 call messages_fatal(1, namespace=namespace)
953 end select
954 end if
955
956 if (.not. hm%is_hermitian() .and. te%exp_method == exp_chebyshev) then
957 write(message(1), '(a)') 'The Chebyshev expansion for the exponential will only converge if the imaginary'
958 write(message(2), '(a)') 'eigenvalues are small enough compared to the span of the real eigenvalues,'
959 write(message(3), '(a)') 'i.e., for ratios smaller than about 1e-3.'
960 write(message(4), '(a)') 'The Lanczos method ("TDExponentialMethod = lanczos") is guaranteed to'
961 write(message(5), '(a)') 'always converge in this case.'
962 call messages_warning(5, namespace=namespace)
963 end if
964
965 deltat_ = cmplx(deltat, m_zero, real64)
966
967 select case (te%exp_method)
968 case (exp_taylor)
969 call exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat_, vmagnus=vmagnus, phik_shift=k)
970
971 case (exp_lanczos)
972 if (k == 1) then
973 call exponential_lanczos_function_batch(te, namespace, mesh, hm, psib, deltat_, phi1, vmagnus)
974 else if (k == 2) then
975 call exponential_lanczos_function_batch(te, namespace, mesh, hm, psib, deltat_, phi2, vmagnus)
976 else
977 write(message(1), '(a)') 'Lanczos expansion not implemented for phi_k, k > 2'
978 call messages_fatal(1, namespace=namespace)
979 end if
980
981 case (exp_chebyshev)
982 if (k == 1) then
983 chebyshev_function => chebyshev_numerical_t(hm%spectral_half_span, hm%spectral_middle_point, deltat, phi1)
984 else if (k == 2) then
985 chebyshev_function => chebyshev_numerical_t(hm%spectral_half_span, hm%spectral_middle_point, deltat, phi2)
986 else
987 write(message(1), '(a)') 'Chebyshev expansion not implemented for phi_k, k > 2'
988 call messages_fatal(1, namespace=namespace)
989 end if
990 call exponential_cheby_batch(te, namespace, mesh, hm, psib, chebyshev_function, vmagnus)
991 deallocate(chebyshev_function)
992
993 end select
994 pop_sub_with_profile(exponential_apply_phi_batch)
995 end subroutine exponential_apply_phi_batch
996
997end module exponential_oct_m
998
999!! Local Variables:
1000!! mode: f90
1001!! coding: utf-8
1002!! End:
batchified version of the BLAS axpy routine:
Definition: batch_ops.F90:156
scale a batch by a constant or vector
Definition: batch_ops.F90:164
batchified version of
Definition: batch_ops.F90:172
subroutine, public accel_finish()
Definition: accel.F90:985
This module implements batches of mesh functions.
Definition: batch.F90:135
This module implements common operations on batches of mesh functions.
Definition: batch_ops.F90:118
subroutine, public batch_set_zero(this, np, async)
fill all mesh functions of the batch with zero
Definition: batch_ops.F90:244
This module contains interfaces for BLAS routines You should not use these routines directly....
Definition: blas.F90:120
subroutine, public exponential_copy(teo, tei)
subroutine, public exponential_apply_all(te, namespace, gr, hm, st, deltat, order)
Note that this routine not only computes the exponential, but also an extra term if there is a inhomo...
subroutine operate_batch(hm, namespace, mesh, psib, hpsib, vmagnus)
subroutine exponential_apply_phi_batch(te, namespace, mesh, hm, psib, deltat, k, vmagnus)
subroutine exponential_cheby_batch(te, namespace, mesh, hm, psib, chebyshev_function, vmagnus)
Calculates the exponential of the Hamiltonian through an expansion in Chebyshev polynomials.
subroutine, public exponential_init(te, namespace, full_batch)
integer, parameter, public exp_taylor
subroutine exponential_lanczos_batch(te, namespace, mesh, hm, psib, deltat, vmagnus, inh_psib)
subroutine exponential_apply_single(te, namespace, mesh, hm, zpsi, ist, ik, deltat, vmagnus, imag_time)
Wrapper to batchified routine for applying exponential to an array.
subroutine exponential_taylor_series_batch(te, namespace, mesh, hm, psib, deltat, psib2, deltat2, vmagnus, inh_psib, phik_shift)
subroutine exponential_apply_batch(te, namespace, mesh, hm, psib, deltat, psib2, deltat2, vmagnus, imag_time, inh_psib)
This routine performs the operation:
subroutine, public exponential_lanczos_function_batch(te, namespace, mesh, hm, psib, deltat, fun, vmagnus)
Compute fun(H) psib, i.e. the application of a function of the Hamiltonian to a batch.
integer, parameter, public exp_chebyshev
real(real64), parameter, public m_two
Definition: global.F90:193
real(real64), parameter, public m_zero
Definition: global.F90:191
complex(real64), parameter, public m_zi
Definition: global.F90:205
real(real64), parameter, public m_epsilon
Definition: global.F90:207
complex(real64), parameter, public m_z1
Definition: global.F90:202
real(real64), parameter, public m_one
Definition: global.F90:192
This module implements the underlying real-space grid.
Definition: grid.F90:119
This module defines an abstract class for Hamiltonians.
subroutine, public zhamiltonian_elec_apply_all(hm, namespace, gr, st, hst)
pure logical function, public hamiltonian_elec_inh_term(hm)
subroutine, public zlalg_matrix_function(n, factor, a, fun_a, fun, hermitian, tridiagonal)
This routine calculates a function of a matrix by using an eigenvalue decomposition.
Definition: lalg_adv.F90:2152
This module is intended to contain "only mathematical" functions and procedures.
Definition: math.F90:117
complex(real64) pure function, public phi2(z)
Compute phi2(z) = (phi1(z)-1)/z = (exp(z) - z - 1)/z^2.
Definition: math.F90:976
complex(real64) pure function, public exponential(z)
Wrapper for exponential.
Definition: math.F90:945
complex(real64) pure function, public phi1(z)
Compute phi1(z) = (exp(z)-1)/z.
Definition: math.F90:957
This module defines functions over batches of mesh functions.
Definition: mesh_batch.F90:118
subroutine, public mesh_batch_nrm2(mesh, aa, nrm2, reduce)
Calculate the norms (norm2, not the square!) of a batch of mesh functions.
Definition: mesh_batch.F90:179
subroutine, public zmesh_batch_orthogonalization(mesh, nst, psib, phib, normalize, overlap, norm, gs_scheme, full_batch)
Orthonormalizes states of phib to the orbitals of nst batches of psi.
This module defines various routines, operating on mesh functions.
This module defines the meshes, which are used in Octopus.
Definition: mesh.F90:120
subroutine, public messages_warning(no_lines, all_nodes, namespace)
Definition: messages.F90:525
character(len=256), dimension(max_lines), public message
to be output by fatal, warning
Definition: messages.F90:162
subroutine, public messages_fatal(no_lines, only_root_writes, namespace)
Definition: messages.F90:410
subroutine, public messages_input_error(namespace, var, details, row, column)
Definition: messages.F90:691
subroutine, public messages_info(no_lines, iunit, debug_only, stress, all_nodes, namespace)
Definition: messages.F90:594
logical function, public parse_is_defined(namespace, name)
Definition: parser.F90:455
subroutine, public profiling_out(label)
Increment out counter and sum up difference between entry and exit time.
Definition: profiling.F90:625
subroutine, public profiling_in(label, exclude)
Increment in counter and save entry time.
Definition: profiling.F90:554
subroutine, public states_elec_end(st)
finalize the states_elec_t object
subroutine, public states_elec_copy(stout, stin, exclude_wfns, exclude_eigenval, special)
make a (selective) copy of a states_elec_t object
type(type_t), public type_cmplx
Definition: types.F90:136
Definition: xc.F90:116
Class defining batches of mesh functions.
Definition: batch.F90:161
Description of the grid, containing information on derivatives, stencil, and symmetries.
Definition: grid.F90:171
The abstract Hamiltonian class defines a skeleton for specific implementations.
Describes mesh distribution to nodes.
Definition: mesh.F90:187
The states_elec_t class contains all electronic wave functions.
batches of electronic states
Definition: wfs_elec.F90:141
int true(void)