Anasazi Version of the Day
Loading...
Searching...
No Matches
mypdsaup2.f
1c @HEADER
2c *****************************************************************************
3c Anasazi: Block Eigensolvers Package
4c
5c Copyright 2004 NTESS and the Anasazi contributors.
6c SPDX-License-Identifier: BSD-3-Clause
7c *****************************************************************************
8c @HEADER
9
10c This software is a result of the research described in the report
11c
12c "A comparison of algorithms for modal analysis in the absence
13c of a sparse direct method", P. Arbenz, R. Lehoucq, and U. Hetmaniuk,
14c Sandia National Laboratories, Technical report SAND2003-1028J.
15c
16c It is based on the Epetra, AztecOO, and ML packages defined in the Trilinos
17c framework ( http://trilinos.org/ ).
18c-----------------------------------------------------------------------
19c\BeginDoc
20c
21c\Name: mypdsaup2
22c
23c Message Passing Layer: MPI
24c
25c\Description:
26c Intermediate level interface called by pdsaupd.
27c
28c\Usage:
29c call pdsaup2
30c ( COMM, IDO, BMAT, N, WHICH, NEV, NP, TOL, RESID, MODE, IUPD,
31c ISHIFT, MXITER, V, LDV, H, LDH, RITZ, BOUNDS, Q, LDQ, WORKL,
32c IPNTR, WORKD, INFO, VERBOSE, USRCHK, UNCONV )
33c
34c\Arguments
35c
36c COMM, IDO, BMAT, N, WHICH, NEV, TOL, RESID: same as defined in pdsaupd.
37c MODE, ISHIFT, MXITER: see the definition of IPARAM in pdsaupd.
38c
39c NP Integer. (INPUT/OUTPUT)
40c Contains the number of implicit shifts to apply during
41c each Arnoldi/Lanczos iteration.
42c If ISHIFT=1, NP is adjusted dynamically at each iteration
43c to accelerate convergence and prevent stagnation.
44c This is also roughly equal to the number of matrix-vector
45c products (involving the operator OP) per Arnoldi iteration.
46c The logic for adjusting is contained within the current
47c subroutine.
48c If ISHIFT=0, NP is the number of shifts the user needs
49c to provide via reverse comunication. 0 < NP < NCV-NEV.
50c NP may be less than NCV-NEV since a leading block of the current
51c upper Tridiagonal matrix has split off and contains "unwanted"
52c Ritz values.
53c Upon termination of the IRA iteration, NP contains the number
54c of "converged" wanted Ritz values.
55c
56c IUPD Integer. (INPUT)
57c IUPD .EQ. 0: use explicit restart instead implicit update.
58c IUPD .NE. 0: use implicit update.
59c
60c V Double precision N by (NEV+NP) array. (INPUT/OUTPUT)
61c The Lanczos basis vectors.
62c
63c LDV Integer. (INPUT)
64c Leading dimension of V exactly as declared in the calling
65c program.
66c
67c H Double precision (NEV+NP) by 2 array. (OUTPUT)
68c H is used to store the generated symmetric tridiagonal matrix
69c The subdiagonal is stored in the first column of H starting
70c at H(2,1). The main diagonal is stored in the second column
71c of H starting at H(1,2). If pdsaup2 converges store the
72c B-norm of the final residual vector in H(1,1).
73c
74c LDH Integer. (INPUT)
75c Leading dimension of H exactly as declared in the calling
76c program.
77c
78c RITZ Double precision array of length NEV+NP. (OUTPUT)
79c RITZ(1:NEV) contains the computed Ritz values of OP.
80c
81c BOUNDS Double precision array of length NEV+NP. (OUTPUT)
82c BOUNDS(1:NEV) contain the error bounds corresponding to RITZ.
83c
84c Q Double precision (NEV+NP) by (NEV+NP) array. (WORKSPACE)
85c Private (replicated) work array used to accumulate the
86c rotation in the shift application step.
87c
88c LDQ Integer. (INPUT)
89c Leading dimension of Q exactly as declared in the calling
90c program.
91c
92c WORKL Double precision array of length at least 3*(NEV+NP). (INPUT/WORKSPACE)
93c Private (replicated) array on each PE or array allocated on
94c the front end. It is used in the computation of the
95c tridiagonal eigenvalue problem, the calculation and
96c application of the shifts and convergence checking.
97c If ISHIFT .EQ. O and IDO .EQ. 3, the first NP locations
98c of WORKL are used in reverse communication to hold the user
99c supplied shifts.
100c
101c IPNTR Integer array of length 3. (OUTPUT)
102c Pointer to mark the starting locations in the WORKD for
103c vectors used by the Lanczos iteration.
104c -------------------------------------------------------------
105c IPNTR(1): pointer to the current operand vector X.
106c IPNTR(2): pointer to the current result vector Y.
107c IPNTR(3): pointer to the vector B * X when used in one of
108c the spectral transformation modes. X is the current
109c operand.
110c -------------------------------------------------------------
111c
112c WORKD Double precision work array of length 3*N. (REVERSE COMMUNICATION)
113c Distributed array to be used in the basic Lanczos iteration
114c for reverse communication. The user should not use WORKD
115c as temporary workspace during the iteration !!!!!!!!!!
116c See Data Distribution Note in pdsaupd.
117c
118c INFO Integer. (INPUT/OUTPUT)
119c If INFO .EQ. 0, a randomly initial residual vector is used.
120c If INFO .NE. 0, RESID contains the initial residual vector,
121c possibly from a previous run.
122c Error flag on output.
123c = 0: Normal return.
124c = 1: All possible eigenvalues of OP has been found.
125c NP returns the size of the invariant subspace
126c spanning the operator OP.
127c = 2: No shifts could be applied.
128c = -8: Error return from trid. eigenvalue calculation;
129c This should never happen.
130c = -9: Starting vector is zero.
131c = -9999: Could not build an Lanczos factorization.
132c Size that was built in returned in NP.
133c
134c VERBOSE Flag to print out the status of computation per iteration (or
135c equivalently restart).
136c
137c USRCHK Integer. (INPUT) A nonzero value implies that reverse
138c communication is done so that the user can check convergence.
139c
140c UNCONV Integer. (INPUT) Only used if USRCHK is not equal to zero.
141c
142c
143c\EndDoc
144c
145c-----------------------------------------------------------------------
146c
147c\BeginLib
148c
149c\References:
150c 1. D.C. Sorensen, "Implicit Application of Polynomial Filters in
151c a k-Step Arnoldi Method", SIAM J. Matr. Anal. Apps., 13 (1992),
152c pp 357-385.
153c 2. R.B. Lehoucq, "Analysis and Implementation of an Implicitly
154c Restarted Arnoldi Iteration", Rice University Technical Report
155c TR95-13, Department of Computational and Applied Mathematics.
156c 3. B.N. Parlett, "The Symmetric Eigenvalue Problem". Prentice-Hall,
157c 1980.
158c 4. B.N. Parlett, B. Nour-Omid, "Towards a Black Box Lanczos Program",
159c Computer Physics Communications, 53 (1989), pp 169-179.
160c 5. B. Nour-Omid, B.N. Parlett, T. Ericson, P.S. Jensen, "How to
161c Implement the Spectral Transformation", Math. Comp., 48 (1987),
162c pp 663-673.
163c 6. R.G. Grimes, J.G. Lewis and H.D. Simon, "A Shifted Block Lanczos
164c Algorithm for Solving Sparse Symmetric Generalized Eigenproblems",
165c SIAM J. Matr. Anal. Apps., January (1993).
166c 7. L. Reichel, W.B. Gragg, "Algorithm 686: FORTRAN Subroutines
167c for Updating the QR decomposition", ACM TOMS, December 1990,
168c Volume 16 Number 4, pp 369-377.
169c
170c\Routines called:
171c pdgetv0 Parallel ARPACK initial vector generation routine.
172c pdsaitr Parallel ARPACK Lanczos factorization routine.
173c pdsapps Parallel ARPACK application of implicit shifts routine.
174c dsconv ARPACK convergence of Ritz values routine.
175c pdseigt Parallel ARPACK compute Ritz values and error bounds routine.
176c pdsgets Parallel ARPACK reorder Ritz values and error bounds routine.
177c dsortr ARPACK sorting routine.
178c sstrqb ARPACK routine that computes all eigenvalues and the
179c last component of the eigenvectors of a symmetric
180c tridiagonal matrix using the implicit QL or QR method.
181c pivout Parallel ARPACK utility routine that prints integers.
182c second ARPACK utility routine for timing.
183c pdvout Parallel ARPACK utility routine that prints vectors.
184c pdlamch ScaLAPACK routine that determines machine constants.
185c dcopy Level 1 BLAS that copies one vector to another.
186c ddot Level 1 BLAS that computes the scalar product of two vectors.
187c pdnorm2 Parallel version of Level 1 BLAS that computes the norm of a vector.
188c dscal Level 1 BLAS that scales a vector.
189c dswap Level 1 BLAS that swaps two vectors.
190c
191c\Author
192c Danny Sorensen Phuong Vu
193c Richard Lehoucq CRPC / Rice University
194c Dept. of Computational & Houston, Texas
195c Applied Mathematics
196c Rice University
197c Houston, Texas
198c
199c\Parallel Modifications
200c Kristi Maschhoff
201c
202c\Revision history:
203c Starting Point: Serial Code FILE: saup2.F SID: 2.4
204c
205c\SCCS Information:
206c FILE: saup2.F SID: 1.5 DATE OF SID: 05/20/98
207c
208c\EndLib
209c
210c-----------------------------------------------------------------------
211c
212 subroutine mypdsaup2
213 & ( comm, ido, bmat, n, which, nev, np, tol, resid, mode, iupd,
214 & ishift, mxiter, v, ldv, h, ldh, ritz, bounds,
215 & q, ldq, workl, ipntr, workd, info, verbose, usrchk, unconv )
216c
217 include 'mpif.h'
218c
219c %---------------%
220c | MPI Variables |
221c %---------------%
222c
223 integer comm
224c
225c
226c %----------------------------------------------------%
227c | Include files for debugging and timing information |
228c %----------------------------------------------------%
229c
230 include 'debug.h'
231 include 'stat.h'
232c
233c %------------------%
234c | Scalar Arguments |
235c %------------------%
236c
237 character bmat*1, which*2
238 integer ido, info, ishift, iupd, ldh, ldq, ldv, mxiter,
239 & n, mode, nev, np
240 integer verbose, usrchk, unconv
241 double precision
242 & tol
243c
244c %-----------------%
245c | Array Arguments |
246c %-----------------%
247c
248 integer ipntr(3)
249 double precision
250 & bounds(nev+np), h(ldh,2), q(ldq,nev+np), resid(n),
251 & ritz(nev+np), v(ldv,nev+np), workd(3*n),
252 & workl(3*(nev+np))
253c
254c %------------%
255c | Parameters |
256c %------------%
257c
258 double precision
259 & one, zero
260 parameter(one = 1.0, zero = 0.0)
261c
262c %---------------%
263c | Local Scalars |
264c %---------------%
265c
266 character wprime*2
267 logical cnorm, getv0, initv, update, ushift
268 logical cnvchk
269 integer ierr, iter, j, kplusp, msglvl, nconv, nevbef, nev0,
270 & np0, nptemp, nevd2, nevm2, kp(3)
271 double precision
272 & rnorm, temp, eps23
273 save cnorm, getv0, initv, update, ushift,
274 & iter, kplusp, msglvl, nconv, nev0, np0,
275 & rnorm, eps23, cnvchk
276c
277 double precision
278 & rnorm_buf
279c
280c
281c %----------------------%
282c | External Subroutines |
283c %----------------------%
284c
285 external dcopy, pdgetv0, pdsaitr, dscal, dsconv,
286 & pdseigt, pdsgets, pdsapps,
287 & dsortr, pdvout, pivout, second
288c
289c %--------------------%
290c | External Functions |
291c %--------------------%
292c
293 double precision
294 & ddot, pdnorm2, pdlamch
295 external ddot, pdnorm2, pdlamch
296c
297c %---------------------%
298c | Intrinsic Functions |
299c %---------------------%
300c
301 intrinsic min, abs, sqrt
302c
303c %-----------------------%
304c | Executable Statements |
305c %-----------------------%
306c
307 if (ido .eq. 0) then
308c
309c %-------------------------------%
310c | Initialize timing statistics |
311c | & message level for debugging |
312c %-------------------------------%
313c
314 call second (t0)
315 msglvl = msaup2
316c
317c %---------------------------------%
318c | Set machine dependent constant. |
319c %---------------------------------%
320c
321 eps23 = pdlamch(comm, 'Epsilon-Machine')
322 eps23 = eps23**(2.0/3.0)
323c
324c %-------------------------------------%
325c | nev0 and np0 are integer variables |
326c | hold the initial values of NEV & NP |
327c %-------------------------------------%
328c
329 nev0 = nev
330 np0 = np
331c
332c %-------------------------------------%
333c | kplusp is the bound on the largest |
334c | Lanczos factorization built. |
335c | nconv is the current number of |
336c | "converged" eigenvlues. |
337c | iter is the counter on the current |
338c | iteration step. |
339c %-------------------------------------%
340c
341 kplusp = nev0 + np0
342 nconv = 0
343 iter = 0
344c
345c %---------------------------------------------%
346c | Set flags for computing the first NEV steps |
347c | of the Lanczos factorization. |
348c %---------------------------------------------%
349c
350 getv0 = .true.
351 update = .false.
352 ushift = .false.
353 cnorm = .false.
354 cnvchk = .false.
355c
356 if (info .ne. 0) then
357c
358c %--------------------------------------------%
359c | User provides the initial residual vector. |
360c %--------------------------------------------%
361c
362 initv = .true.
363 info = 0
364 else
365 initv = .false.
366 end if
367 end if
368c
369c %---------------------------------------------%
370c | Get a possibly random starting vector and |
371c | force it into the range of the operator OP. |
372c %---------------------------------------------%
373c
374 10 continue
375c
376 if (getv0) then
377 call pdgetv0 ( comm, ido, bmat, 1, initv, n, 1, v, ldv,
378 & resid, rnorm, ipntr, workd, workl, info)
379c
380 if (ido .ne. 99) go to 9000
381c
382 if (rnorm .eq. zero) then
383c
384c %-----------------------------------------%
385c | The initial vector is zero. Error exit. |
386c %-----------------------------------------%
387c
388 info = -9
389 go to 1200
390 end if
391 getv0 = .false.
392 ido = 0
393 end if
394c
395c %------------------------------------------------------------%
396c | Back from reverse communication: continue with update step |
397c %------------------------------------------------------------%
398c
399 if (update) go to 20
400c
401c %-------------------------------------------%
402c | Back from computing user specified shifts |
403c %-------------------------------------------%
404c
405 if (ushift) go to 50
406c
407c %-------------------------------------%
408c | Back from computing residual norm |
409c | at the end of the current iteration |
410c %-------------------------------------%
411c
412 if (cnorm) go to 100
413c
414c %---------------------------------%
415c | Back from checking convergence. |
416c %---------------------------------%
417c
418 if (cnvchk) go to 110
419c
420c %----------------------------------------------------------%
421c | Compute the first NEV steps of the Lanczos factorization |
422c %----------------------------------------------------------%
423c
424 call pdsaitr (comm, ido, bmat, n, 0, nev0, mode,
425 & resid, rnorm, v, ldv, h, ldh, ipntr,
426 & workd, workl, info)
427c
428c %---------------------------------------------------%
429c | ido .ne. 99 implies use of reverse communication |
430c | to compute operations involving OP and possibly B |
431c %---------------------------------------------------%
432c
433 if (ido .ne. 99) go to 9000
434c
435 if (info .gt. 0) then
436c
437c %-----------------------------------------------------%
438c | pdsaitr was unable to build an Lanczos factorization|
439c | of length NEV0. INFO is returned with the size of |
440c | the factorization built. Exit main loop. |
441c %-----------------------------------------------------%
442c
443 np = info
444 mxiter = iter
445 info = -9999
446 go to 1200
447 end if
448c
449c %--------------------------------------------------------------%
450c | |
451c | M A I N LANCZOS I T E R A T I O N L O O P |
452c | Each iteration implicitly restarts the Lanczos |
453c | factorization in place. |
454c | |
455c %--------------------------------------------------------------%
456c
457 1000 continue
458c
459 iter = iter + 1
460c
461 if (verbose .gt. 0) then
462 write(*,*)' Iteration ',iter,
463 & ' - Number of converged eigenvalues ',nconv
464 end if
465c
466 if (msglvl .gt. 0) then
467 call pivout (comm, logfil, 1, iter, ndigit,
468 & '_saup2: **** Start of major iteration number ****')
469 end if
470 if (msglvl .gt. 1) then
471 call pivout (comm, logfil, 1, nev, ndigit,
472 & '_saup2: The length of the current Lanczos factorization')
473 call pivout (comm, logfil, 1, np, ndigit,
474 & '_saup2: Extend the Lanczos factorization by')
475 end if
476c
477c %------------------------------------------------------------%
478c | Compute NP additional steps of the Lanczos factorization. |
479c %------------------------------------------------------------%
480c
481 ido = 0
482 20 continue
483 update = .true.
484c
485 call pdsaitr (comm, ido, bmat, n, nev, np, mode,
486 & resid, rnorm, v, ldv, h, ldh, ipntr,
487 & workd, workl, info)
488c
489c %---------------------------------------------------%
490c | ido .ne. 99 implies use of reverse communication |
491c | to compute operations involving OP and possibly B |
492c %---------------------------------------------------%
493c
494 if (ido .ne. 99) go to 9000
495c
496 if (info .gt. 0) then
497c
498c %-----------------------------------------------------%
499c | pdsaitr was unable to build an Lanczos factorization|
500c | of length NEV0+NP0. INFO is returned with the size |
501c | of the factorization built. Exit main loop. |
502c %-----------------------------------------------------%
503c
504 np = info
505 mxiter = iter
506 info = -9999
507 go to 1200
508 end if
509 update = .false.
510c
511 if (msglvl .gt. 1) then
512 call pdvout (comm, logfil, 1, rnorm, ndigit,
513 & '_saup2: Current B-norm of residual for factorization')
514 end if
515c
516c %--------------------------------------------------------%
517c | Compute the eigenvalues and corresponding error bounds |
518c | of the current symmetric tridiagonal matrix. |
519c %--------------------------------------------------------%
520c
521 call pdseigt ( comm, rnorm, kplusp, h, ldh, ritz, bounds,
522 & workl, ierr)
523c
524 if (ierr .ne. 0) then
525 info = -8
526 go to 1200
527 end if
528c
529c %----------------------------------------------------%
530c | Make a copy of eigenvalues and corresponding error |
531c | bounds obtained from _seigt. |
532c %----------------------------------------------------%
533c
534 call dcopy(kplusp, ritz, 1, workl(kplusp+1), 1)
535 call dcopy(kplusp, bounds, 1, workl(2*kplusp+1), 1)
536c
537c %---------------------------------------------------%
538c | Select the wanted Ritz values and their bounds |
539c | to be used in the convergence test. |
540c | The selection is based on the requested number of |
541c | eigenvalues instead of the current NEV and NP to |
542c | prevent possible misconvergence. |
543c | * Wanted Ritz values := RITZ(NP+1:NEV+NP) |
544c | * Shifts := RITZ(1:NP) := WORKL(1:NP) |
545c %---------------------------------------------------%
546c
547 nev = nev0
548 np = np0
549 call pdsgets ( comm, ishift, which, nev, np, ritz,
550 & bounds, workl)
551c
552c %-------------------%
553c | Convergence test. |
554c %-------------------%
555c
556 call dcopy (nev, bounds(np+1), 1, workl(np+1), 1)
557 if (usrchk .eq. zero) then
558 call dsconv (nev, ritz(np+1), workl(np+1), tol, nconv)
559 else
560c
561c %--------------------------------------------------%
562c | reverse communication so that the user can check |
563c | convergence. |
564c %--------------------------------------------------%
565c
566 cnvchk = .true.
567 ido = 4
568 go to 9000
569 end if
570c
571 110 continue
572c
573c %------------------------------------%
574c | Back from reverse communication; |
575c %------------------------------------%
576c
577 if (usrchk .ne. zero) then
578 nconv = unconv
579 cnvchk = .false.
580 end if
581c
582 if (msglvl .gt. 2) then
583 kp(1) = nev
584 kp(2) = np
585 kp(3) = nconv
586 call pivout (comm, logfil, 3, kp, ndigit,
587 & '_saup2: NEV, NP, NCONV are')
588 call pdvout (comm, logfil, kplusp, ritz, ndigit,
589 & '_saup2: The eigenvalues of H')
590 call pdvout (comm, logfil, kplusp, bounds, ndigit,
591 & '_saup2: Ritz estimates of the current NCV Ritz values')
592 end if
593c
594c %---------------------------------------------------------%
595c | Count the number of unwanted Ritz values that have zero |
596c | Ritz estimates. If any Ritz estimates are equal to zero |
597c | then a leading block of H of order equal to at least |
598c | the number of Ritz values with zero Ritz estimates has |
599c | split off. None of these Ritz values may be removed by |
600c | shifting. Decrease NP the number of shifts to apply. If |
601c | no shifts may be applied, then prepare to exit |
602c %---------------------------------------------------------%
603c
604 nptemp = np
605 do 30 j=1, nptemp
606 if (bounds(j) .eq. zero) then
607 np = np - 1
608 nev = nev + 1
609 end if
610 30 continue
611c
612 if ( (nconv .ge. nev0) .or.
613 & (iter .gt. mxiter) .or.
614 & (np .eq. 0) ) then
615c
616c %------------------------------------------------%
617c | Prepare to exit. Put the converged Ritz values |
618c | and corresponding bounds in RITZ(1:NCONV) and |
619c | BOUNDS(1:NCONV) respectively. Then sort. Be |
620c | careful when NCONV > NP since we don't want to |
621c | swap overlapping locations. |
622c %------------------------------------------------%
623c
624 if (which .eq. 'BE') then
625c
626c %-----------------------------------------------------%
627c | Both ends of the spectrum are requested. |
628c | Sort the eigenvalues into algebraically decreasing |
629c | order first then swap low end of the spectrum next |
630c | to high end in appropriate locations. |
631c | NOTE: when np < floor(nev/2) be careful not to swap |
632c | overlapping locations. |
633c %-----------------------------------------------------%
634c
635 wprime = 'SA'
636 call dsortr (wprime, .true., kplusp, ritz, bounds)
637 nevd2 = nev0 / 2
638 nevm2 = nev0 - nevd2
639 if ( nev .gt. 1 ) then
640 call dswap ( min(nevd2,np), ritz(nevm2+1), 1,
641 & ritz( max(kplusp-nevd2+1,kplusp-np+1) ), 1)
642 call dswap ( min(nevd2,np), bounds(nevm2+1), 1,
643 & bounds( max(kplusp-nevd2+1,kplusp-np+1)), 1)
644 end if
645c
646 else
647c
648c %--------------------------------------------------%
649c | LM, SM, LA, SA case. |
650c | Sort the eigenvalues of H into the an order that |
651c | is opposite to WHICH, and apply the resulting |
652c | order to BOUNDS. The eigenvalues are sorted so |
653c | that the wanted part are always within the first |
654c | NEV locations. |
655c %--------------------------------------------------%
656c
657 if (which .eq. 'LM') wprime = 'SM'
658 if (which .eq. 'SM') wprime = 'LM'
659 if (which .eq. 'LA') wprime = 'SA'
660 if (which .eq. 'SA') wprime = 'LA'
661c
662 call dsortr (wprime, .true., kplusp, ritz, bounds)
663c
664 end if
665c
666c %--------------------------------------------------%
667c | Scale the Ritz estimate of each Ritz value |
668c | by 1 / max(eps23,magnitude of the Ritz value). |
669c %--------------------------------------------------%
670c
671 do 35 j = 1, nev0
672 temp = max( eps23, abs(ritz(j)) )
673 bounds(j) = bounds(j)/temp
674 35 continue
675c
676c %----------------------------------------------------%
677c | Sort the Ritz values according to the scaled Ritz |
678c | esitmates. This will push all the converged ones |
679c | towards the front of ritzr, ritzi, bounds |
680c | (in the case when NCONV < NEV.) |
681c %----------------------------------------------------%
682c
683 wprime = 'LA'
684 call dsortr(wprime, .true., nev0, bounds, ritz)
685c
686c %----------------------------------------------%
687c | Scale the Ritz estimate back to its original |
688c | value. |
689c %----------------------------------------------%
690c
691 do 40 j = 1, nev0
692 temp = max( eps23, abs(ritz(j)) )
693 bounds(j) = bounds(j)*temp
694 40 continue
695c
696c %--------------------------------------------------%
697c | Sort the "converged" Ritz values again so that |
698c | the "threshold" values and their associated Ritz |
699c | estimates appear at the appropriate position in |
700c | ritz and bound. |
701c %--------------------------------------------------%
702c
703 if (which .eq. 'BE') then
704c
705c %------------------------------------------------%
706c | Sort the "converged" Ritz values in increasing |
707c | order. The "threshold" values are in the |
708c | middle. |
709c %------------------------------------------------%
710c
711 wprime = 'LA'
712 call dsortr(wprime, .true., nconv, ritz, bounds)
713c
714 else
715c
716c %----------------------------------------------%
717c | In LM, SM, LA, SA case, sort the "converged" |
718c | Ritz values according to WHICH so that the |
719c | "threshold" value appears at the front of |
720c | ritz. |
721c %----------------------------------------------%
722
723 call dsortr(which, .true., nconv, ritz, bounds)
724c
725 end if
726c
727c %------------------------------------------%
728c | Use h( 1,1 ) as storage to communicate |
729c | rnorm to _seupd if needed |
730c %------------------------------------------%
731c
732 h(1,1) = rnorm
733c
734 if (msglvl .gt. 1) then
735 call pdvout (comm, logfil, kplusp, ritz, ndigit,
736 & '_saup2: Sorted Ritz values.')
737 call pdvout (comm, logfil, kplusp, bounds, ndigit,
738 & '_saup2: Sorted ritz estimates.')
739 end if
740c
741c %------------------------------------%
742c | Max iterations have been exceeded. |
743c %------------------------------------%
744c
745 if (iter .gt. mxiter .and. nconv .lt. nev) info = 1
746c
747c %---------------------%
748c | No shifts to apply. |
749c %---------------------%
750c
751 if (np .eq. 0 .and. nconv .lt. nev0) info = 2
752c
753 np = nconv
754 go to 1100
755c
756 else if (nconv .lt. nev .and. ishift .eq. 1) then
757c
758c %---------------------------------------------------%
759c | Do not have all the requested eigenvalues yet. |
760c | To prevent possible stagnation, adjust the number |
761c | of Ritz values and the shifts. |
762c %---------------------------------------------------%
763c
764 nevbef = nev
765 nev = nev + min(nconv, np/2)
766 if (nev .eq. 1 .and. kplusp .ge. 6) then
767 nev = kplusp / 2
768 else if (nev .eq. 1 .and. kplusp .gt. 2) then
769 nev = 2
770 end if
771 np = kplusp - nev
772c
773c %---------------------------------------%
774c | If the size of NEV was just increased |
775c | resort the eigenvalues. |
776c %---------------------------------------%
777c
778 if (nevbef .lt. nev)
779 & call pdsgets ( comm, ishift, which, nev, np,
780 & ritz, bounds, workl)
781c
782 end if
783c
784 if (msglvl .gt. 0) then
785 call pivout (comm, logfil, 1, nconv, ndigit,
786 & '_saup2: no. of "converged" Ritz values at this iter.')
787 if (msglvl .gt. 1) then
788 kp(1) = nev
789 kp(2) = np
790 call pivout (comm, logfil, 2, kp, ndigit,
791 & '_saup2: NEV and NP .')
792 call pdvout (comm, logfil, nev, ritz(np+1), ndigit,
793 & '_saup2: "wanted" Ritz values.')
794 call pdvout (comm, logfil, nev, bounds(np+1), ndigit,
795 & '_saup2: Ritz estimates of the "wanted" values ')
796 end if
797 end if
798c
799 if (ishift .eq. 0) then
800c
801c %-----------------------------------------------------%
802c | User specified shifts: reverse communication to |
803c | compute the shifts. They are returned in the first |
804c | NP locations of WORKL. |
805c %-----------------------------------------------------%
806c
807 ushift = .true.
808 ido = 3
809 go to 9000
810 end if
811c
812 50 continue
813c
814c %------------------------------------%
815c | Back from reverse communication; |
816c | User specified shifts are returned |
817c | in WORKL(1:NP) |
818c %------------------------------------%
819c
820 ushift = .false.
821c
822c
823c %---------------------------------------------------------%
824c | Move the NP shifts to the first NP locations of RITZ to |
825c | free up WORKL. This is for the non-exact shift case; |
826c | in the exact shift case, pdsgets already handles this. |
827c %---------------------------------------------------------%
828c
829 if (ishift .eq. 0) call dcopy (np, workl, 1, ritz, 1)
830c
831 if (msglvl .gt. 2) then
832 call pivout (comm, logfil, 1, np, ndigit,
833 & '_saup2: The number of shifts to apply ')
834 call pdvout (comm, logfil, np, workl, ndigit,
835 & '_saup2: shifts selected')
836 if (ishift .eq. 1) then
837 call pdvout (comm, logfil, np, bounds, ndigit,
838 & '_saup2: corresponding Ritz estimates')
839 end if
840 end if
841c
842c %---------------------------------------------------------%
843c | Apply the NP0 implicit shifts by QR bulge chasing. |
844c | Each shift is applied to the entire tridiagonal matrix. |
845c | The first 2*N locations of WORKD are used as workspace. |
846c | After pdsapps is done, we have a Lanczos |
847c | factorization of length NEV. |
848c %---------------------------------------------------------%
849c
850 call pdsapps ( comm, n, nev, np, ritz, v, ldv, h, ldh, resid,
851 & q, ldq, workd)
852c
853c %---------------------------------------------%
854c | Compute the B-norm of the updated residual. |
855c | Keep B*RESID in WORKD(1:N) to be used in |
856c | the first step of the next call to pdsaitr. |
857c %---------------------------------------------%
858c
859 cnorm = .true.
860 call second (t2)
861 if (bmat .eq. 'G') then
862 nbx = nbx + 1
863 call dcopy (n, resid, 1, workd(n+1), 1)
864 ipntr(1) = n + 1
865 ipntr(2) = 1
866 ido = 2
867c
868c %----------------------------------%
869c | Exit in order to compute B*RESID |
870c %----------------------------------%
871c
872 go to 9000
873 else if (bmat .eq. 'I') then
874 call dcopy (n, resid, 1, workd, 1)
875 end if
876c
877 100 continue
878c
879c %----------------------------------%
880c | Back from reverse communication; |
881c | WORKD(1:N) := B*RESID |
882c %----------------------------------%
883c
884 if (bmat .eq. 'G') then
885 call second (t3)
886 tmvbx = tmvbx + (t3 - t2)
887 end if
888c
889 if (bmat .eq. 'G') then
890 rnorm_buf = ddot(n, resid, 1, workd, 1)
891 call mpi_allreduce( rnorm_buf, rnorm, 1,
892 & mpi_double_precision, mpi_sum, comm, ierr )
893 rnorm = sqrt(abs(rnorm))
894 else if (bmat .eq. 'I') then
895 rnorm = pdnorm2( comm, n, resid, 1 )
896 end if
897 cnorm = .false.
898 130 continue
899c
900 if (msglvl .gt. 2) then
901 call pdvout (comm, logfil, 1, rnorm, ndigit,
902 & '_saup2: B-norm of residual for NEV factorization')
903 call pdvout (comm, logfil, nev, h(1,2), ndigit,
904 & '_saup2: main diagonal of compressed H matrix')
905 call pdvout (comm, logfil, nev-1, h(2,1), ndigit,
906 & '_saup2: subdiagonal of compressed H matrix')
907 end if
908c
909 go to 1000
910c
911c %---------------------------------------------------------------%
912c | |
913c | E N D O F M A I N I T E R A T I O N L O O P |
914c | |
915c %---------------------------------------------------------------%
916c
917 1100 continue
918c
919 mxiter = iter
920 nev = nconv
921c
922 1200 continue
923 ido = 99
924c
925c %------------%
926c | Error exit |
927c %------------%
928c
929 call second (t1)
930 tsaup2 = t1 - t0
931c
932 9000 continue
933 return
934c
935c %----------------%
936c | End of pdsaup2 |
937c %----------------%
938c
939 end