Abstract

GMBACK is a Krylov solver for linear systems in \(\mathbb{R}^n\).

We analyze here the high convergence orders (superlinear convergence) of the Newton-GMBACK methods, which can be characterized by applying three different existing results.

In this note we show by some direct calculations that these characterizations are equivalent.

Authors

Emil Cătinaş
(Tiberiu Popoviciu Institute of Numerical Analysis, Romanian Academy)

Keywords

nonlinear systems of equations in \(\mathbb{R}^n\); inexact Newton method; Krylov methods; linear systems of equations in \(\mathbb{R}^n\); residual; local convergence; superlinear convergence.

Cite this paper as:

E. Cătinaş, On the high convergence orders of the Newton-GMBACK methods, Rev. Anal. Numér. Théor. Approx., 28 (1999) no. 2, pp. 125-132.

About this paper

Print ISSN

1222-9024

Online ISSN

2457-8126

MR

?

ZBL

?

Paper (preprint) in HTML form


ON THE HIGH CONVERGENCE ORDERS OF THE NEWTON-GMBACK METHODS*

EMIL CĂTINAŞ

Abstract

The high convergence orders of the Newton-GMBACK methods can be characterized applying three different existing results. In this note we show by some direct computations that these characterizations are equivalent.

AMS Subject Classification: 65H10, 65F10.

1. INTRODUCTION

The GMBACK method, introduced by Kasenally [11] and described in the following section, is a Krylov method for solving large linear systems. When it is used within a Newton method, the convergence orders of the resulting iterations may be characterized by applying three existing results.

The first result was given by Dennis and Moré in [7], but before stating it we review the common setting. Given \(F: D \subseteq \mathbb{R}^{N} \rightarrow \mathbb{R}^{N}\), the local convergence of different Newton-type methods is usually studied under the following standard assumptions:

(C1) there exists \(y^{*} \in D\) such that \(F(y^{*})=0\);

(C2) the mapping \(F\) is differentiable in a neighborhood of \(y^{*}\), with the derivative \(F'\) continuous at \(y^{*}\);

(C3) the Jacobian \(F'(y^{*})\) is nonsingular: \(\exists\, F'(y^{*})^{-1} \in \mathbb{R}^{N \times N}\).

We shall denote hereafter by \(\|\cdot\|\) an arbitrary fixed norm on \(\mathbb{R}^{N}\) or its induced operator norm. The symbol \(\|\cdot\|_{2}\) stands for the Euclidean norm and \(\|\cdot\|_{F}\) denotes the Frobenius norm. For definitions and results concerning the convergence orders we refer to [15, ch. 9] (see also [19], [18]).
Theorem 1.1. [7]. Let \(F: D \rightarrow \mathbb{R}^{N}\) be differentiable in the open convex set \(D \subseteq \mathbb{R}^{N}\) and assume that for some \(y^{*} \in D\) the mapping \(F\) satisfies (C2) and (C3). Let \((B_{k})_{k \geq 0} \subset \mathbb{R}^{N \times N}\) be a sequence of nonsingular matrices and suppose that for some \(y_{0} \in D\) the quasi-Newton iterates given by
\[
y_{k+1}=y_{k}-B_{k}^{-1} F(y_{k}), \quad k=0,1,\ldots
\]
remain in \(D\) and converge to \(y^{*}\). Then \((y_{k})_{k \geq 0}\) converges \(q\)-superlinearly to \(y^{*}\) and \(F(y^{*})=0\) if and only if
\[
\lim_{k \rightarrow \infty} \frac{\|(B_{k}-F'(y^{*}))(y_{k+1}-y_{k})\|}{\|y_{k+1}-y_{k}\|}=0.
\]
The second result was obtained by Dembo, Eisenstat and Steihaug in [6].
Theorem 1.2. [6]. Assume the mapping \(F\) satisfies the standard assumptions and suppose that for some \(y_{0} \in D\) the sequence of inexact Newton iterates given by
\[
\begin{aligned}
F'(y_{k}) s_{k} &= -F(y_{k})+r_{k}\\
y_{k+1} &= y_{k}+s_{k}, \quad k=0,1,\ldots
\end{aligned}
\]
remains in \(D\) and converges to \(y^{*}\). Then \((y_{k})_{k \geq 0}\) converges \(q\)-superlinearly if and only if the residuals \(r_{k}\) satisfy
\[
\|r_{k}\|=o(\|F(y_{k})\|), \quad \text{as } k \rightarrow \infty.
\]
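As a concrete illustration of Theorem 1.2 (ours, not from the paper), the following sketch runs an inexact Newton iteration on a small test system of our own choosing, injecting an inner residual of size \(\eta_k\|F(y_k)\|\) with \(\eta_k \to 0\); the characterization \(\|r_k\|=o(\|F(y_k)\|)\) then holds and the iterates converge superlinearly.

```python
import numpy as np

# Illustrative test system (our choice): F(y) = 0 at y* = (1, 1).
def F(y):
    return np.array([y[0]**2 + y[1]**2 - 2.0, y[0] - y[1]])

def Fprime(y):
    return np.array([[2.0 * y[0], 2.0 * y[1]],
                     [1.0,        -1.0]])

def inexact_newton(y, max_iter=100, tol=1e-13):
    """Inexact Newton: F'(y_k) s_k = -F(y_k) + r_k with ||r_k|| = eta_k ||F(y_k)||."""
    norms = []
    for _ in range(max_iter):
        Fy = F(y)
        nF = np.linalg.norm(Fy)
        norms.append(nF)
        if nF < tol:
            break
        eta = min(0.1, nF)                       # forcing term eta_k -> 0
        r = eta * nF * np.array([1.0, 0.0])      # inner residual, ||r|| = eta * ||F||
        s = np.linalg.solve(Fprime(y), -Fy + r)  # approximate inner solve
        y = y + s
    return y, norms

y, norms = inexact_newton(np.array([2.0, 1.0]))
```

Since eventually \(\eta_k = \|F(y_k)\| \to 0\), the residuals are \(o(\|F(y_k)\|)\), and the recorded norms \(\|F(y_k)\|\) decrease superlinearly.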
Martinez, Parada and Tapia [14] obtained some results for the sequences of damped and perturbed quasi-Newton methods
\[
y_{k+1}=y_{k}-\alpha_{k} B_{k}^{-1}(F(y_{k})+r_{k}),
\]
where \(0<\alpha_{k} \leq 1\), \(r_{k} \in \mathbb{R}^{N}\), \(k=0,1,\ldots\) and \(y_{0} \in \mathbb{R}^{N}\). However, those results do not fully characterize the convergence orders of the above sequences. On the other hand, we shall see that if we want to fit the Newton-GMBACK iterates in this frame in a direct manner, we must reduce the above iterations either to the classical quasi-Newton or to the inexact Newton (IN) ones.
We have recently extended Theorem 1.2:
Theorem 1.3. [5]. Assume the mapping \(F\) satisfies the standard assumptions and consider the following elements: \((\Delta_{k})_{k \geq 0} \subset \mathbb{R}^{N \times N}\) (perturbations in the Jacobians), \((\delta_{k})_{k \geq 0} \subset \mathbb{R}^{N}\) (perturbations in the function evaluations) and \((\hat{r}_{k})_{k \geq 0} \subset \mathbb{R}^{N}\) (residuals of the approximate solutions \(s_{k}\) to the perturbed linear systems \((F'(y_{k})+\Delta_{k}) s=-F(y_{k})+\delta_{k}\)). If for some \(y_{0} \in D\) the sequence of inexact perturbed Newton iterates given by
\[
\begin{aligned}
(F'(y_{k})+\Delta_{k}) s_{k} &= (-F(y_{k})+\delta_{k})+\hat{r}_{k}\\
y_{k+1} &= y_{k}+s_{k}, \quad k=0,1,\ldots
\end{aligned}
\]
is well defined (i.e., the matrices \(F'(y_{k})+\Delta_{k}\) are nonsingular and the iterates \(y_{k}\) remain in \(D\)) and converges to \(y^{*}\), then the convergence is \(q\)-superlinear if and only if
\[
\|\Delta_{k}(F'(y_{k})+\Delta_{k})^{-1} F(y_{k})+(I-\Delta_{k}(F'(y_{k})+\Delta_{k})^{-1})(\delta_{k}+\hat{r}_{k})\|=o(\|F(y_{k})\|),
\]
as \(k \rightarrow \infty\).
Remark. Modest additional continuity conditions imposed on the derivative \(F'\) allow the characterization of the \(q\)-convergence orders \(1+p\), \(p \in (0,1]\), of the three methods considered above. For these extensions we refer the reader to [8], [6] and [5], respectively.
In Section 2 we briefly describe the GMBACK method and deduce some results, while in Section 3 we show that applying the above three theorems to characterize the \(q\)-superlinear convergence of the Newton-GMBACK method leads to equivalent results.

2. THE GMBACK METHOD

Consider the nonsingular linear system
\[
A x=b
\]
with \(A \in \mathbb{R}^{N \times N}\) and \(b \in \mathbb{R}^{N}\). The GMBACK algorithm introduced by Kasenally in [11] belongs to the class of Krylov methods for solving such systems when the dimension \(N\) is large. Given an initial approximation \(x_{0} \in \mathbb{R}^{N}\) of the true solution \(x^{*}\) and a number \(m \in \{1,\ldots,N\}\), a modified Gram-Schmidt procedure constructs an orthonormal basis \(\{v_{1},\ldots,v_{m}\}\) of the Krylov subspace \(\mathcal{K}_{m}=\mathcal{K}_{m}(A, r_{0})=\operatorname{span}\{r_{0}, A r_{0}, \ldots, A^{m-1} r_{0}\}\), where \(r_{0}=b-A x_{0}\) is the residual of the initial approximation. Finally, GMBACK determines an approximation \(x_{m}^{GB} \in x_{0}+\mathcal{K}_{m}\) which solves the following minimization problem:
\[
\min_{x_{m} \in x_{0}+\mathcal{K}_{m}} \|\Delta_{A}\|_{F} \quad \text{subject to } (A-\Delta_{A}) x_{m}=b.
\]
The following steps are performed to determine \(x_{m}^{GB}\):

Arnoldi

  • Let \(r_{0}=b-A x_{0}\), \(\beta=\|r_{0}\|_{2}\) and \(v_{1}=\frac{1}{\beta} r_{0}\);
  • For \(j=1,\ldots,m\) do
\[
\begin{aligned}
h_{ij} &= (A v_{j}, v_{i}), \quad i=1,\ldots,j\\
\hat{v}_{j+1} &= A v_{j}-\sum_{i=1}^{j} h_{ij} v_{i}\\
h_{j+1,j} &= \|\hat{v}_{j+1}\|_{2}\\
v_{j+1} &= \tfrac{1}{h_{j+1,j}} \hat{v}_{j+1}
\end{aligned}
\]
  • Form the Hessenberg matrix \(\bar{H}_{m} \in \mathbb{R}^{(m+1) \times m}\) with the (possibly) nonzero elements \(h_{ij}\) computed above and the matrix \(V_{m}=[v_{1} \ldots v_{m}] \in \mathbb{R}^{N \times m}\) having the vectors \(v_{j}\) as columns;

GMBACK

  • Let
\[
\hat{H}_{m}=\begin{bmatrix} -\beta e_{1} & \bar{H}_{m} \end{bmatrix} \in \mathbb{R}^{(m+1) \times (m+1)}, \quad \hat{G}_{m}=\begin{bmatrix} x_{0} & V_{m} \end{bmatrix} \in \mathbb{R}^{N \times (m+1)},
\]
\[
P=\hat{H}_{m}^{t} \hat{H}_{m} \in \mathbb{R}^{(m+1) \times (m+1)} \quad \text{and} \quad Q=\hat{G}_{m}^{t} \hat{G}_{m} \in \mathbb{R}^{(m+1) \times (m+1)};
\]
  • Determine an eigenvector \(u_{m+1}\) corresponding to the smallest eigenvalue \(\lambda_{m+1}^{GB}\) of the generalized eigenproblem \(P u=\lambda Q u\);
  • If the first component \(u_{m+1}^{(1)}\) is nonzero, compute the vector \(y_{m}^{GB} \in \mathbb{R}^{m}\) by scaling \(u_{m+1}\) such that
\[
\begin{bmatrix} 1 \\ y_{m}^{GB} \end{bmatrix} = \frac{1}{u_{m+1}^{(1)}}\, u_{m+1};
\]
  • Set \(x_{m}^{GB}=x_{0}+V_{m} y_{m}^{GB}\).
This algorithm may encounter two possible breakdowns, either in the Arnoldi method or in the scaling of \(u_{m+1}\). The first one is, as for GMRES, a happy breakdown, because the solution may then be determined exactly using \(\bar{H}_{m}\) and \(V_{m}\). The second one appears when all the eigenvectors associated to \(\lambda_{m+1}^{GB}\) have first component zero; the inevitable division by zero then leads to an unavoidable breakdown. In such a case either \(m\) is increased or the algorithm is restarted with a different initial approximation \(x_{0}\). We shall assume in the following analysis that \(x_{m}^{GB}\) exists.
It is worth noting that the algorithm may be used in restarted form, taking after the \(m\) steps the computed solution \(x_{m}^{GB}\) as the new initial approximation.
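For concreteness, the whole procedure above can be sketched in a few lines of Python (our illustration using NumPy/SciPy; the function name, variable names and test data are our own, and we assume no Arnoldi breakdown and \(u_{m+1}^{(1)} \neq 0\)):

```python
import numpy as np
from scipy.linalg import eigh

def gmback(A, b, x0, m):
    """One GMBACK cycle: find x_m in x0 + K_m(A, r0) minimizing ||Delta_A||_F
    subject to (A - Delta_A) x_m = b.  Sketch; assumes no Arnoldi breakdown."""
    N = A.shape[0]
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    V = np.zeros((N, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = r0 / beta
    # Arnoldi with modified Gram-Schmidt
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = w @ V[:, i]
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]        # assumes h_{j+1,j} != 0
    Vm = V[:, :m]
    e1 = np.zeros(m + 1); e1[0] = 1.0
    Hhat = np.hstack([(-beta * e1)[:, None], H])   # (m+1) x (m+1)
    Ghat = np.hstack([x0[:, None], Vm])            # N x (m+1)
    P = Hhat.T @ Hhat
    Q = Ghat.T @ Ghat                              # assumed positive definite
    lam, U = eigh(P, Q)          # generalized eigenproblem P u = lam Q u
    u = U[:, 0]                  # eigenvector of the smallest eigenvalue
    y = u[1:] / u[0]             # scaling step; assumes u[0] != 0
    x = x0 + Vm @ y
    # backward error, formula (2.1) below
    DeltaA = np.outer(V @ (H @ y - beta * e1), x) / (x @ x)
    return x, DeltaA

# small demonstration on hypothetical, well-conditioned data
rng = np.random.default_rng(0)
N = 8
A = rng.standard_normal((N, N)) + N * np.eye(N)
b = rng.standard_normal(N)
x0 = np.ones(N)
x, DeltaA = gmback(A, b, x0, m=4)
```

Note that, independently of the computed \(y_{m}^{GB}\), this construction satisfies \((A-\Delta_{A}) x_{m}=b\) exactly; the eigenproblem step only selects the member of \(x_{0}+\mathcal{K}_{m}\) minimizing \(\|\Delta_{A}\|_{F}\).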
We shall prove the following result:
Proposition 2.1. Consider some arbitrary elements \(x_{0} \in \mathbb{R}^{N}\) and \(m \in \{1,\ldots,N\}\). If there exists a GMBACK solution \(x_{m}^{GB}\), then its corresponding backward error \(\Delta_{A,m}^{GB} \in \mathbb{R}^{N \times N}\) satisfies
\[
\|\Delta_{A,m}^{GB}\, x_{m}^{GB}\|_{2}=\|r_{m}^{GB}\|_{2}.
\]
Proof. Kasenally [11] proved that the backward error \(\Delta_{A,m}^{GB}\) corresponding to \(x_{m}^{GB}\) is given by
\[
\Delta_{A,m}^{GB}=V_{m+1}(\bar{H}_{m} y_{m}^{GB}-\beta e_{1}) \frac{(x_{m}^{GB})^{t}}{\|x_{m}^{GB}\|_{2}^{2}}. \tag{2.1}
\]
The matrices \(V_{m+1}\) and \(\bar{H}_{m}\) computed in the Arnoldi algorithm satisfy the following known relation (see for example [22]):
\[
A V_{m}=V_{m+1} \bar{H}_{m},
\]
which shows that
\[
\begin{aligned}
\|V_{m+1}(\bar{H}_{m} y_{m}^{GB}-\beta e_{1})\|_{2} &= \|A V_{m} y_{m}^{GB}-r_{0}\|_{2}\\
&= \|A V_{m} y_{m}^{GB}+A x_{0}-b\|_{2}\\
&= \|A x_{m}^{GB}-b\|_{2}\\
&= \|r_{m}^{GB}\|_{2}.
\end{aligned}
\]
Taking into account (2.1), we are led to the stated result.

3. THE SUPERLINEAR CONVERGENCE OF THE NEWTON-GMBACK METHOD

The Newton-GMBACK iterates may be written in two equivalent ways:
\[
(F'(y_{k})-\Delta_{A_{k}}^{GB})\, s_{k}^{GB} = -F(y_{k}), \tag{3.1}
\]
\[
F'(y_{k})\, s_{k}^{GB} = -F(y_{k})+r_{k}^{GB}, \quad k=0,1,\ldots, \tag{3.2}
\]
where for the first writing we have considered the linear systems \(A_{k} s=b_{k}\) with \(A_{k}=F'(y_{k})\) and \(b_{k}=-F(y_{k})\).
Applying Theorem 1.1 of Dennis and Moré to the first writing of the iterates we get:

Theorem 3.1. Assume the standard conditions hold and that for a given element \(y_{0} \in D\) the sequence of Newton-GMBACK iterates is well defined and converges to \(y^{*}\). Then the convergence is \(q\)-superlinear if and only if
\[
\|r_{k}^{GB}\|_{2}=o(\|s_{k}^{GB}\|_{2}), \quad \text{as } k \rightarrow \infty.
\]

Proof. We first observe that, in the notation of formula (3.1), \(B_{k}=F'(y_{k})-\Delta_{A_{k}}^{GB}\), so that \(B_{k}-F'(y_{k})=-\Delta_{A_{k}}^{GB}\). Theorem 1.1 and Proposition 2.1 then lead to the stated assertion.
Theorem 1.2 of Dembo, Eisenstat and Steihaug applied to the writing (3.2) yields:

Theorem 3.2. Under the same assumptions as in Theorem 3.1, the convergence of the Newton-GMBACK iterates is \(q\)-superlinear if and only if
\[
\|r_{k}^{GB}\|=o(\|F(y_{k})\|), \quad \text{as } k \rightarrow \infty.
\]
One can see that the damped and perturbed quasi-Newton method of Martinez, Parada and Tapia must be reduced either to the quasi-Newton or to the IN method if we want to fit the Newton-GMBACK iterates in this frame.
Regarding the Newton-GMBACK iterates as IPN iterates we get:

Theorem 3.3. Under the same assumptions as in Theorem 3.1, the convergence of the Newton-GMBACK iterates is \(q\)-superlinear if and only if
\[
\|r_{k}^{GB}\|=o(\|F(y_{k})\|), \quad \text{as } k \rightarrow \infty.
\]

Proof. Taking \(\Delta_{k}=-\Delta_{A_{k}}^{GB}\), \(\delta_{k}=0\) and \(\hat{r}_{k}=0\), \(k=0,1,\ldots\), in Theorem 1.3, we are led by the writing (3.1) and by Proposition 2.1 to the stated assertion.
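For completeness, the reduction in this proof can be verified directly (a short computation we add here): with \(\Delta_{k}=-\Delta_{A_{k}}^{GB}\) and \(\delta_{k}=\hat{r}_{k}=0\), equation (3.1) reads \((F'(y_{k})+\Delta_{k}) s_{k}^{GB}=-F(y_{k})\), hence
\[
\Delta_{k}\,(F'(y_{k})+\Delta_{k})^{-1} F(y_{k}) = -\Delta_{k}\, s_{k}^{GB} = \Delta_{A_{k}}^{GB}\, s_{k}^{GB},
\]
so the quantity characterized in Theorem 1.3 has 2-norm \(\|\Delta_{A_{k}}^{GB} s_{k}^{GB}\|_{2}=\|r_{k}^{GB}\|_{2}\) by Proposition 2.1 (and all norms on \(\mathbb{R}^{N}\) are equivalent).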
Remark. Since the proofs in our IPN model relied on the IN one, Theorems 3.2 and 3.3 were expected to yield the same results.

To complete our initial assertion, it suffices to show that \((\|F(y_{k})\|)_{k \geq 0}\) and \((\|s_{k}\|)_{k \geq 0}\) have the same rate of convergence. This is true by the following considerations.
Walker proved for an arbitrary convergent sequence \((y_{k})_{k \geq 0} \subset \mathbb{R}^{N}\) that it converges \(q\)-superlinearly exactly when \((y_{k+1}-y_{k})_{k \geq 0}\) does.

Lemma 3.4. [24]. Let \((y_{k})_{k \geq 0} \subset \mathbb{R}^{N}\) be a convergent sequence and denote \(s_{k}=y_{k+1}-y_{k}\), \(k=0,1,\ldots\) Then \((y_{k})_{k \geq 0}\) converges \(q\)-superlinearly if and only if \((s_{k})_{k \geq 0}\) converges \(q\)-superlinearly.
The fact that \((y_{k})_{k \geq 0}\) and \((s_{k})_{k \geq 0}\) have precisely the same rate when converging \(q\)-superlinearly has been known for a longer time, by a result of Dennis and Moré.

Lemma 3.5. [7]. Under the hypotheses of the above Lemma, if \((y_{k})_{k \geq 0}\) converges \(q\)-superlinearly to \(y^{*}\), then
\[
\lim_{k \rightarrow \infty} \frac{\|y^{*}-y_{k}\|}{\|y_{k+1}-y_{k}\|}=1.
\]
The connection between the rates of \(\|y^{*}-y_{k}\|\) and \(\|F(y_{k})\|\) is expressed by the following result obtained by Dembo, Eisenstat and Steihaug:

Lemma 3.6. [6]. Under the standard assumptions on \(F\) there exist \(\varepsilon, \beta>0\) such that
\[
\frac{1}{\beta}\|y^{*}-y\| \leq \|F(y)\| \leq \beta\|y^{*}-y\| \quad \text{for all } \|y-y^{*}\| \leq \varepsilon.
\]
The equivalence of the characterizations is thus established.

REFERENCES

[1] P.N. Brown, A theoretical comparison of the Arnoldi and GMRES algorithms, SIAM J. Sci. Stat. Comput., 12 (1991) no. 1, pp. 58-78.
[2] P.N. Brown and H.F. Walker, GMRES on (nearly) singular systems, SIAM J. Matrix Anal. Appl., 18 (1997) no. 1, pp. 37-51.
[3] E. Cătinaş, A note on inexact secant methods, Rev. Anal. Numér. Théor. Approx., 25 (1996) nos. 1-2, pp. 33-41.
[4] E. Cătinaş, Newton and Newton-Krylov methods for solving nonlinear systems in \(\mathbb{R}^n\), Ph.D. Thesis, Cluj-Napoca, Romania, (to be defended).
[5] E. Cătinaş, Inexact perturbed Newton methods and some applications for a class of Krylov solvers, J. Optim. Theory Appl., submitted.
[6] R.S. Dembo, S.C. Eisenstat and T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal., 19 (1982) no. 2, pp. 400-408.
[7] J.E. Dennis, Jr. and J.J. Moré, A characterization of superlinear convergence and its application to quasi-Newton methods, Math. Comput., 28 (1974) no. 126, pp. 549-560.
[8] J.E. Dennis, Jr. and J.J. Moré, Quasi-Newton methods, motivation and theory, SIAM Review, 19 (1977) no. 1, pp. 46-89.
[9] J.E. Dennis, Jr. and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall Series in Computational Mathematics, Englewood Cliffs, NJ, 1983.
[10] S.C. Eisenstat and H.F. Walker, Choosing the forcing terms in an inexact Newton method, SIAM J. Sci. Comput., 17 (1996) no. 1, pp. 16-32.
[11] E.M. Kasenally, GMBACK: a generalised minimum backward error algorithm for nonsymmetric linear systems, SIAM J. Sci. Comput., 16 (1995) no. 3, pp. 698-719.
[12] E.M. Kasenally and V. Simoncini, Analysis of a minimum perturbation algorithm for nonsymmetric linear systems, SIAM J. Numer. Anal., 34 (1997) no. 1, pp. 48-66.
[13] C.T. Kelley, Iterative Methods for Linear and Nonlinear Equations, SIAM, Philadelphia, 1995.
[14] H.J. Martinez, Z. Parada and R.A. Tapia, On the characterization of Q-superlinear convergence of quasi-Newton interior-point methods for nonlinear programming, Bol. Soc. Mat. Mexicana, 1 (1995) no. 3, pp. 137-148.
[15] J.M. Ortega and W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
[16] I. Păvăloiu, Introduction in the Approximation of the Solutions of Equations Theory, Ed. Dacia, Cluj-Napoca, 1976 (in Romanian).
[17] F.A. Potra and V. Pták, Nondiscrete Induction and Iterative Processes, Pitman, London, 1984.
[18] F.A. Potra, On Q-order and R-order of convergence, J. Optim. Theory Appl., 63 (1989) no. 3, pp. 415-431.
[19] W.C. Rheinboldt, Methods for Solving Systems of Nonlinear Equations, SIAM, Philadelphia, 1996.
[20] Y. Saad, Krylov subspace methods for solving large unsymmetric linear systems, Math. Comp., 37 (1981) no. 155, pp. 105-126.
[21] Y. Saad, Iterative Methods for Sparse Linear Systems, PWS Pub. Co., Boston, 1996.
[22] Y. Saad and M.H. Schultz, GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7 (1986) no. 3, pp. 856-869.
[23] G.W. Stewart and J.-G. Sun, Matrix Perturbation Theory, Academic Press, New York, 1990.
[24] H.F. Walker, An approach to continuation using Krylov subspace methods, Research Report 1/97/89, Dept. of Math., Utah State University, submitted to Computational Science in the 21st Century.
Received March 3, 1998
"T. Popoviciu" Institute of Numerical Analysis
Romanian Academy
P.O. Box 68, 3400 Cluj-Napoca 1
Romania

*Work supported by the Romanian Academy under grant GAR 97/1999.
