The relationship between the models of perturbed Newton iterations, with applications

Abstract

The quasi-Newton method and the inexact Newton method are classical models of Newton iterates which take into account certain error terms. We have recently introduced the inexact perturbed Newton method, which takes into account all the possible sources of perturbations. We show that these models are in fact equivalent, in the sense that each one may be used to characterize the high convergence orders of the other two.

Authors

Emil Catinas, Tiberiu Popoviciu Institute of Numerical Analysis, Romanian Academy

Emil Cătinaş

Keywords

?

Paper coordinates

E. Cătinaş, The relationship between the models of perturbed Newton iterations, with applications, PAMM, 5 (2005) no. 1, pp. 785-786, https://doi.org/10.1002/pamm.200510367

PDF

About this paper

Journal

PAMM · Proc. Appl. Math. Mech.

Publisher Name

Wiley

DOI
Print ISSN

1617-7061

Online ISSN
1617-7061

google scholar link

[1] E. Catinas, Inexact perturbed Newton methods and applications to a class of Krylov solvers, J. Optim. Theory Appl. 108 (2001),543–570.
[2] E. Catinas, The inexact, inexact perturbed and quasi-Newton methods are equivalent models, Math. Comp., 74 (2005) no. 249, pp.291–301.
[3] R. S. Dembo, S. C. Eisenstat and T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal. 19 (1982), 400–408.
[4] J. E. Dennis, Jr. and J. J. Mor´e, A characterization of superlinear convergence and its application to quasi-Newton methods, Math.Comp. 28 (1974), 549–560.
[5] E. M. Kasenally, GMBACK: a generalised minimum backward error algorithm for nonsymmetric linear systems, SIAM J. Sci. Comput.16 (1995), 698–719.
[6] E. M. Kasenally and V. Simoncini, Analysis of a minimum perturbation algorithm for nonsymmetric linear systems, SIAM J. Numer.Anal. 34 (1997), 48–66.
[7] C. T. Kelley, Iterative Methods for Linear and Nonlinear Equations, SIAM, Philadelphia, 1995.
[8] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
[9] F. A. Potra, Q-superlinear convergence of the iterates in primal-dual interior-point methods, Math. Progr., 91 no. 1, Ser. A, pp. 99–115,2001.[10] W. C. Rheinboldt, Methods for Solving Systems of Nonlinear Equations, Second ed., SIAM, Philadelphia, PA, 1998.

Paper (preprint) in HTML form

2005-Catinas-PAMM-The-relationship-between-the-models-of-perturbed-Newton-iterations-with-applicatio

The relationship between the models of perturbed Newton iterations, with applications

Emil Cătinaş*"T. Popoviciu" Institute of Numerical Analysis, P.O. Box 68-1, Cluj-Napoca, Romania.

Abstract

The quasi-Newton method and the inexact Newton method are classical models of Newton iterates which take into account certain error terms. We have recently introduced the inexact perturbed Newton method, which takes into account all the possible sources of perturbations. We show that these models are in fact equivalent, in the sense that each one may be used to characterize the high convergence orders of the other two.

1 The equivalence of the inexact and quasi-Newton methods

Let F ( x ) = 0 F ( x ) = 0 F(x)=0F(x)=0F(x)=0 be a nonlinear system, with F : D R n R n F : D R n R n F:D subeR^(n)rarrR^(n)F: D \subseteq \mathbb{R}^{n} \rightarrow \mathbb{R}^{n}F:DRnRn. The local convergence of the Newton iterates
F ( x k ) s k = F ( x k ) , x k + 1 = x k + s k , k = 0 , 1 , , x 0 D , F x k s k = F x k , x k + 1 = x k + s k , k = 0 , 1 , , x 0 D , F^(')(x_(k))s_(k)=-F(x_(k)),quadx_(k+1)=x_(k)+s_(k),quad k=0,1,dots,quadx_(0)in D,F^{\prime}\left(x_{k}\right) s_{k}=-F\left(x_{k}\right), \quad x_{k+1}=x_{k}+s_{k}, \quad k=0,1, \ldots, \quad x_{0} \in D,F(xk)sk=F(xk),xk+1=xk+sk,k=0,1,,x0D,
is studied under the following usual conditions (which will be implicitly assumed hereafter):
  • there exists x int D x int D x^(**)in int Dx^{*} \in \operatorname{int} DxintD such that F ( x ) = 0 F x = 0 F(x^(**))=0F\left(x^{*}\right)=0F(x)=0;
  • the mapping F F FFF is Fréchet differentiable on int D int D int D\operatorname{int} DintD, with F F F^(')F^{\prime}F continuous at x x x^(**)x^{*}x;
  • the Jacobian F ( x ) F x F^(')(x^(**))F^{\prime}\left(x^{*}\right)F(x) is invertible.
Given a norm ||*||\|\cdot\| on R n R n R^(n)\mathbb{R}^{n}Rn, these hypotheses assure the existence of a radius r > 0 r > 0 r > 0r>0r>0 such that the Newton iterates converge q q qqq-superlinearly to x x x^(**)x^{*}x for any initial approximation x 0 x 0 x_(0)x_{0}x0 with x 0 x < r x 0 x < r ||x_(0)-x^(**)|| < r\left\|x_{0}-x^{*}\right\|<rx0x<r [8, Th.10.2.2] (see also [10, Th.4.4]):
lim sup k x k + 1 x x k x = 0 , ( assuming x k x for all k k 0 ) , lim sup k x k + 1 x x k x = 0 ,  assuming  x k x  for all  k k 0 , l i m   s u p_(k rarr oo)(||x_(k+1)-x^(**)||)/(||x_(k)-x^(**)||)=0,quad(" assuming "x_(k)!=x^(**)" for all "k >= k_(0)),\limsup _{k \rightarrow \infty} \frac{\left\|x_{k+1}-x^{*}\right\|}{\left\|x_{k}-x^{*}\right\|}=0, \quad\left(\text { assuming } x_{k} \neq x^{*} \text { for all } k \geq k_{0}\right),lim supkxk+1xxkx=0,( assuming xkx for all kk0),
also denoted by x k + 1 x = o ( x k x ) x k + 1 x = o x k x ||x_(k+1)-x^(**)||=o(||x_(k)-x^(**)||)\left\|x_{k+1}-x^{*}\right\|=o\left(\left\|x_{k}-x^{*}\right\|\right)xk+1x=o(xkx), as k k k rarr ook \rightarrow \inftyk.
In the practical applications, there are used mainly two models of perturbed Newton iterates: the quasi-Newton (QN) and the inexact Newton (IN) method
B k s k = F ( x k ) , F ( x k ) s k = F ( x k ) + r k . B k s k = F x k , F x k s k = F x k + r k . {:[B_(k)s_(k)=-F(x_(k))","],[F^(')(x_(k))s_(k)=-F(x_(k))+r_(k).]:}\begin{aligned} B_{k} s_{k} & =-F\left(x_{k}\right), \\ F^{\prime}\left(x_{k}\right) s_{k} & =-F\left(x_{k}\right)+r_{k} . \end{aligned}Bksk=F(xk),F(xk)sk=F(xk)+rk.
The first model assumes the use of perturbed Jacobians, while the second assumes that at each step the correction s k s k s_(k)s_{k}sk is only approximately determined (i.e., having nonzero residual r k r k r_(k)r_{k}rk ). The superlinear convergence (as well as the q q qqq-orders p ( 1 , 2 ] p ( 1 , 2 ] p in(1,2]p \in(1,2]p(1,2] ) of these methods was characterized by Dennis and Moré, respectively Dembo, Eisenstat and Steihaug.
Theorem 1.1 ([4]) Consider a sequence ( B k ) k 0 R n × n B k k 0 R n × n (B_(k))_(k >= 0)subR^(n xx n)\left(B_{k}\right)_{k \geq 0} \subset \mathbb{R}^{n \times n}(Bk)k0Rn×n of invertible matrices and an initial approximation x 0 D x 0 D x_(0)in Dx_{0} \in Dx0D. If the QN iterates converge to x x x^(**)x^{*}x, then they converge superlinearly iff
(DM) ( B k F ( x ) ) ( x k + 1 x k ) x k + 1 x k 0 , as k . (DM) B k F x x k + 1 x k x k + 1 x k 0 ,  as  k . {:(DM)(||(B_(k)-F^(')(x^(**)))(x_(k+1)-x_(k))||)/(||x_(k+1)-x_(k)||)rarr0","quad" as "k rarr oo.:}\begin{equation*} \frac{\left\|\left(B_{k}-F^{\prime}\left(x^{*}\right)\right)\left(x_{k+1}-x_{k}\right)\right\|}{\left\|x_{k+1}-x_{k}\right\|} \rightarrow 0, \quad \text { as } k \rightarrow \infty . \tag{DM} \end{equation*}(DM)(BkF(x))(xk+1xk)xk+1xk0, as k.
Theorem 1.2 ([3]) If the IN iterates converge to x x x^(**)x^{*}x, then the convergence is superlinear iff
(DES) r k = o ( F ( x k ) ) , as k . (DES) r k = o F x k ,  as  k . {:(DES)||r_(k)||=o(||F(x_(k))||)","quad" as "k rarr oo.:}\begin{equation*} \left\|r_{k}\right\|=o\left(\left\|F\left(x_{k}\right)\right\|\right), \quad \text { as } k \rightarrow \infty . \tag{DES} \end{equation*}(DES)rk=o(F(xk)), as k.
Denoting Δ k = B k F ( x k ) Δ k = B k F x k Delta_(k)=B_(k)-F^(')(x_(k))\Delta_{k}=B_{k}-F^{\prime}\left(x_{k}\right)Δk=BkF(xk), the quasi-Newton iterates are transcribed as (QN'):
( F ( x k ) + Δ k ) s k = F ( x k ) . F x k + Δ k s k = F x k . (F^(')(x_(k))+Delta_(k))s_(k)=-F(x_(k)).\left(F^{\prime}\left(x_{k}\right)+\Delta_{k}\right) s_{k}=-F\left(x_{k}\right) .(F(xk)+Δk)sk=F(xk).
The following two results show the connection between the conditions for the superlinear convergence of the two methods.
Theorem 1.3 ([2]) If the QN ' iterates are written as IN iterates,
F ( x k ) s k = F ( x k ) Δ k s k F x k s k = F x k Δ k s k F^(')(x_(k))s_(k)=-F(x_(k))-Delta_(k)s_(k)F^{\prime}\left(x_{k}\right) s_{k}=-F\left(x_{k}\right)-\Delta_{k} s_{k}F(xk)sk=F(xk)Δksk
then the (DES) condition for these resulted iterates is equivalent to the (DM) condition for the initial QN' iterates.
Let 2 2 ||*||_(2)\|\cdot\|_{2}2 denote the Euclidean norm.
Theorem 1.4 ([2]) If the IN are written as QN ' iterates,
( F ( x k ) 1 s k 2 2 r k s k t ) s k = F ( x k ) F x k 1 s k 2 2 r k s k t s k = F x k (F^(')(x_(k))-(1)/(||s_(k)||_(2)^(2))r_(k)s_(k)^(t))s_(k)=-F(x_(k))\left(F^{\prime}\left(x_{k}\right)-\frac{1}{\left\|s_{k}\right\|_{2}^{2}} r_{k} s_{k}^{t}\right) s_{k}=-F\left(x_{k}\right)(F(xk)1sk22rkskt)sk=F(xk)
then the (DM) condition for these resulted iterates is equivalent to the (DES) condition for the initial IN iterates.

2 The inexact perturbed Newton method

In [1] we have considered the inexact perturbed Newton (IPN) method and characterized its superlinear convergence:
( F ( x k ) + Δ k ) s k = ( F ( x k ) + δ k ) + r ^ k F x k + Δ k s k = F x k + δ k + r ^ k (F^(')(x_(k))+Delta_(k))s_(k)=(-F(x_(k))+delta_(k))+ hat(r)_(k)\left(F^{\prime}\left(x_{k}\right)+\Delta_{k}\right) s_{k}=\left(-F\left(x_{k}\right)+\delta_{k}\right)+\hat{r}_{k}(F(xk)+Δk)sk=(F(xk)+δk)+r^k
Previous results contained only sufficient conditions. The elements Δ k R n × n Δ k R n × n Delta_(k)inR^(n xx n)\Delta_{k} \in \mathbb{R}^{n \times n}ΔkRn×n represent perturbations to the Jacobians, δ k R n δ k R n delta_(k)inR^(n)\delta_{k} \in \mathbb{R}^{n}δkRn perturbations to the function evaluations, while r ^ k R n r ^ k R n hat(r)_(k)inR^(n)\hat{r}_{k} \in \mathbb{R}^{n}r^kRn are the residuals of the approximate solutions s k s k s_(k)s_{k}sk of the resulted perturbed linear systems. The perturbation terms may contain errors resulted from floating point representation/arithmetical operations, so this model may be conceived to contain all sources of perturbations. In [1] we have assumed that the perturbed Jacobians are invertible. This condition may be relaxed, and one can assume just that the perturbed linear systems are compatible (i.e., the iterations are well defined):
Theorem 2.1 ([2]) If the IPN iterates are well defined and converge to x x x^(**)x^{*}x, then the convergence is superlinear iff
Δ k s k + δ k + r ^ k = o ( F ( x k ) ) , as k Δ k s k + δ k + r ^ k = o F x k ,  as  k ||-Delta_(k)s_(k)+delta_(k)+ hat(r)_(k)||=o(||F(x_(k))||),quad" as "k rarr oo\left\|-\Delta_{k} s_{k}+\delta_{k}+\hat{r}_{k}\right\|=o\left(\left\|F\left(x_{k}\right)\right\|\right), \quad \text { as } k \rightarrow \inftyΔksk+δk+r^k=o(F(xk)), as k
As shown in [2], the (DM) and (DES) conditions may be retrieved from the above one; on the other hand, the IPN iterates may be regarded as IN iterates, so it follows that the conditions characterizing the high convergence orders of the QN, IN and IPN methods are equivalent.
The above result can be successfully applied to characterize the convergence of some Newton-Krylov methods. Let
A u = b , A R n × n nonsingular, b R n , A u = b , A R n × n  nonsingular,  b R n , Au=b,quad A inR^(n xx n)" nonsingular, "b inR^(n),A u=b, \quad A \in \mathbb{R}^{n \times n} \text { nonsingular, } b \in \mathbb{R}^{n},Au=b,ARn×n nonsingular, bRn,
and consider an initial approximation u 0 u 0 u_(0)u_{0}u0 to the exact solution u = A 1 b u = A 1 b u^(**)=A^(-1)bu^{*}=A^{-1} bu=A1b. Denote K m := span { r 0 , A r 0 , , A m 1 r 0 } K m := span r 0 , A r 0 , , A m 1 r 0 K_(m):=span{r_(0),Ar_(0),dots,A^(m-1)r_(0)}\mathcal{K}_{m}:=\operatorname{span}\left\{r_{0}, A r_{0}, \ldots, A^{m-1} r_{0}\right\}Km:=span{r0,Ar0,,Am1r0}, 1 m n 1 m n 1 <= m <= n1 \leq m \leq n1mn, where r 0 := b A u 0 r 0 := b A u 0 r_(0):=b-Au_(0)r_{0}:=b-A u_{0}r0:=bAu0.
The GMBACK algorithm [5] finds u m G B u 0 + K m u m G B u 0 + K m u_(m)^(GB)inu_(0)+K_(m)u_{m}^{G B} \in u_{0}+\mathcal{K}_{m}umGBu0+Km as argmin for (i.e., minimizes the backward error in A A AAA )
Δ m G B F = min u m u 0 + K m Δ m F w.r.t. ( A Δ m ) u m = b Δ m G B F = min u m u 0 + K m Δ m F  w.r.t.  A Δ m u m = b ||Delta_(m)^(GB)||_(F)=min_(u_(m)inu_(0)+K_(m))||Delta_(m)||_(F)quad" w.r.t. "(A-Delta_(m))u_(m)=b\left\|\Delta_{m}^{G B}\right\|_{F}=\min _{u_{m} \in u_{0}+\mathcal{K}_{m}}\left\|\Delta_{m}\right\|_{F} \quad \text { w.r.t. }\left(A-\Delta_{m}\right) u_{m}=bΔmGBF=minumu0+KmΔmF w.r.t. (AΔm)um=b
where F F ||*||_(F)\|\cdot\|_{F}F denotes the Frobenius norm. Depending on the parameters, the above problem may have a unique, several ones, or no solution at all. Therefore, Theorem 1.1 is not the best choice (whereas Theorem 2.1 is) for characterizing the superlinear convergence of the Newton-GMBACK iterates, when written in the QN form
( F ( x k ) Δ k , m k G B ) s k , m k G B = F ( x k ) F x k Δ k , m k G B s k , m k G B = F x k (F^(')(x_(k))-Delta_(k,m_(k))^(GB))s_(k,m_(k))^(GB)=-F(x_(k))\left(F^{\prime}\left(x_{k}\right)-\Delta_{k, m_{k}}^{G B}\right) s_{k, m_{k}}^{G B}=-F\left(x_{k}\right)(F(xk)Δk,mkGB)sk,mkGB=F(xk)
The same holds for the MINPERT method [6] which finds u m M P u m M P u_(m)^(MP)u_{m}^{M P}umMP as argmin for (i.e., minimizes the joint backward error)
[ Δ m M P δ m M P ] F = min x m x 0 + K m [ Δ m δ m ] F w.r.t. ( A Δ m ) x m = b + δ m Δ m M P δ m M P F = min x m x 0 + K m Δ m δ m F  w.r.t.  A Δ m x m = b + δ m ||[Delta_(m)^(MP)delta_(m)^(MP)]||_(F)=min_(x_(m)inx_(0)+K_(m))||[Delta_(m)delta_(m)]||_(F)quad" w.r.t. "(A-Delta_(m))x_(m)=b+delta_(m)\left\|\left[\Delta_{m}^{M P} \delta_{m}^{M P}\right]\right\|_{F}=\min _{x_{m} \in x_{0}+\mathcal{K}_{m}}\left\|\left[\Delta_{m} \delta_{m}\right]\right\|_{F} \quad \text { w.r.t. }\left(A-\Delta_{m}\right) x_{m}=b+\delta_{m}[ΔmMPδmMP]F=minxmx0+Km[Δmδm]F w.r.t. (AΔm)xm=b+δm
Acknowledgements I would like to express my gratitude to the organizers, who made possible my participation to the conference.

References

[1] E. Cătinaş, Inexact perturbed Newton methods and applications to a class of Krylov solvers, J. Optim. Theory Appl. 108 (2001), 543-570.
[2] E. Cătinaş, The inexact, inexact perturbed and quasi-Newton methods are equivalent models, Math. Comp., 74 (2005) no. 249, pp. 291-301.
[3] R. S. Dembo, S. C. Eisenstat and T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal. 19 (1982), 400-408.
[4] J. E. Dennis, Jr. and J. J. Moré, A characterization of superlinear convergence and its application to quasi-Newton methods, Math. Comp. 28 (1974), 549-560.
[5] E. M. Kasenally, GMBACK: a generalised minimum backward error algorithm for nonsymmetric linear systems, SIAM J. Sci. Comput. 16 (1995), 698-719.
[6] E. M. Kasenally and V. Simoncini, Analysis of a minimum perturbation algorithm for nonsymmetric linear systems, SIAM J. Numer. Anal. 34 (1997), 48-66.
[7] C. T. Kelley, Iterative Methods for Linear and Nonlinear Equations, SIAM, Philadelphia, 1995.
[8] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
[9] F. A. Potra, Q Q QQQ-superlinear convergence of the iterates in primal-dual interior-point methods, Math. Progr., 91 no. 1, Ser. A, pp. 99-115, 2001.
[10] W. C. Rheinboldt, Methods for Solving Systems of Nonlinear Equations, Second ed., SIAM, Philadelphia, PA, 1998.

2005

Related Posts