We show that a new sufficient condition for the convergence with q-order two of the inexact Newton iterates may be obtained by considering the normwise backward error of the approximate steps and a result on perturbed Newton methods. This condition is in fact equivalent to the characterization given by Dembo, Eisenstat and Steihaug.
Authors
Emil Cătinaş
(Tiberiu Popoviciu Institute of Numerical Analysis, Romanian Academy)
Keywords
nonlinear system of equations in Rn; inexact Newton method; residual; local convergence; forcing term; q-convergence order.
[1] E. Catinas, Inexact perturbed Newton methods, and some applications for a class of Krylov solvers, J. Optim. Theory Appl.,108 (2001) no. 3, pp. 543–570.
[2] E. Catinas, On the high convergence orders of the Newton-GMBACK methods, Rev. Anal. Num ́er. Th ́eor. Approx., 28 (1999) no. 2, pp. 125–132.
[3] E. Catinas, Newton and Newton-Krylov methods for solving nonlinear systems in Rn, Ph.D. thesis, ”Babes-Bolyai” University of Cluj–Napoca, Romania, 1999.
[4] E. Catinas, The relationship between three practical models of Newton methods with high convergence orders , submitted.
[5] E. Catinas, Inexact perturbed Newton methods for nonlinear systems with singular Jacobians, submitted.
[6] E. Catinas, The high convergence orders of the successive approximations, submitted.
[7] E. Catinas, Finite difference approximations in the Newton-Krylov methods, manuscript.
[8] D. Cores and R.A. Tapia, Perturbation Lemma for the Newton Method with Application to the SQP Newton Method, J. Optim. Theory Appl., 97 (1998), pp. 271–280.
[9] R. S. Dembo, S. C. Eisenstat and T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal., 19 (1982), pp. 400–408.
[10] J. E. Dennis, Jr. and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice–Hall Series in Computational Mathematics, Englewood Cliffs, NJ, 1983.
[11] N. J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, PA, 1996.
[12] St. Maruster, Numerical Methods for Solving Nonlinear Equations, Ed. Tehnica, Bucharest, 1981 (in Romanian).
[13] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970. 5 Quadratic convergence of the inexact Newton methods 133
[14] I. Pavaloiu, Introduction to the Approximation of the Solutions of Equations, Ed. Dacia, Cluj-Napoca, 1976 (in Romanian).
[15] F. A. Potra, On Q-order and R-order of convergence, J. Optim. Theory Appl., 63 (1989), pp. 415–431.
[16] W. C. Rheinboldt, Methods for Solving Systems of Nonlinear Equations, Second ed., SIAM, Philadelphia, 1998.
[17] J. L. Rigal and J. Gaches, On the compatibility of a given solution with the data of a linear system, J. ACM, 14 (1967), pp. 543–548.
[18] H. F. Walker, An approach to continuation using Krylov subspace methods, Research Report 1/97/89, Dept. of Math., Utah State University, appeared in ComputationalScience in the 21st Century, M.–O. Bristeau, G. Etgen, W. Fitzgibbon, J. L. Lions, J.
Periaux and M. F. Wheeler, eds., John Wiley and Sons, Ltd., 1997, pp. 72–82
A NOTE ON THE QUADRATIC CONVERGENCE OF THE INEXACT NEWTON METHODS*
EMIL CĂTINAŞ
Abstract
We show that a new sufficient condition for the convergence with qq-order two of the inexact Newton iterates may be obtained by considering the normwise backward error of the approximate steps and a result on perturbed Newton methods.
This condition is in fact equivalent to the characterization given by Dembo, Eisenstat and Steihaug.
AMS Subject Classification: 65 H 10 .
1. INTRODUCTION
The inexact Newton (IN) method is given by the algorithm
Choose an initial approximation y_(0)in Dy_{0} \in D
For k=0,1,dotsk=0,1, \ldots until "convergence" do
Find s_(k)s_{k} such that F^(')(y_(k))s_(k)=-F(y_(k))+r_(k)F^{\prime}\left(y_{k}\right) s_{k}=-F\left(y_{k}\right)+r_{k}
Set y_(k+1)=y_(k)+s_(k)y_{k+1}=y_{k}+s_{k},
and it constitutes a classical model for the practical solving of nonlinear systems F(y)=0F(y)=0 by the Newton method, where F:D subeR^(n)rarrR^(n)F: D \subseteq \mathbb{R}^{n} \rightarrow \mathbb{R}^{n}. The error terms (the residuals) r_(k)r_{k} represent the amounts by which the approximate solutions s_(k)s_{k} fail to satisfy the exact linear systems F^(')(y_(k))s=-F(y_(k))F^{\prime}\left(y_{k}\right) s=-F\left(y_{k}\right).
Under certain conditions, the IN iterates converge to a solution y^(**)y^{*} of the mentioned nonlinear system, the convergence order being given by the magnitude of the residuals (see [9]).
We are interested in the high convergence orders of the iterates, namely in the convergence with qq-order two (for the definitions of the convergence orders see [13, ch.9], and also [16], [15]). The (standard) assumptions on FF for this case are the following:
there exists y^(**)in Dy^{*} \in D such that F(y^(**))=0F\left(y^{*}\right)=0;
the mapping FF is differentiable on a neighborhood of y^(**)y^{*}, with the derivative F^(')F^{\prime} Lipschitz continuous at y^(**)⊴^(1)y^{*} \unlhd^{1}
||F^(')(y)-F^(')(y^(**))|| <= L||y-y^(**)||,quad" for some "L >= 0," when "||y-y^(**)|| < epsi;\left\|F^{\prime}(y)-F^{\prime}\left(y^{*}\right)\right\| \leq L\left\|y-y^{*}\right\|, \quad \text { for some } L \geq 0, \text { when }\left\|y-y^{*}\right\|<\varepsilon ;
the Jacobian F^(')(y^(**))F^{\prime}\left(y^{*}\right) is nonsingular.
Dembo, Eisenstat and Steihaug proved the following result.
Theorem 1. [9] Suppose that F obeys the standard assumptions and that for some initial approximation y_(0)in Dy_{0} \in D, the sequence (y_(k))_(k >= 0)\left(y_{k}\right)_{k \geq 0} given by the IN method converges to y^(**)y^{*}. Then its qq-convergence order is at least two if and only if ||r_(k)||=O(||F(y_(k))||^(2))\left\|r_{k}\right\|=\mathcal{O}\left(\left\|F\left(y_{k}\right)\right\|^{2}\right) as k rarr ook \rightarrow \infty, or, equivalently,
{:(1)(||r_(k)||)/(||F(y_(k))||)=O(||F(y_(k))||)","quad" as "k rarr oo.:}\begin{equation*}
\frac{\left\|r_{k}\right\|}{\left\|F\left(y_{k}\right)\right\|}=\mathcal{O}\left(\left\|F\left(y_{k}\right)\right\|\right), \quad \text { as } k \rightarrow \infty . \tag{1}
\end{equation*}
In our recent paper [1] (see also [3]) we have introduced the inexact perturbed Newton methods, characterizing their convergence orders in terms of perturbations and residuals. In the following analysis we shall consider the perturbed Newton (PN) iterates:
The matrices Delta_(k)\Delta_{k} and the vectors delta_(k)\delta_{k} are some arbitrary perturbations to the linear systems F^(')(y_(k))s=-F(y_(k))F^{\prime}\left(y_{k}\right) s=-F\left(y_{k}\right), the perturbed linear systems being verified by the exact solutions s_(k)s_{k} (it is implicitely assumed that the perturbed Jacobians F^(')(y_(k))+Delta_(k)F^{\prime}\left(y_{k}\right)+\Delta_{k} are invertible for k=0,1,dotsk=0,1, \ldots ). We have obtained the following result: ^(2){ }^{2}
Corollary 2. [1, [8] Suppose that F obeys the standard assumptions and that for some initial approximation y_(0)in Dy_{0} \in D, the sequence (y_(k))_(k >= 0)\left(y_{k}\right)_{k \geq 0} given by the PN method is well defined and converges to y^(**)y^{*}. If
||Delta_(k)||=O(||F(y_(k))||)" and "||delta_(k)||=O(||F(y_(k))||^(2)),quad" as "k rarr oo,\left\|\Delta_{k}\right\|=\mathcal{O}\left(\left\|F\left(y_{k}\right)\right\|\right) \text { and }\left\|\delta_{k}\right\|=\mathcal{O}\left(\left\|F\left(y_{k}\right)\right\|^{2}\right), \quad \text { as } k \rightarrow \infty,
then (y_(k))_(k >= 0)\left(y_{k}\right)_{k \geq 0} converges with qq-order at least two.
In the following section we shall show that the sufficient condition (1) may be expressed in an equivalent form.
2. MAIN RESULT
Let A inR^(n xx n)A \in \mathbb{R}^{n \times n} be a nonsingular matrix, b inR^(n)b \in \mathbb{R}^{n} an arbitrary vector, and consider an approximate solution tilde(x)inR^(n)\tilde{x} \in \mathbb{R}^{n} of the linear system Ax=bA x=b. The normwise backward error of tilde(x)\tilde{x} was introduced by Rigal and Gaches [17] and is defined by: Pi( tilde(x))=min{epsi:(A+Delta_(A))( tilde(x))=b+Delta_(b),||Delta_(A)||_(F) <= epsi||E||_(F),||Delta_(b)||_(2) <= epsi||f||_(2)}\Pi(\tilde{x})=\min \left\{\varepsilon:\left(A+\Delta_{A}\right) \tilde{x}=b+\Delta_{b},\left\|\Delta_{A}\right\|_{F} \leq \varepsilon\|E\|_{F},\left\|\Delta_{b}\right\|_{2} \leq \varepsilon\|f\|_{2}\right\}, where the parameters E inR^(n xx n)E \in \mathbb{R}^{n \times n} and f inR^(n)f \in \mathbb{R}^{n} are arbitrary, and ||*||_(F)\|\cdot\|_{F} denotes the Frobenius norm: ||Z||_(F)=(sum_(i,j=1,dots,n)z_(ij)^(2))^(1//2)\|Z\|_{F}=\left(\sum_{i, j=1, \ldots, n} z_{i j}^{2}\right)^{1 / 2}. The value of Pi( tilde(x))\Pi(\tilde{x}) is
With these relations we can prove the following result.
Theorem 3. Suppose that FF obeys the standard assumptions and that for some initial approximation y_(0)in Dy_{0} \in D, the sequence (y_(k))_(k >= 0)\left(y_{k}\right)_{k \geq 0} given by the IN method converges to y^(**)y^{*}. If
{:(2)(||r_(k)||_(2))/(||s_(k)||_(2)+||F(y_(k))||_(2))=O(||F(y_(k))||_(2))","quad" as "k rarr oo:}\begin{equation*}
\frac{\left\|r_{k}\right\|_{2}}{\left\|s_{k}\right\|_{2}+\left\|F\left(y_{k}\right)\right\|_{2}}=\mathcal{O}\left(\left\|F\left(y_{k}\right)\right\|_{2}\right), \quad \text { as } k \rightarrow \infty \tag{2}
\end{equation*}
then the iterates converge with qq-order at least two.
Proof. For each kk, consider the normwise backward errors of the approximate steps s_(k)s_{k}, choosing the parameters EE and ff such that ||E||_(F)=||F(y_(k))||_(2)\|E\|_{F}=\left\|F\left(y_{k}\right)\right\|_{2} and ||f||_(2)=||F(y_(k))||_(2)^(2)\|f\|_{2}=\left\|F\left(y_{k}\right)\right\|_{2}^{2}. Corollary 2 and the inequality ||Z||_(2) <= ||Z||_(F)\|Z\|_{2} \leq\|Z\|_{F}, true for all Z inR^(n xx n)Z \in \mathbb{R}^{n \times n} (cf., e.g., [11]), lead to the stated result, provided that the epsi\varepsilon 's are uniformly bounded. This condition is written as
Pi(s_(k))=(||r_(k)||_(2))/(||s_(k)||_(2)*||F(y_(k))||_(2)+||F(y_(k))||_(2)^(2)) <= K,quad" for some fixed "K > 0\Pi\left(s_{k}\right)=\frac{\left\|r_{k}\right\|_{2}}{\left\|s_{k}\right\|_{2} \cdot\left\|F\left(y_{k}\right)\right\|_{2}+\left\|F\left(y_{k}\right)\right\|_{2}^{2}} \leq K, \quad \text { for some fixed } K>0
which is exactly relation (2).
Remarks. 1. Condition (2) is in fact equivalent to condition (1). Indeed, Theorem 3 shows that relation (2) implies the quadratic convergence. For the converse, assuming that the IN iterates converge quadratically, we notice that
which shows that quadratic convergence implies (2) (the right inequality above is ensured by Theorem 1 ).
2. It is easy to prove that the Euclidean norm from (2) may be replaced by an arbitrary norm on R^(n)\mathbb{R}^{n}, since all norms are equivalent on a finite dimensional normed space.
3. Condition (2) has been obtained in a natural fashion by considering the normwise backward errors of the approximate steps. One can also obtain it taking into account that under the standard assumptions, the sequences (y_(k)-y^(**))_(k >= 0)\left(y_{k}-y^{*}\right)_{k \geq 0} and (s_(k))_(k >= 0)\left(s_{k}\right)_{k \geq 0} converge quadratically to zero only at the same time, with lim_(k rarr oo)||y_(k)-y^(**)||//||y_(k+1)-y_(k)||=1\lim _{k \rightarrow \infty}\left\|y_{k}-y^{*}\right\| /\left\|y_{k+1}-y_{k}\right\|=1 (see [18]). The same situation appears concerning (y_(k)-y^(**))_(k >= 0)\left(y_{k}-y^{*}\right)_{k \geq 0} and (F(y_(k)))_(k >= 0)\left(F\left(y_{k}\right)\right)_{k \geq 0}, since the standard assumptions ensure that there exists alpha > 0\alpha>0 such that ||y-y^(**)||//alpha <= ||F(y)|| <= alpha||y-y^(**)||\left\|y-y^{*}\right\| / \alpha \leq\|F(y)\| \leq \alpha\left\|y-y^{*}\right\| when yy is sufficiently close to y^(**)y^{*} (see [9]).
Received June 7, 1999
"T. Popoviciu" Institute of Numerical Analysis
Romanian Academy
P.O. Box 68, 3400 Cluj-Napoca 1
Romania
*Work supported by the National Agency for Science, Technology and Innovation under grant GANSTI 6100 GR/2000.
^(1){ }^{1} We shall use the symbols ||*||\|\cdot\| and ||*||_(2)\|\cdot\|_{2} for an arbitrary, resp. for the Euclidean norm on R^(n)\mathbb{R}^{n}, and for their induced operator norms. ^(2){ }^{2} The same result can be found in [8]; we have obtained independently this Corollary in the manuscript of 1 , before 8 was published.