As we showed in [8], the successive approximations cannot converge q-superlinearly to a fixed point \(x^\ast\) of a nonlinear mapping \(G\) when \(G^\prime(x^\ast)\) has no eigenvalue equal to 0.
However, high q-convergence orders may be attained if one considers perturbed successive approximations.
We characterize the correction terms which must be added at each step in order to obtain convergence with q-order 2 of the resulting iterates.
Authors
Emil Cătinaş
(Tiberiu Popoviciu Institute of Numerical Analysis, Romanian Academy)
Keywords
fixed point problems; acceleration of convergence; nonlinear systems of equations in \(\mathbb{R}^n\); inexact Newton method; linear systems of equations in \(\mathbb{R}^n\); residual; local convergence; q-convergence order.
[1] I. Argyros, F. Szidarovszky, The Theory and Applications of Iteration Methods, CRC Press, Boca Raton, 1993.
[2] I. Argyros, On the convergence of the modified contractions, J. Comp. Appl. Math., 55(1994), 183–189.
[3] E. Catinas, Newton and Newton-Krylov methods for solving nonlinear systems in Rn, PhD Thesis, Babes-Bolyai University of Cluj-Napoca, Cluj-Napoca, Romania, 1999.
[4] E. Catinas, On the high convergence orders of the Newton-GMBACK methods, Rev. Anal. Numer. Theor. Approx., 28 (1999) no. 2, 125-132.
[5] E. Catinas, A note on the quadratic convergence of the inexact Newton methods, Rev. Anal. Numer. Theor. Approx. 29 (2000) no. 2, 129-133.
[6] E. Catinas, Inexact perturbed Newton methods and applications to a class of Krylov solvers, J. Optim. Theory Appl., 108 (2001) no. 3, 543-570.
[7] E. Catinas, The inexact, inexact perturbed and quasi-Newton methods are equivalent models, submitted.
[8] E. Catinas, On the superlinear convergence of the successive approximations method, submitted.
[9] R.S. Dembo, S.C. Eisenstat, T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal., 19 (1982), 400-408.
[10] J.E. Dennis, Jr., J. J. More, A characterization of superlinear convergence and its application to quasi-Newton methods, Math. Comp., 28 (1974), 549-560.
[11] J.E. Dennis, Jr., R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall Series in Computational Mathematics, Englewood Cliffs, 1983.
[12] P. Deuflhard, F.A. Potra, Asymptotic mesh independence of Newton-Galerkin methods via a refined Mysovskii theorem, SIAM J. Numer. Anal., 29 (1992), 1395-1412.
[13] S.C. Eisenstat, H.F. Walker, Choosing the forcing terms in an inexact Newton method, SIAM J. Sci. Comput., 17 (1996), 16-32.
[14] N.J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, 1996.
[15] V.I. Istratescu, Introduction to the Fixed Points Theory, Editura Academiei RSR, Bucharest, Romania, 1973 (in Romanian).
[16] C.T. Kelley, Iterative Methods for Linear and Nonlinear Equations, SIAM, Philadelphia, Pennsylvania, 1995.
[17] St. Maruster, Quasi-nonexpansivity and two classical methods for solving nonlinear equations, Proc. AMS, 62 (1977), 119-123.
[18] St. Maruster, Numerical Methods for Solving Nonlinear Equations, Editura Tehnica, Bucharest, Romania, 1981 (in Romanian).
[19] J.M. Ortega, W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, New York, 1970.
[20] A.M. Ostrowski, Solution of Equations and Systems of Equations, Academic Press, New York, 1966.
[21] I. Pavaloiu, Introduction to the Theory of Approximating the Solutions of Equations, Editura Dacia, Cluj-Napoca, Romania, 1976 (in Romanian).
[22] F.A. Potra, V. Ptak, Nondiscrete Induction and Iterative Processes, Pitman, London, 1984.
[23] F.A. Potra, On Q-order and R-order of convergence, J. Optim. Theory Appl., 63 (1989), 415–431.
[24] F.A. Potra, Q-superlinear convergence of the iterates in primal-dual interior-point methods, Math. Progr., to appear.
[25] W.C. Rheinboldt, Methods for Solving Systems of Nonlinear Equations, SIAM, Philadelphia, 1998.
[26] H.F. Walker, An approach to continuation using Krylov subspace methods, Computational Science in the 21st Century, M.-O. Bristeau, G. Etgen, W. Fitzgibbon, J.L. Lions, J. Periaux and M.F. Wheeler, editors, John Wiley and Sons, Ltd., 72-82, 1997.
ACCELERATING THE CONVERGENCE OF THE SUCCESSIVE APPROXIMATIONS*
EMIL CĂTINAŞ\(^{\dagger}\)
Abstract
In a previous paper, we have shown that no \(q\)-superlinear convergence to a fixed point \(x^{*}\) of a nonlinear mapping \(G\) may be attained by the successive approximations when \(G^{\prime}(x^{*})\) has no eigenvalue equal to 0. However, high convergence orders may theoretically be attained if one considers perturbed successive approximations.
We characterize here the correction terms which must be added at each step in order to obtain convergence with \(q\)-order 2 of the resulting iterates.
We consider the successive approximations
\begin{equation*}
x_{k+1}=G\left(x_{k}\right), \quad k=0,1, \ldots, \quad x_{0} \in D, \tag{1}
\end{equation*}
to a fixed point \(x^{*} \in \operatorname{int}(D)\) of the nonlinear mapping \(G: D \subseteq \mathbb{R}^{n} \rightarrow D\). A classical result on the local convergence of these sequences is given by the Ostrowski theorem. We first recall the definitions of the convergence orders.
Let \(\|\cdot\|\) denote a given norm on \(\mathbb{R}^{n}\).
Definition 1. [19, ch. 9] Let \(\left(x_{k}\right)_{k \geq 0} \subset \mathbb{R}^{n}\) be an arbitrary sequence converging to some \(x^{*} \in \mathbb{R}^{n}\). The quotient and the root convergence factors are defined for each \(\alpha \in[1,+\infty)\) as
\[
\begin{aligned}
& Q_{\alpha}\left\{x_{k}\right\}= \begin{cases}0, & \text { if } x_{k}=x^{*}, \text { for all but finitely many } k, \\
\limsup _{k \rightarrow \infty} \frac{\left\|x_{k+1}-x^{*}\right\|}{\left\|x_{k}-x^{*}\right\|^{\alpha}}, & \text { if } x_{k} \neq x^{*}, \text { for all but finitely many } k, \\
+\infty, & \text { otherwise, }\end{cases} \\
& R_{\alpha}\left\{x_{k}\right\}= \begin{cases}\limsup _{k \rightarrow \infty}\left\|x_{k}-x^{*}\right\|^{1 / \alpha^{k}}, & \text { when } \alpha>1, \\
\limsup _{k \rightarrow \infty}\left\|x_{k}-x^{*}\right\|^{1 / k}, & \text { when } \alpha=1 .\end{cases}
\end{aligned}
\]
The \(q\)- and \(r\)-convergence orders are defined by
\[
\begin{aligned}
O_{Q}\left\{x_{k}\right\} & = \begin{cases}+\infty, & \text { if } Q_{\alpha}\left\{x_{k}\right\}=0, \quad \forall \alpha \in[1,+\infty), \\
\inf \left\{\alpha \in[1,+\infty): Q_{\alpha}\left\{x_{k}\right\}=+\infty\right\}, & \text { otherwise, }\end{cases} \\
O_{R}\left\{x_{k}\right\} & = \begin{cases}+\infty, & \text { if } R_{\alpha}\left\{x_{k}\right\}=0, \quad \forall \alpha \in[1,+\infty), \\
\inf \left\{\alpha \in[1,+\infty): R_{\alpha}\left\{x_{k}\right\}=1\right\}, & \text { otherwise. }\end{cases}
\end{aligned}
\]
When \(Q_{1}\left\{x_{k}\right\}=0\) or \(R_{1}\left\{x_{k}\right\}=0\), the sequence converges \(q\)-, resp. \(r\)-superlinearly; if \(Q_{\alpha_{0}}\left\{x_{k}\right\}<+\infty\) for some \(\alpha_{0}>1\), one may write
\[
\left\|x_{k+1}-x^{*}\right\|=\mathcal{O}\left(\left\|x_{k}-x^{*}\right\|^{\alpha_{0}}\right), \quad \text { as } k \rightarrow \infty .
\]
The \(q\)-convergence rates require stronger conditions than the \(r\)-convergence rates: \(q\)-convergence with a certain order implies \(r\)-convergence with at least the same order, the converse being false. We refer the reader to [19, ch. 9] and [23] (see also [25, ch. 3] and [24]) for further related results.
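The quotients in Definition 1 can be inspected numerically. The following is a minimal sketch, not part of the original analysis: the test sequence (Newton's method for \(x^2=2\)) and the helper names `quotient_factors` and `estimate_q_order` are illustrative assumptions. It computes the quotients \(\|x_{k+1}-x^{*}\| /\|x_{k}-x^{*}\|^{\alpha}\) for \(\alpha=2\) and \(\alpha=3\), and a finite-sample estimate of the \(q\)-order.

```python
import math


def quotient_factors(errors, alpha):
    """Return the quotients e_{k+1} / e_k**alpha whose limsup defines Q_alpha."""
    return [errors[k + 1] / errors[k] ** alpha for k in range(len(errors) - 1)]


def estimate_q_order(errors):
    """Finite-sample order estimate from the last three errors:
    log(e_{k+1} / e_k) / log(e_k / e_{k-1})."""
    e0, e1, e2 = errors[-3:]
    return math.log(e2 / e1) / math.log(e1 / e0)


# Illustrative sequence (an assumption, not taken from the paper):
# Newton's method for x**2 = 2, which converges to sqrt(2) with q-order 2.
x_star = math.sqrt(2.0)
x = 1.5
errors = [abs(x - x_star)]
for _ in range(3):  # stop before the error reaches rounding level
    x = 0.5 * (x + 2.0 / x)
    errors.append(abs(x - x_star))

q2 = quotient_factors(errors, 2.0)  # stays bounded, tends to 1 / (2 * sqrt(2))
q3 = quotient_factors(errors, 3.0)  # grows as the errors shrink
order = estimate_q_order(errors)

print("alpha = 2 quotients:", q2)
print("alpha = 3 quotients:", q3)
print("estimated q-order:", order)
```

Bounded quotients for \(\alpha=2\) together with growing quotients for \(\alpha=3\) are consistent with \(q\)-order 2 for this sequence. Only a few iterates are usable in double precision, so such estimates are indicative rather than conclusive.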
The fixed point \(x^{*}\) is an attraction fixed point if there exists an open ball with center at \(x^{*}\) such that for any initial approximation \(x_{0}\) from that ball, the sequence (1) converges to \(x^{*}\). We shall denote by \(\mathcal{S}\) the set of all such sequences.
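The \(q\)- and \(r\)-factors of the iterative process \(\mathcal{S}\) are then defined as
\[
Q_{\alpha}(\mathcal{S})=\sup \left\{Q_{\alpha}\left\{x_{k}\right\}:\left(x_{k}\right)_{k \geq 0} \in \mathcal{S}\right\}, \quad R_{\alpha}(\mathcal{S})=\sup \left\{R_{\alpha}\left\{x_{k}\right\}:\left(x_{k}\right)_{k \geq 0} \in \mathcal{S}\right\},
\]
the convergence orders being defined in the same way as for a single sequence.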
Now we can state the following classical result (see also [25, Th. 3.5]).
Theorem 1 (Ostrowski). [20, Th. 22.1], [19, Thms. 10.1.3, 10.1.4] Assume that the mapping \(G\) is differentiable at the fixed point \(x^{*} \in \operatorname{int}(D)\). If the spectral radius of \(G^{\prime}\left(x^{*}\right)\) satisfies
\[
\sigma=\rho\left(G^{\prime}\left(x^{*}\right)\right)<1,
\]
then \(x^{*}\) is an attraction fixed point. Moreover, \(R_{1}(\mathcal{S})=\sigma\), and if \(\sigma>0\) then \(O_{R}(\mathcal{S})=O_{Q}(\mathcal{S})=1\).
According to this theorem, when \(\sigma=0\), all the sequences from \(\mathcal{S}\) converge \(r\)-superlinearly; however, this does not imply that they also converge \(q\)-superlinearly (an example is given in [19, E10.1-6]; see also [25, p. 30]). A sufficient condition for \(q\)-superlinear convergence of \(\mathcal{S}\) is that \(G^{\prime}\left(x^{*}\right)=0\) [19, Th. 10.1.6] (see also [25, p. 30]). The \(q\)-convergence order of \(\mathcal{S}\) is even higher under some supplementary smoothness conditions: if \(G\) is continuously differentiable on an open neighborhood of the fixed point \(x^{*} \in \operatorname{int}(D)\), \(G\) is twice differentiable at \(x^{*}\) and \(G^{\prime}\left(x^{*}\right)=0\), then \(O_{R}(\mathcal{S}) \geq O_{Q}(\mathcal{S}) \geq 2\) [19, Th. 10.1.7] (see also [25, Th. 3.6]).
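The linear rate predicted by the Ostrowski theorem when \(0<\sigma<1\) can be observed on a scalar example. The sketch below is only an illustration under stated assumptions: the mapping \(G(x)=\cos x\), the starting point and the iteration counts are choices made here, not taken from the paper. In one dimension the spectral radius reduces to \(\left|G^{\prime}\left(x^{*}\right)\right|\), and the successive error ratios approach this value.

```python
import math


def G(x):
    # Illustrative scalar mapping (an assumption): G(x) = cos(x).
    return math.cos(x)


# Reference fixed point: iterate long enough to reach rounding level.
x_star = 1.0
for _ in range(200):
    x_star = G(x_star)

# |G'(x*)| = |-sin(x*)| plays the role of the spectral radius sigma.
sigma = abs(-math.sin(x_star))

# Successive approximations x_{k+1} = G(x_k) and their errors.
x = 1.0
errors = [abs(x - x_star)]
for _ in range(30):
    x = G(x)
    errors.append(abs(x - x_star))

# Quotients e_{k+1} / e_k; they settle near sigma, i.e. q-order 1.
ratios = [errors[k + 1] / errors[k] for k in range(len(errors) - 1)]

print("sigma          :", sigma)
print("last error ratio:", ratios[-1])
```

Since \(\sigma>0\) here, the error is reduced by roughly the same factor at every step, which is the behaviour the rest of the paper aims to improve.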
These sufficient conditions (which are not also necessary) ensure the high convergence orders of all the successive approximations near the fixed point, but the restrictions on \(G^{\prime}\) are rather strong. In our paper [8] we have characterized the high convergence orders of only one sequence converging to \(x^{*}\). We shall consider here only the \(q\)-order 2.
Theorem 2. [8] Assume that the mapping \(G\) is differentiable on an open neighborhood of the fixed point \(x^{*}\), with \(G^{\prime}\) Lipschitz continuous at \(x^{*}\), i.e., there exist \(L, \varepsilon>0\) such that
\[
\left\|G^{\prime}(x)-G^{\prime}\left(x^{*}\right)\right\| \leq L\left\|x-x^{*}\right\|, \quad \text { when }\left\|x-x^{*}\right\|<\varepsilon .
\]
Suppose also that \(\sigma=\rho\left(G^{\prime}\left(x^{*}\right)\right)<1\). Let \(x_{0} \in D\) be an initial approximation such that the sequence of successive approximations converges to \(x^{*}\). Then \(\left(x_{k}\right)_{k \geq 0}\) converges with \(q\)-order 2 iff
\begin{equation*}
\left\|G^{\prime}\left(x_{k}\right)\left(x_{k}-G\left(x_{k}\right)\right)\right\|=\mathcal{O}\left(\left\|x_{k}-G\left(x_{k}\right)\right\|^{2}\right), \quad \text { as } k \rightarrow \infty . \tag{2}
\end{equation*}
This result allowed us to show that condition (2) is in fact equivalent to
\[
G^{\prime}\left(x^{*}\right)\left(x_{k+1}-x_{k}\right)=0, \quad \forall k \geq k_{0},
\]
i.e., from a certain step, the corrections \(x_{k+1}-x_{k}\) are eigenvectors corresponding to the eigenvalue 0 of \(G^{\prime}\left(x^{*}\right)\). As a direct consequence, it followed that the trajectories with high convergence orders are restricted to affine subspaces.
Besides implying instability in the presence of errors, this result is bad news when \(G^{\prime}\left(x^{*}\right)\) is invertible (i.e., all its eigenvalues are nonzero), since then no trajectory may attain high convergence rates.
We are interested in accelerating the convergence of the successive approximations in the case \(0<\sigma<1\) (or even when \(G^{\prime}\left(x^{*}\right)\) is nonsingular) by adding some correction terms \(\delta_{k} \in \mathbb{R}^{n}\), i.e., by considering the sequence
\begin{equation*}
x_{k+1}=G\left(x_{k}\right)+\delta_{k}, \quad k=0,1, \ldots, \quad x_{0} \in D. \tag{3}
\end{equation*}
In [8] we have characterized the high convergence orders of this sequence, but there the \(\delta_{k}\) were viewed as error terms, and the above sequence was called the perturbed successive approximations. We shall present a new result which allows (at least theoretically) the computation of some terms \(\delta_{k}\) leading to \(q\)-quadratic convergence of the iterations (3).
2. ACCELERATED CONVERGENCE OF THE SUCCESSIVE APPROXIMATIONS
We have obtained the following result:
Theorem 3. [8] Suppose that \(G\) satisfies the assumptions of Theorem 2, and that the sequence (3) of perturbed successive approximations converges to \(x^{*}\). Then the convergence is with \(q\)-order 2 iff
\begin{equation*}
\left\|G^{\prime}\left(x_{k}\right)\left(x_{k}-G\left(x_{k}\right)\right)+\left(I-G^{\prime}\left(x_{k}\right)\right) \delta_{k}\right\|=\mathcal{O}\left(\left\|x_{k}-G\left(x_{k}\right)\right\|^{2}\right), \quad \text { as } k \rightarrow \infty . \tag{4}
\end{equation*}
We note that this result no longer requires \(G^{\prime}\left(x^{*}\right)\) to be singular.
We obtain an equivalent form of condition (4) in the following result:
Theorem 4. Suppose that \(G\) satisfies the assumptions of Theorem 2, that the sequence \(\left(x_{k}\right)_{k \geq 0}\) given by (3) converges to \(x^{*}\), and that the matrices \(I-G^{\prime}\left(x_{k}\right)\) are invertible starting from a certain step. Then the convergence is with \(q\)-order 2 iff
\[
\delta_{k}=\left(I-G^{\prime}\left(x_{k}\right)\right)^{-1}\left(G^{\prime}\left(x_{k}\right)\left(G\left(x_{k}\right)-x_{k}\right)+\gamma_{k}\right),
\]
where \(\left(\gamma_{k}\right)_{k \geq 0} \subset \mathbb{R}^{n}\) is an arbitrary sequence converging to zero with \(\gamma_{k}=\mathcal{O}\left(\left\|x_{k}-G\left(x_{k}\right)\right\|^{2}\right)\), as \(k \rightarrow \infty\).
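Proof. We can easily obtain this result from the previous theorem by denoting
\[
\gamma_{k}=G^{\prime}\left(x_{k}\right)\left(x_{k}-G\left(x_{k}\right)\right)+\left(I-G^{\prime}\left(x_{k}\right)\right) \delta_{k}
\]
and then computing \(\delta_{k}\).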
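To see the effect of such corrections, one can take the left-hand side of condition (4) to be exactly zero and solve for \(\delta_{k}\), which gives \(\delta_{k}=\left(I-G^{\prime}\left(x_{k}\right)\right)^{-1} G^{\prime}\left(x_{k}\right)\left(G\left(x_{k}\right)-x_{k}\right)\); with this choice the iteration \(x_{k+1}=G\left(x_{k}\right)+\delta_{k}\) coincides with Newton's method applied to \(x-G(x)=0\). The following minimal sketch compares the plain and the corrected iterations in the scalar case. The mapping \(G(x)=\cos x\), the starting point and the helper names are illustrative assumptions, not taken from the paper.

```python
import math


def G(x):
    # Illustrative scalar mapping (an assumption): G(x) = cos(x).
    return math.cos(x)


def dG(x):
    # Derivative of the illustrative mapping.
    return -math.sin(x)


def correction(x):
    """Scalar correction delta = (1 - G'(x))**-1 * G'(x) * (G(x) - x),
    i.e. condition (4) satisfied with a zero left-hand side."""
    return dG(x) * (G(x) - x) / (1.0 - dG(x))


def residuals(step, x0, n):
    """Apply `step` n times from x0 and record |x_k - G(x_k)| at each iterate."""
    x = x0
    out = [abs(x - G(x))]
    for _ in range(n):
        x = step(x)
        out.append(abs(x - G(x)))
    return out


# Plain successive approximations x_{k+1} = G(x_k).
plain = residuals(G, 1.0, 5)
# Corrected iterations x_{k+1} = G(x_k) + delta_k.
accelerated = residuals(lambda x: G(x) + correction(x), 1.0, 5)

print("plain      :", plain)
print("accelerated:", accelerated)
```

In this example the residuals of the corrected iteration reach rounding level within a few steps, while the plain iteration still reduces the residual by a roughly constant factor per step. The sketch assumes \(1-G^{\prime}(x) \neq 0\) along the iterates, which holds here since \(1+\sin x>0\) near the fixed point.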
Under the assumption that \(\left\|G^{\prime}(x)\right\|<q<1\) for all \(x\) in a certain neighborhood of \(x^{*}\), a natural choice (implied by the Banach lemma) for \(\delta_{k}\) is
for some fixed \(K>0\).
This choice may be useful when the powers of \(G^{\prime}(x)\) and their sums are computationally inexpensive (e.g., when \(G^{\prime}(x)\) is sparse). However, it is known that for a matrix \(A \in \mathbb{R}^{n \times n}\) with \(\rho(A)<1\), additional conditions are needed in order that \(A^{k} \rightarrow 0\) in floating point arithmetic (see [14, ch. 17] for a discussion of this topic).
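The Banach lemma behind this choice states that if \(\|A\| \leq q<1\) then \(I-A\) is invertible and \((I-A)^{-1}=\sum_{j \geq 0} A^{j}\), so a partial sum \(I+A+\cdots+A^{m}\) approximates the inverse with error at most \(q^{m+1} /(1-q)\) in the same norm. The sketch below only illustrates this truncation on a small matrix; the matrix `A`, the truncation index `m` and the helper names are assumptions made here, and the code is not the paper's formula for \(\delta_{k}\).

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(A, B):
    """Entrywise sum of two matrices of equal shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def inf_norm(A):
    """Maximum absolute row sum (the matrix infinity norm)."""
    return max(sum(abs(a) for a in row) for row in A)


def neumann_partial_sum(A, m):
    """Return I + A + A**2 + ... + A**m."""
    n = len(A)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total, power = identity, identity
    for _ in range(m):
        power = mat_mul(power, A)
        total = mat_add(total, power)
    return total


def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix, used here only as a reference."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Illustrative matrix standing in for G'(x) (an assumption); its norm is q = 0.4.
A = [[0.2, 0.1], [0.1, 0.3]]
q = inf_norm(A)
m = 20

approx = neumann_partial_sum(A, m)
exact = inverse_2x2([[1.0 - A[0][0], -A[0][1]], [-A[1][0], 1.0 - A[1][1]]])
error = inf_norm([[x - y for x, y in zip(ra, re)] for ra, re in zip(approx, exact)])
bound = q ** (m + 1) / (1.0 - q)

print("truncation error:", error)
print("a priori bound  :", bound)
```

The truncation index needed for a given accuracy grows as \(q\) approaches 1, which is one reason the cost of forming the powers matters in practice.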
Also, the local convergence of these iterations remains to be studied.
*This work has been supported by the Romanian Academy of Sciences under Grant GAR 64/2001.
\(^{\dagger}\)"T. Popoviciu" Institute of Numerical Analysis, P.O. Box 68-1, 3400 Cluj-Napoca, Romania (ecatinas@ictp.acad.ro).