Abstract
The Ostrowski theorem is a classical result which ensures the attraction of all the successive approximations xk+1 = G(xk) near a fixed point x*. Different conditions [ultimately on the magnitude of G′(x*)] provide lower bounds for the convergence order of the process as a whole.
In this paper, we consider only one such sequence and we characterize its high q-convergence orders in terms of some spectral elements of G′(x*); we obtain that the set of trajectories with high q-convergence orders is restricted to some affine subspaces, regardless of the nonlinearity of G.
We analyze also the stability of the successive approximations under perturbation assumptions.
Authors
Emil Cătinaş
(Tiberiu Popoviciu Institute of Numerical Analysis, Romanian Academy)
Keywords
fixed point; successive approximations; nonlinear system of equations in Rn; inexact Newton method; perturbed Newton method; linear systems of equation in Rn; residual; local convergence; q-convergence orders.
Cite this paper as:
E. Cătinaş, On the superlinear convergence of the successive approximations method, J. Optim. Theory Appl., 113 (2002) no. 3, pp. 473-485
doi: 10.1023/A:1015304720071
About this paper
Publisher Name
Kluwer Academic Publishers-Plenum Publishers
Print ISSN
0022-3239
Online ISSN
1573-2878
MR
0022-3239
Online ISSN
1573-2878
Google Scholar citations
[1] Ortega, J. M., and Rheinboldt, W. C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, NY, 1970. 484 JOTA: VOL. 113, NO. 3, JUNE 2002
[2] Potra, F. A., On Q-Order and R-Order of Convergence, Journal of Optimization Theory and Applications, Vol. 63, pp. 415–431, 1989.
[3] Rheinboldt, W. C., Methods for Solûing Systems of Nonlinear Equations, SIAM, Philadelphia, Pennsylvania, 1998.
[4] Potra, F. A., Q-Superlinear Convergence of the Iterates in Primal–Dual InteriorPoint Methods, Mathematical Programming (to appear).
[5] Ostrowski, A. M., Solution of Equations and Systems of Equations, Academic Press, New York, NY, 1966.
[6] Householder, A. S., The Theory of Matrices in Numerical Analysis, Dover, New York, NY, 1974.
[7] Argyros, I., and Szidarovszky, F., The Theory and Applications of Iteration Methods, CRC Press, Boca Raton, Florida, 1993.
[8] Argyros, I., On the Convergence of the Modified Contractions, Journal of Computational and Applied Mathematics, Vol. 55, pp. 183–189, 1994.
[9] Brown, P. N., A Local Convergence Theory for Combined Inexact-Newton Finite-Difference Projection Methods, SIAM Journal on Numerical Analysis, Vol. 24, pp. 407–434, 1987.
[10] Brown, P. N., and Saad, Y., Convergence Theory of Nonlinear Newton–Krylov Algorithms, SIAM Journal on Optimization, Vol. 4, pp. 297–330, 1994.
[11] Catinas, E., Newton and Newton–Krylov Methods for Solving Nonlinear Systems in n , PhD Thesis, Babes–Bolyai University of Cluj–Napoca, Cluj– Napoca, Romania, 1999.
[12] Catinas, E., On the High Convergence Orders of the Newton–GMBACK Methods, Revue d’Analyse Numerique et de Theorie de l’Approximation, Vol. 28, pp. 125–132, 1999.
[13] Catinas, E., A Note on the Quadratic Convergence of the Inexact Newton Methods, Revue d’Analyse Numerique et de Theorie de l’Approximation, Vol. 29, pp. 129–134, 2000.
[14] Catinas, E., Inexact Perturbed Newton Methods and Applications to a Class of Krylov Solvers, Journal of Optimization Theory and Applications, Vol. 108, pp. 543–570, 2001.
[15] Catinas, E., The Inexact, Inexact Perturbed, and Quasi-Newton Methods are Equivalent Models, Mathematics of Computation (to appear).
[16] Dembo, R. S., Eisenstat, S. C., and Steihaug, T., Inexact Newton Methods, SIAM Journal on Numerical Analysis, Vol. 19, pp. 400–408, 1982.
[17] DENNIS, J. E., JR., and MORE´ , J. J., A Characterization of Superlinear Convergence and Its Application to Quasi-Newton Methods, Mathematics of Computation, Vol. 28, pp. 549–560, 1974.
[18] Dennis, J. E., JR., and More , J. J., Quasi-Newton Methods, Motivation and Theory, SIAM Review, Vol. 19, pp. 46–89, 1977.
[19] Dennis, J. E., JR., and Schnabel, R. B., Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Series in Computational Mathematics, Prentice-Hall, Englewood Cliffs, New Jersey, 1983.
[20] Deuflhard, P., and Potra, F. A., Asymptotic Mesh Independence of Newton– Galerkin Methods via a Refined Mysovskii Theorem, SIAM Journal on Numerical Analysis, Vol. 29, pp. 1395–1412, 1992.
Paper (preprint) in HTML form
On the Superlinear Convergence of the Successive Approximations Method 1
Abstract
The Ostrowski theorem is a classical result which ensures the attraction of all the successive approximations near a fixed point . Different conditions [ultimately on the magnitude of ] provide lower bounds for the convergence order of the process as a whole. In this paper, we consider only one such sequence and we characterize its high convergence orders in terms of some spectral elements of ; we obtain that the set of trajectories with high convergence orders is restricted to some affine subspaces, regardless of the nonlinearity of . We analyze also the stability of the successive approximations under perturbation assumptions.
Key Words. Successive approximations, convergence orders, inexact Newton iterates.
1. Introduction
Consider a subset and a mapping : which has a fixed point ,
We are interested in the convergence to of the successive approximations , given for some by
| (1) |
First, we recall briefly the definitions of the convergence orders. The symbol stands for a given norm in and for its induced operator norm.
2 Research Fellow, T. Popoviciu Institute of Numerical Analysis, Romanian Academy of Sciences, Cluj-Napoca, Romania.
Definition 1.1. See Ref. 1, Chapter 9. Let be an arbitrary sequence converging to some . For each , the quotient factor and the root convergence factor are defined by
The -convergence and -convergence orders are defined by
Remark 1.1. When , it is said that the sequence converges -superlinearly; this may be written as
When for some , one may write
We recall also that -convergence with a certain order implies -convergence with at least the same order, the converse being false; for related results, we refer the reader to Ref. 1, Chapter 9 and Ref. 2; see also Ref. 3, Chapter 3 and Ref. 4.
When considering a whole iterative process, its convergence order measures the worst convergence among the sequences with the same limit. In this paper, we shall deal with the conditions ensuring that is an attraction point; i.e., there exists an open ball with center at such that all the sequences given by (1), with the initial approximation from that ball, converge to . The set of all such sequences will be denoted by . The
-factor and -factor of the iterative process are then defined as
the convergence orders being defined in the same fashion as for a single sequence.
The following attraction theorem is well known; see also Ref. 3, Theorem 3.5.
Theorem 1.1. See Ostrowski (Ref. 5, Theorem 22.1) and see Ref. 1, Theorems 10.1.3 and 10.1.4. Assume that the mapping is differentiable at the fixed point . If the spectral radius of satisfies
then is an attraction point for the successive approximations. Moreover,
and if , then
The condition is sharp. See the following example.
Example 1.1. See Ref. 1, Exercise 10.1-2. For : ,
is an attraction point, while for
the same fixed point is no longer an attraction point; in both cases, .
It is worth noting that the -superlinear convergence of does not generally imply the -superlinear convergence; see also Ref. 3, p. 30.
Example 1.2. See Ref. 1, Exercise 10.1-6. For : ,
with , one obtains , but in any norm.
A sufficient condition for is that is an M-matrix (see Ref. 3, p. 30); i.e., there exists a norm in such that
[equivalently, for any eigenvalue of with , all Jordan blocks containing are one-dimensional; see e.g. Ref. 6, p. 46]. As a limiting situation, we are led to the following result, which was proved in a direct manner in Ref. 1 (see also Ref. 3, p. 30).
Theorem 1.2. See Ref. 1, Theorem 10.1.6. Under the assumptions of the Ostrowski theorem, if , then ; i.e. has -superlinear and -superlinear convergence.
The convergence orders in the above theorem are actually higher if is smoother; see also Ref. 3, Theorem 3.6.
Theorem 1.3. See Ref. 1, Theorem 10.1.7. Assume that the mapping is continuously differentiable on an open neighborhood of the fixed point . If and if is twice differentiable at , then
i.e the process has -convergence and -convergence orders of at least two. If additionally,
then the convergence orders are exactly equal to two,
Ortega and Rheinboldt have noticed that the conditions in the previous two results are not necessary.
Example 1.3. See Ref. 1, Exercise 10.1-12. For : ,
arbitrarily high -convergence orders may be attained at , even if .
The above results on sufficiency are global, in the sense that they provide lower bounds for the convergence orders of all the sequences from . However, it is possible for some sequences to exhibit higher convergence orders than the lowest bound ensured for some by or .
Example 1.4. Consider some and ,
with and . Then,
(a) for converges only linearly;
(b) for converges with -order .
The aim of this paper is to characterize the high convergence orders of the sequence (1). This will be done in Section 2, while in Section 3 we shall analyze the stability of these iterations under some perturbation assumptions.
2. Convergence Orders of the Successive Approximations
Given a subset and a nonlinear mapping , the Newton method for approximating a solution of the nonlinear system is given by
Several results have revealed the local convergence properties of this method and of other Newton-type iterations; see e.g. Refs. 1, 3, 5, and 736. We shall recall the results of Dembo, Eisenstat, and Steihaug on inexact Newton methods (Ref. 16), which will allow us to analyze the local behavior of the successive approximations.
Consider the following (standard) assumptions on :
(a) there exists such that ;
(b) the mapping is differentiable on an open neighborhood of , with continuous at ;
(c) the Jacobian is nonsingular.
The derivative is said to be Hölder continuous at with exponent , if there exist such that
Given an initial approximation , the inexact Newton (IN) method for approximating the solution is given by the following iterations:
For , until convergence do the following steps:
Step 1. Find such that .
Step 2. Set .
The residuals are the amounts by which the approximate solutions fail to satisfy the exact linear systems.
The following result was obtained in Ref. 16.
Theorem 2.1. See Ref. 16. Assume that satisfies the standard assumptions and that, for an initial approximation , the IN iterates converge to . Then, the convergence is -superlinear iff
Additionally, if is Hölder continuous at with exponent , then the convergence is with -order iff
while it has -order iff
We obtain the following result concerning the successive approximations.
Theorem 2.2. Assume that the mapping is differentiable on an open neighborhood of the fixed point , with continuous at and . Let be an initial approximation such that the sequence of successive approximations converges to . Then converges -superlinearly iff
| (2) |
Additionally, suppose that is Hölder continuous at with exponent , . Then converges with -order iff
| (3) |
while the convergence is with -order iff
Proof. The successive approximations may be regarded as IN iterates for solving
namely,
The standard assumptions on are obviously satisfied, the invertibility of being ensured by hypothesis . Next, the Hölder continuity of at implies the same property for ,
Denoting
the conclusions are now straightforward from the previous theorem.
Remark 2.1.
(a) The superlinear convergence of when , assured by Theorem 1.2, is retrieved under the hypotheses of this result:
(b) The conclusions of Theorem 1.3 may be obtained in the same fashion,
since the standard hypotheses on ensure the existence of such that (see Ref. 16)
(c) The same conclusions could be obtained in the above theorem by writing the successive approximations as either quasi-Newton iterates or inexact perturbed Newton iterates; see Refs. 11, 14, 15.
(d) Instead of , one may assume a more general condition, namely that is invertible [which holds iff has no eigenvalue equal to one] and that converges to .
(e) The Ostrowski theorem holds also in Banach spaces (see e.g. Ref. 37, Theorem 4C) as well as the corresponding characterizations for the IN iterations, so our result may be restated in this more general frame.
Theorem 2.2 characterizes the convergence orders of the successive approximations in terms of the iterates alone. In the following two results, we shall show that the convergence orders are intimately related to some spectral elements of .
Theorem 2.3. Under the assumptions of Theorem 2.2, the sequence converges -superlinearly if and only if has a zero eigenvalue and, starting from a certain step, the corrections are the corresponding eigenvectors,
| (4) |
Provided that is Hölder continuous at with exponent , the above condition characterizes in fact the -convergence orders .
Proof. The residuals of the IN iterates may be written as the sums of two terms,
so the sufficiency of condition (4) is obvious. For necessity, we notice that the residuals and their first terms converge to zero with rate at least as , which requires (4).
Remark 2.2.
(a) The above result implies that no -superlinear convergence to of any sequence of successive approximations may occur when all the eigenvalues of are nonzero.
(b) As Example 1.3 shows, may attain -superlinear convergence even when not all the nonzero vectors in are eigenvectors of the eigenvalue 0 , i.e., when has only the zero eigenvalue but is defective. However, in such a case, the set of the possible trajectories is restricted.
(c) We notice that, when , the eventual sequences with superlinear convergence are highly sensitive to perturbations, which is not good news for the floating-point arithmetic context. We shall analyze the convergence of the perturbed sequences in Section 3.
The other characterization may be stated in terms of errors.
Theorem 2.4. Under the assumptions of Theorem 2.2, converges -superlinearly if and only if has a zero eigenvalue and,
starting from a certain step, the errors are corresponding eigenvectors,
| (5) |
Provided that is Hölder continuous at with exponent , the above condition characterizes in fact the -convergence orders .
Proof. The sufficiency is again obvious. For necessity, using (4), we get that
and therefore we get the conclusions, since these constant terms have the limit zero.
It is interesting to note that when converges -superlinearly to and the zero eigenvalue of is simple, then its trajectory is restricted from a certain step to a line containing , regardless of the nonlinearity of ; when zero is a double eigenvalue, the trajectory is restricted from a certain step to a plane containing , etc. Theoretically, the trajectory may be arbitrary only when .
Consider now the affine mapping
and for some initial approximation the iterations
| (6) |
The condition in the Ostrowski theorem becomes necessary and sufficient for these iterates to converge for any initial approximation to the unique fixed point in ; see e.g. Ref. 1, Theorem 10.1.5. Our results can be refined in this case.
Theorem 2.5. If , then the sequence given by (6) converges to in less than steps, for any initial approximation . If , then converges -superlinearly if and only if there exists such that
| (7) |
in which case .
Proof. It is known that a matrix has a spectral radius zero iff it is nilpotent; i.e., there exists such that , in which case may be
taken smaller than (see Ref. 38, Problem 159). The relation
| (8) |
completes the proof of the first affirmation.
The relation (7) is obtained immediately from (4) and (8).
When the initial approximation is taken to be zero, we obtain the following result.
Corollary 2.1. If and , then the sequence given by (6) converges in a finite number of steps if and only if or there exists such that
It is worth noting that, in the affine case, the -superlinear convergence reduces to convergence in a finite number of steps, i.e., to convergence with infinite order.
3. Stability of the Successive Approximations
Assume that the evaluation of at each step is performed only approximately,
| (9) |
These iterates may be viewed again as IN iterations for solving
since
We obtain the following result.
Theorem 3.1. Assume that satisfies the assumptions of Theorem 2.2, and that the sequence (9) of perturbed successive approximations converges to . Then the convergence is -superlinear iff
Additionally, if is Hölder continuous at with exponent , the convergence is with -order iff
and with -order iff
with -order , as .
Remark 3.1. We notice that sufficient conditions for the high convergence orders of the perturbed successive approximations are obtained when and are eigenvectors corresponding to the eigenvalue 0 of and converges to zero with a certain speed. In such a case, the trajectory remains in the set of the high convergence trajectories corresponding to the unperturbed iterations.
4. Conclusions
The condition in the Ostrowski theorem ensures that the fixed point is an attraction point and yields the lowest -convergence order attained by all the sequences of successive approximations. Our results characterize the high -convergence orders of a single such sequence [which may be attained even when ], indicating the set of all possible trajectories with convergence orders up to two (in this sense, the study of the convergence orders higher than two may be a direction of future research). The Ostrowski theorem requires the knowledge of the spectral radius of , while ours require the spectral structure of .
The results obtained may be applied to the study of the iterative methods used in practice and which may be written as fixed-point problems with known mappings and ; they may be applied also to the study of the fixed-point problems in the more abstract setting of Banach spaces (differential and integral equations, dynamical systems, etc). Further, we shall study the existence and estimates for the radius of the attraction balls for the successive approximations with high convergence orders.
References
-
1.
Ortega, J. M., and Rheinboldt, W. C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, NY, 1970.
-
2.
Potra, F. A., On Q-Order and R-Order of Convergence, Journal of Optimization Theory and Applications, Vol. 63, pp. 415-431, 1989.
-
3.
Rheinboldt, W. C., Methods for Solving Systems of Nonlinear Equations, SIAM, Philadelphia, Pennsylvania, 1998.
-
4.
Potra, F. A., Q-Superlinear Convergence of the Iterates in Primal-Dual InteriorPoint Methods, Mathematical Programming (to appear).
-
5.
Ostrowski, A. M., Solution of Equations and Systems of Equations, Academic Press, New York, NY, 1966.
-
6.
Householder, A. S., The Theory of Matrices in Numerical Analysis, Dover, New York, NY, 1974.
-
7.
Argyros, I., and Szidarovszky, F., The Theory and Applications of Iteration Methods, CRC Press, Boca Raton, Florida, 1993.
-
8.
Argyros, I., On the Convergence of the Modified Contractions, Journal of Computational and Applied Mathematics, Vol. 55, pp. 183-189, 1994.
-
9.
Brown, P. N., A Local Convergence Theory for Combined Inexact-Newton/ Finite-Difference Projection Methods, SIAM Journal on Numerical Analysis, Vol. 24, pp. 407-434, 1987.
-
10.
Brown, P. N., and Saad, Y., Convergence Theory of Nonlinear Newton-Krylov Algorithms, SIAM Journal on Optimization, Vol. 4, pp. 297-330, 1994.
-
11.
Cătinaș, E., Newton and Newton-Krylov Methods for Solving Nonlinear Systems in Thesis, Babes-Bolyai University of Cluj-Napoca, ClujNapoca, Romania, 1999.
-
12.
Cătinas, E., On the High Convergence Orders of the Newton-GMBACK Methods, Revue d’Analyse Numérique et de Théorie de l’Approximation, Vol. 28, pp. 125-132, 1999.
-
13.
Cătinaş, E., A Note on the Quadratic Convergence of the Inexact Newton Methods, Revue d’Analyse Numérique et de Théorie de l’Approximation, Vol. 29, pp. 129-134, 2000.
-
14.
Cătinaș, E., Inexact Perturbed Newton Methods and Applications to a Class of Krylov Solvers, Journal of Optimization Theory and Applications, Vol. 108, pp. 543-570, 2001.
-
15.
Cătinaş, E., The Inexact, Inexact Perturbed, and Quasi-Newton Methods are Equivalent Models, Mathematics of Computation (to appear).
-
16.
Dembo, R. S., Eisenstat, S. C., and Steihaug, T., Inexact Newton Methods, SIAM Journal on Numerical Analysis, Vol. 19, pp. 400-408, 1982.
-
17.
Dennis, J. E., Jr., and Moré, J. J., A Characterization of Superlinear Convergence and Its Application to Quasi-Newton Methods, Mathematics of Computation, Vol. 28, pp. 549-560, 1974.
-
18.
Dennis, J. E., Jr., and Moré, J. J., Quasi-Newton Methods, Motivation and Theory, SIAM Review, Vol. 19, pp. 46-89, 1977.
-
19.
Dennis, J. E., Jr., and Schnabel, R. B., Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Series in Computational Mathematics, Prentice-Hall, Englewood Cliffs, New Jersey, 1983.
-
20.
Deuflhard, P., and Potra, F. A., Asymptotic Mesh Independence of NewtonGalerkin Methods via a Refined Mysovskii Theorem, SIAM Journal on Numerical Analysis, Vol. 29, pp. 1395-1412, 1992.
-
21.
Eisenstat, S. C., and Walker, H. F., Globally Convergent Inexact Newton Methods, SIAM Journal on Optimization, Vol. 4, pp. 393-422, 1994.
-
22.
Eisenstat, S. C., and Walker, H. F., Choosing the Forcing Terms in an Inexact Newton Method, SIAM Journal on Scientific Computing, Vol. 17, pp. 16-32, 1996.
-
23.
Istrătescu, V. I., Introduction to the Fixed-Point Theory, Editura Academiei RSR, Bucharest, Romania, 1973 (in Romanian).
-
24.
Kelley, C. T., Iterative Methods for Linear and Nonlinear Equations, SIAM, Philadelphia, Pennsylvania, 1995.
-
25.
Kelley, C. T., and Sachs, E. W., Truncated Newton Methods for Optimization with Inaccurate Functions and Gradients, Report CRSC-TR-99-20, Center for Research in Scientific Computation, North Carolina State University, 1999.
-
26.
Kerkhoven, T., and Saad, Y., On Acceleration Methods for Coupled Nonlinear Elliptic Systems, Numerische Mathematik, Vol. 60, pp. 525-548, 1992.
-
27.
Mărușter, Șt., Quasi-Nonexpansivity and Two Classical Methods for Solving Nonlinear Equations, Proceedings of the American Mathematical Society, Vol. 62, pp. 119-123, 1977.
-
28.
Mărușter, Șt.,Numerical Methods for Solving Nonlinear Equations, Editura Tehnică, Bucharest, Romania, 1981 (in Romanian).
-
29.
Păvăloiu, I., Introduction to the Theory of Approximating the Solutions of Equations, Editura Dacia, Cluj-Napoca, Romania, 1976 (in Romanian).
-
30.
Pãvăloiu, I., The Solving of Nonlinear Equations by Interpolation, Editura Dacia, Cluj-Napoca, Romania, 1981 (in Romanian).
-
31.
Potra, F. A., On the Numerical Stability of a Class of Iterative Processes, Itinerant Seminar of Functional Equations, Approximation and Convexity, Timişoara, Romania, pp. 51-54, 1980 (in Romanian).
-
32.
Potra, F. A., On a Class of Iterative Procedures for Solving Nonlinear Equations in Banach Spaces, Computational Mathematics, Banach Center Publications, PWN-Polish Scientific Publishers, Warsaw, Poland, Vol. 13, pp. 607-621, 1984.
-
33.
Potra, F. A., and Pták, V., Sharp Error Bounds for Newton’s Process, Numerische Mathematik, Vol. 34, pp. 63-72, 1980.
-
34.
Potra, F. A., and Pták, V., Nondiscrete Induction and Iterative Processes, Pitman, London, England, 1984.
-
35.
Walker, H. F., An Approach to Continuation Using Krylov Subspace Methods, Computational Science in the 21st Century, Edited by M. O. Bristeau, G. Etgen, W. Fitzgibbon, J. L. Lions, J. Periaux, and M. F. Wheeler, John Wiley and Sons, New York, NY, pp. 72-82, 1997.
-
36.
Wilkinson, J. H., The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, England, 1965.
-
37.
Zeidler, E., Nonlinear Functional Analysis and Its Applications, I: Fixed-Point Theorems, Springer Verlag, New York, NY, 1986.
-
38.
Glazman, I. and Liubitch, Y., Analyse Linéaire dans les Espaces de Dimensions Finies, Manuel en Problèmes, Mir, Moscow, Russia, 1972.
