Abstract

Inexact Newton methods are variants of the Newton method in which, at each step, the linear system is solved only approximately [1]. The local convergence theory given by the authors of [1], and most of the results based on it, consider the error terms as arising solely from the fact that the linear systems are not solved exactly. The few existing results for the general case (when perturbed linear systems are considered, which in turn are not solved exactly) do not offer explicit formulas in terms of the perturbations and residuals.

We extend this local convergence theory to the general case, characterizing the rate of convergence (the q-convergence orders) in terms of the perturbations and residuals.

The Newton iterations are then analyzed when, at each step, an approximate solution of the linear system is determined by one of the following Krylov solvers based on backward error minimization properties: GMRES, GMBACK, MINPERT. We obtain results concerning the following topics: monotone properties of the errors in these Newton-Krylov iterates when the initial guess is taken to be 0 in the Krylov algorithms; control of the convergence orders of the Newton-Krylov iterations by the magnitude of the backward errors of the approximate steps; similarities of the asymptotic behavior of GMRES and MINPERT when used in a converging Newton method.

At the end of the paper, the theoretical results are verified on some numerical examples.

Authors

Emil Cătinaş
(Tiberiu Popoviciu Institute of Numerical Analysis, Romanian Academy)

Keywords

nonlinear system of equations in Rn; inexact Newton method; perturbed Newton method; Krylov methods; linear systems of equations in Rn; backward errors; GMRES; GMBACK; MINPERT; Newton-Krylov methods; residual; local convergence; q-convergence order.

Cite this paper as:

E. Cătinaş, Inexact perturbed Newton methods and applications to a class of Krylov solvers, J. Optim. Theory Appl., 108 (2001) no. 3, pp. 543-570.


About this paper

Publisher Name

Kluwer Academic Publishers-Plenum Publishers

Print ISSN

0022-3239

Online ISSN

1573-2878


References

[1] Dembo, R. S., Eisenstat, S. C., and Steihaug, T., Inexact Newton Methods, SIAM Journal on Numerical Analysis, Vol. 19, pp. 400–408, 1982.

[2]  Ortega, J. M., and Rheinboldt, W. C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, NY, 1970.

[3] Kantorovich, L. V., and Akilov, G. P., Functional Analysis, Editura Științifică și Enciclopedică, Bucharest, Romania, 1986 (in Romanian).

[4] Păvăloiu, I., Introduction to the Theory of Approximating the Solutions of Equations, Editura Dacia, Cluj–Napoca, Romania, 1976 (in Romanian).

[5] Mărușter, Șt., Numerical Methods for Solving Nonlinear Equations, Editura Tehnică, Bucharest, Romania, 1981 (in Romanian).

[6]  Dennis, J. E., JR., and Schnabel, R. B., Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Series in Computational Mathematics, Prentice-Hall, Englewood Cliffs, New Jersey, 1983.

[7] Potra, F. A., and Pták, V., Nondiscrete Induction and Iterative Processes, Pitman, London, England, 1984.

[8] Argyros, I., and Szidarovszky, F., The Theory and Applications of Iteration Methods, CRC Press, Boca Raton, Florida, 1993.

[9] Kelley, C. T., Iterative Methods for Linear and Nonlinear Equations, SIAM, Philadelphia, Pennsylvania, 1995.

[10] Rheinboldt, W. C., Methods for Solving Systems of Nonlinear Equations, SIAM, Philadelphia, Pennsylvania, 1998.

[11] Dennis, J. E., JR., and Walker, H. F., Inaccuracy in Quasi-Newton Methods: Local Improvement Theorems, Mathematical Programming Study, Vol. 22, pp. 70–85, 1984.

[12] Potra, F. A., and Pták, V., Sharp Error Bounds for Newton's Process, Numerische Mathematik, Vol. 34, pp. 63–72, 1980.

[13] Deuflhard, P., and Potra, F. A., Asymptotic Mesh Independence of Newton–Galerkin Methods via a Refined Mysovskii Theorem, SIAM Journal on Numerical Analysis, Vol. 29, pp. 1395–1412, 1992.

[14] Argyros, I., Concerning the Convergence of Inexact Newton Methods, Journal of Computational and Applied Mathematics, Vol. 79, pp. 235–247, 1997.

[15] Argyros, I., On a New Newton–Mysovskii-Type Theorem with Applications to Inexact Newton-Like Methods and Their Discretization, IMA Journal of Numerical Analysis, Vol. 18, pp. 37–56, 1998.

[16] Sherman, A. H., On Newton-Iterative Methods for the Solution of Systems of Nonlinear Equations, SIAM Journal on Numerical Analysis, Vol. 15, pp. 755–771, 1978.

[17] Dennis, J. E., JR., and Moré, J. J., A Characterization of Superlinear Convergence and Its Application to Quasi-Newton Methods, Mathematics of Computation, Vol. 28, pp. 549–560, 1974.

[18] Dennis, J. E., JR., and Moré, J. J., Quasi-Newton Methods, Motivation and Theory, SIAM Review, Vol. 19, pp. 46–89, 1977.

[19] Eisenstat, S. C., and Walker, H. F., Choosing the Forcing Terms in an Inexact Newton Method, SIAM Journal on Scientific Computing, Vol. 17, pp. 16–32, 1996.

[20] Eisenstat, S. C., and Walker, H. F., Globally Convergent Inexact Newton Methods, SIAM Journal on Optimization, Vol. 4, pp. 393–422, 1994.

[21] Ypma, T. J., The Effect of Rounding Errors on Newton-Like Methods, IMA Journal of Numerical Analysis, Vol. 3, pp. 109–118, 1983.

[22] Martinez, H. J., Parada, Z., and Tapia, R. A., On the Characterization of Q-Superlinear Convergence of Quasi-Newton Interior-Point Methods for Nonlinear Programming, Boletin de la Sociedad Matematica Mexicana, Vol. 1, pp. 137–148, 1995.

[23] Ypma, T. J., Local Convergence of Inexact Newton Methods, SIAM Journal on Numerical Analysis, Vol. 21, pp. 583–590, 1983.

[24] Cores, D., and Tapia, R. A., Perturbation Lemma for the Newton Method with Application to the SQP Newton Method, Journal of Optimization Theory and Applications, Vol. 97, pp. 271–280, 1998.

[25] Potra, F. A., On Q-Order and R-Order of Convergence, Journal of Optimization Theory and Applications, Vol. 63, pp. 415–431, 1989.

[26] Brown, P. N., A Local Convergence Theory for Combined Inexact-Newton Finite-Difference Projection Methods, SIAM Journal on Numerical Analysis, Vol. 24, pp. 407–434, 1987.

[27] Cătinaş, E., A Note on Inexact Secant Methods, Revue d'Analyse Numérique et de Théorie de l'Approximation, Vol. 25, pp. 33–41, 1996.

[28] Krolikowska, M., Nonlinear Mappings with an Almost Sparse Jacobian Matrix, Annales Universitatis Mariae Curie-Skłodowska, Lublin, Poland, Vol. 49A, pp. 145–157, 1995.

[29] Rigal, J. L., and Gaches, J., On the Compatibility of a Given Solution with the Data of a Linear System, Journal of the Association for Computing Machinery, Vol. 14, pp. 543–548, 1967.

[30] Saad, Y., and Schultz, M. H., GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems, SIAM Journal on Scientific and Statistical Computing, Vol. 7, pp. 856–869, 1986.

[31] Saad, Y., Krylov Subspace Methods for Solving Large Unsymmetric Linear Systems, Mathematics of Computation, Vol. 37, pp. 105–126, 1981.

[32] Kasenally, E. M., GMBACK: A Generalized Minimum Backward Error Algorithm for Nonsymmetric Linear Systems, SIAM Journal on Scientific Computing, Vol. 16, pp. 698–719, 1995.

[33] Saad, Y., Iterative Methods for Sparse Linear Systems, PWS Publishing Company, Boston, Massachusetts, 1996.

[34] Brown, P. N., and Saad, Y., Hybrid Krylov Methods for Nonlinear Systems of Equations, SIAM Journal on Scientific and Statistical Computing, Vol. 11, pp. 450–481, 1990.

[35] Turner, K., and Walker, H. F., Efficient High Accuracy Solutions with GMRES(m), SIAM Journal on Scientific and Statistical Computing, Vol. 13, pp. 815–825, 1992.

[36] Brown, P. N., and Hindmarsh, A. C., Reduced Storage Matrix Methods in Stiff ODE Systems, Applied Mathematics and Computation, Vol. 31, pp. 40–91, 1989.

[37] Greenbaum, A., Pták, V., and Strakoš, Z., Any Nonincreasing Convergence Curve is Possible for GMRES, SIAM Journal on Matrix Analysis and Applications, Vol. 17, pp. 465–469, 1996.

[38] Brown, P. N., and Saad, Y., Convergence Theory of Nonlinear Newton–Krylov Algorithms, SIAM Journal on Optimization, Vol. 4, pp. 297–330, 1994.

[39] Brown, P. N., A Theoretical Comparison of the Arnoldi and GMRES Algorithms, SIAM Journal on Scientific and Statistical Computing, Vol. 12, pp. 58–78, 1991.

[40] Kasenally, E. M., and Simoncini, V., Analysis of a Minimum Perturbation Algorithm for Nonsymmetric Linear Systems, SIAM Journal on Numerical Analysis, Vol. 34, pp. 48–66, 1997.

[41] Stewart, G. W., and Sun, J. G., Matrix Perturbation Theory, Academic Press, New York, NY, 1990.

[42] Moré, J. J., A Collection of Nonlinear Model Problems, Computational Solutions of Nonlinear Systems of Equations, Lectures in Applied Mathematics, Edited by E. L. Allgower and K. Georg, American Mathematical Society, Providence, Rhode Island, Vol. 26, pp. 723–762, 1990.

[43] Glowinski, R., Keller, H. B., and Reinhart, L., Continuation-Conjugate Gradient Methods for the Least-Squares Solution of Nonlinear Boundary-Value Problems, SIAM Journal on Scientific and Statistical Computing, Vol. 6, pp. 793–832, 1985.

[44] Cătinaş, E., Newton and Newton–Krylov Methods for Solving Nonlinear Systems in ℝ^n, PhD Thesis, Babeș–Bolyai University of Cluj–Napoca, Cluj–Napoca, Romania, 1999.

Paper (preprint) in HTML form

Inexact Perturbed Newton Methods and Applications to a Class of Krylov Solvers¹

E. Cătinaş²
Communicated by F. A. Potra
Abstract

Inexact Newton methods are variants of the Newton method in which, at each step, the linear system is solved only approximately (Ref. 1). The local convergence theory given by the authors of Ref. 1, and most of the results based on it, consider the error terms as arising solely from the fact that the linear systems are not solved exactly. The few existing results for the general case (when perturbed linear systems are considered, which in turn are not solved exactly) do not offer explicit formulas in terms of the perturbations and residuals. We extend this local convergence theory to the general case, characterizing the rate of convergence in terms of the perturbations and residuals.
The Newton iterations are then analyzed when, at each step, an approximate solution of the linear system is determined by one of the following Krylov solvers based on backward error minimization properties: Gmres, Gmback, Minpert. We obtain results concerning the following topics: monotone properties of the errors in these Newton-Krylov iterates when the initial guess is taken to be 0 in the Krylov algorithms; control of the convergence orders of the Newton-Krylov iterations by the magnitude of the backward errors of the approximate steps; similarities of the asymptotic behavior of Gmres and Minpert when used in a converging Newton method. At the end of the paper, the theoretical results are verified on some numerical examples.

Key Words. Nonlinear systems, (inexact) Newton methods, (inexact) perturbed Newton methods, convergence orders, linear systems, backward errors, Krylov methods (Gmres, Gmback, Minpert), Newton-Krylov methods.

¹ This research was supported by the Romanian Academy of Sciences under Grant GAR 95/1998. The support and guidance of Dr. I. Păvăloiu, Head of the Numerical Analysis Institute, proved very important to the progress of the research. The author is also grateful to two anonymous referees for their careful reading of previous versions of this paper and helpful criticism and suggestions.
² Research Fellow, T. Popoviciu Institute of Numerical Analysis, Romanian Academy of Sciences, Cluj-Napoca, Romania.

1. Introduction

Consider the system of nonlinear equations

F(x)=0 \qquad (1)

where F:\mathbb{R}^{N}\rightarrow\mathbb{R}^{N} is a nonlinear mapping and suppose that:
(C1) There exists x^{*}\in\mathbb{R}^{N} such that F\left(x^{*}\right)=0.
A classical approach for approximating x^{*} is the Newton method, which results in the following algorithm:
(NM) Choose an initial approximation x_{0}\in\mathbb{R}^{N}.
For k=0,1,\ldots until convergence, do the following steps.
Step 1. Solve F^{\prime}\left(x_{k}\right)s_{k}=-F\left(x_{k}\right).
Step 2. Set x_{k+1}=x_{k}+s_{k}.
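As a concrete illustration, the following Python sketch implements the two (NM) steps above on a small test system; the mapping F, its Jacobian J, and the tolerances are our own illustrative choices, not taken from the paper.

```python
import numpy as np

def newton(F, J, x0, tol=1e-12, maxit=50):
    """Newton method (NM): solve F'(x_k) s_k = -F(x_k), then set x_{k+1} = x_k + s_k."""
    x = np.asarray(x0, dtype=float)
    for _ in range(maxit):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            break
        s = np.linalg.solve(J(x), -Fx)   # Step 1: exact linear solve
        x = x + s                        # Step 2: update the iterate
    return x

# illustrative system with solution x* = (1/sqrt(2), 1/sqrt(2))
F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
J = lambda x: np.array([[2.0*x[0], 2.0*x[1]], [1.0, -1.0]])
print(newton(F, J, np.array([1.0, 0.5])))
```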
At each iteration of the Newton method, a routine for solving the resulting linear system must be called. The direct methods for linear systems cannot always provide the exact solution when used in floating point arithmetic, and they may be inefficient for large general systems. Usually, some of the iterative methods cannot provide the exact solution after a finite number of steps even in exact arithmetic. Moreover, when x_{k} is far from x^{*}, it may be wasteful to solve the system exactly. These reasons require a convergence analysis which takes into account certain error terms.

Several Newton-type methods have been studied. Some sufficient conditions for different convergence orders have been given in Refs. 2-16 and references therein. A characterization of superlinear convergence and of convergence with orders 1+p, p\in(0,1], has been obtained by Dennis and Moré in Refs. 17-18 for sequences given by the following quasi-Newton method:

x_{k+1}=x_{k}-B_{k}^{-1}F\left(x_{k}\right),\quad k=0,1,\ldots,\quad x_{0}\in\mathbb{R}^{N},

\left(B_{k}\right)_{k\geq 0}\subset\mathbb{R}^{N\times N} being a sequence of invertible matrices.
A characterization of local superlinear convergence and local convergence with orders 1+p, p\in(0,1], of the Newton methods which take into account other error terms has been given by Dembo, Eisenstat, and Steihaug (Ref. 1). They considered in their paper the following inexact Newton method:
(INM) Choose an initial approximation x_{0}\in\mathbb{R}^{N}. For k=0,1,\ldots, until convergence, do the following steps.

Step 1. Find s_{k} such that F^{\prime}\left(x_{k}\right)s_{k}=-F\left(x_{k}\right)+r_{k}.
Step 2. Set x_{k+1}=x_{k}+s_{k}.

The error terms (residuals) r_{k} represent the amounts by which the solutions s_{k}, determined in an unspecified manner, fail to satisfy the exact systems (NM). Their magnitudes are determined by the relative residuals \left\|r_{k}\right\|/\left\|F\left(x_{k}\right)\right\|, supposed to be bounded by the forcing sequence \left(\eta_{k}\right)_{k\geq 0},

\left\|r_{k}\right\|/\left\|F\left(x_{k}\right)\right\|\leq\eta_{k},\quad k=0,1,\ldots \qquad (2)
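To make the role of the forcing sequence tangible, here is a minimal Python sketch of the (INM): the exact Newton step is deliberately polluted by a random residual r_k scaled so that condition (2) holds with equality. The test system and the constant forcing term eta are illustrative assumptions of ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def inexact_newton(F, J, x0, eta=0.5, tol=1e-10, maxit=40):
    """(INM) sketch: F'(x_k) s_k = -F(x_k) + r_k with ||r_k|| = eta * ||F(x_k)||."""
    x = np.asarray(x0, dtype=float)
    for k in range(maxit):
        Fx = F(x)
        nF = np.linalg.norm(Fx)
        print(k, nF)                          # ||F(x_k)|| decreases linearly
        if nF <= tol:
            break
        r = rng.standard_normal(x.size)
        r *= eta * nF / np.linalg.norm(r)     # enforce condition (2) with equality
        s = np.linalg.solve(J(x), -Fx + r)    # inexact step
        x = x + s
    return x

F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
J = lambda x: np.array([[2.0*x[0], 2.0*x[1]], [1.0, -1.0]])
inexact_newton(F, J, np.array([1.0, 0.5]))
```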

The local convergence analysis given in Ref. 1 characterizes the convergence orders of \left(x_{k}\right)_{k\geq 0} given by the (INM) in terms of the magnitudes of r_{k}; see also Refs. 19-20 for other convergence results. However, in this paper as well as in most other papers using these results, the error terms r_{k} are considered to appear only because the exact Newton systems (NM) are solved approximately. But in many situations, it is hard to find the exact value of F^{\prime}(x), and in some cases even of F(x). On the other hand, F^{\prime}(x) or its approximation and F(x) are both altered when represented in floating point arithmetic.

The question that arises naturally is: What magnitudes can we allow in perturbing the matrices F^{\prime}\left(x_{k}\right) and the vectors -F\left(x_{k}\right) so that the convergence order of the resulting method does not decrease?

The Newton methods with perturbed linear systems have been considered by several authors (see for instance Refs. 9, 11, 21, and references therein), but these methods were not analyzed with respect to their convergence orders.

Martinez, Parada, and Tapia (Ref. 22) have analyzed the superlinear convergence of sequences given by the following damped and perturbed quasi-Newton method:

x_{k+1}=x_{k}-\alpha_{k}B_{k}^{-1}\left[F\left(x_{k}\right)+r_{k}\right],

where 0<\alpha_{k}\leq 1, r_{k}\in\mathbb{R}^{N}, k=0,1,\ldots, and x_{0}\in\mathbb{R}^{N}; but their superlinear convergence was not characterized simultaneously in terms of \alpha_{k}, B_{k}, r_{k}.

In Ref. 23, Ypma studied the case when the matrices F^{\prime}\left(x_{k}\right) and the vectors -F\left(x_{k}\right) are calculated approximately, the resulting linear systems being solved inexactly. However, the residuals were considered to be incorporated in those perturbed linear systems. Consequently, the obtained convergence results do not offer explicit formulas in terms of the magnitude of the perturbations and residuals. Recently, Cores and Tapia (Ref. 24) have considered the exact solving of perturbed linear systems, but they have obtained only sufficient conditions on the perturbations for different convergence orders to be attained.

We consider here the perturbations \left(\Delta_{k}\right)_{k\geq 0}\subset\mathbb{R}^{N\times N} and \left(\delta_{k}\right)_{k\geq 0}\subset\mathbb{R}^{N}, being led to the study of the following inexact perturbed Newton method:

\text{(IPNM)}\quad \left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]s_{k}=\left[-F\left(x_{k}\right)+\delta_{k}\right]+\hat{r}_{k},\quad x_{k+1}=x_{k}+s_{k},\quad k=0,1,\ldots,\quad x_{0}\in\mathbb{R}^{N}.

The terms \hat{r}_{k} denote the residuals of the approximate solutions s_{k} of the linear systems

\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]s=-F\left(x_{k}\right)+\delta_{k}.

When these systems are assumed to be solved exactly [i.e., \hat{r}_{k}=0, k=0,1,\ldots, in the (IPNM)], we call the resulting method a perturbed Newton method,

\text{(PNM)}\quad \left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]s_{k}=-F\left(x_{k}\right)+\delta_{k}.

This frame will be useful for the study of some Newton-Krylov methods in connection with backward errors.
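The (IPNM) loop itself is easy to state in code. The following Python sketch performs one (IPNM) run in which the perturbations Delta_k, delta_k and the inner residuals rhat_k are random terms of prescribed norms; the norm schedules and the test system are our own illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def ipnm(F, J, x0, nDelta, ndelta, nrhat, tol=1e-12, maxit=30):
    """(IPNM) sketch: [F'(x_k)+Delta_k] s_k = [-F(x_k)+delta_k] + rhat_k.
    nDelta, ndelta, nrhat prescribe the perturbation norms as functions of ||F(x_k)||."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(maxit):
        Fx = F(x)
        nF = np.linalg.norm(Fx)
        if nF <= tol:
            break
        D = rng.standard_normal((n, n)); D *= nDelta(nF) / np.linalg.norm(D, 2)
        d = rng.standard_normal(n);      d *= ndelta(nF) / np.linalg.norm(d)
        rh = rng.standard_normal(n);     rh *= nrhat(nF) / np.linalg.norm(rh)
        x = x + np.linalg.solve(J(x) + D, -Fx + d + rh)   # assumes (C4) holds
    return x

F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
J = lambda x: np.array([[2.0*x[0], 2.0*x[1]], [1.0, -1.0]])
# small perturbations proportional to ||F(x_k)||, in the spirit of Corollary 2.1 below
print(ipnm(F, J, np.array([1.0, 0.5]),
           nDelta=lambda t: 0.1 * t, ndelta=lambda t: 0.1 * t, nrhat=lambda t: 0.1 * t))
```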

The paper is structured as follows. In Section 2, we give two convergence results for the (IPNM). In Section 3, we characterize the superlinear convergence and the convergence with orders 1+p, p\in(0,1], of the (IPNM), and we also provide some sufficient conditions. In Section 4, we analyze the Newton methods in which the linear systems from each step are solved by Gmres, Gmback, or Minpert. We verify the obtained theoretical results on numerical examples performed on two test problems (Section 5), and we end the paper with concluding remarks.

Throughout the paper, we consider the Euclidean norm (denoted by \|\cdot\|_{2}) and an arbitrary given norm on \mathbb{R}^{N} (denoted by \|\cdot\|), together with their induced operator norms. We use also the Frobenius norm of a matrix Z\in\mathbb{R}^{M\times N}, defined as

\|Z\|_{F}=\sqrt{\operatorname{tr}\left(ZZ^{t}\right)}.

We use the column notation for vectors; when we write vectors (or matrices) inside square brackets, we consider the matrix containing on columns those vectors (or the columns of those matrices). The same convention is used when joining rows. As usual, the vectors e_{i}, i=1,\ldots,n, form the standard basis in \mathbb{R}^{n}, n being clear from the context.

2. Convergence Results for Inexact Perturbed Newton Methods

The common conditions for the local convergence analysis of the Newton method are Condition (C1) and the following additional conditions:
(C2) The mapping F is differentiable on a neighborhood of x^{*} and F^{\prime} is continuous at x^{*}.
(C3) The Jacobian F^{\prime}\left(x^{*}\right) is nonsingular.
These conditions ensure that x^{*} is a point of attraction for the Newton method; i.e., there exists \epsilon>0 such that \left(x_{k}\right)_{k\geq 0} given by the Newton method converges to x^{*} for any initial approximation x_{0}\in\mathbb{R}^{N} with \left\|x_{0}-x^{*}\right\|<\epsilon.

Moreover, the convergence is q-superlinear³; see Ref. 2, Theorem 10.2.2 and Ref. 10, Theorem 4.4.

In the convergence analysis of the (IPNM) iterations, we assume that:
(C4) The perturbations \Delta_{k} are such that the matrices F^{\prime}\left(x_{k}\right)+\Delta_{k} are nonsingular for k=0,1,\ldots.

Though different in notation, in fact this condition is similar to the one considered for the quasi-Newton iterates, which requires a sequence of invertible matrices \left(B_{k}\right)_{k\geq 0}.

The following sufficient condition for the convergence of the (INM) was proved by Dembo, Eisenstat, and Steihaug (Ref. 1).

Theorem 2.1. See Ref. 1. Assume Conditions (C1)-(C3) and \eta_{k}\leq\eta_{\max}<t<1, k=0,1,\ldots. There exists \epsilon>0 such that, if \left\|x_{0}-x^{*}\right\|\leq\epsilon, then the sequence of the (INM) iterates \left(x_{k}\right)_{k\geq 0} satisfying (2) converges to x^{*}. Moreover, the convergence is linear, in the sense that

\left\|x_{k+1}-x^{*}\right\|_{*}\leq t\left\|x_{k}-x^{*}\right\|_{*},\quad k=0,1,\ldots,

where \|y\|_{*}=\left\|F^{\prime}\left(x^{*}\right)y\right\|.
Using this theorem, we obtain the following convergence results for the (IPNM).

Theorem 2.2. Assume Conditions (C1)-(C4) and \eta_{k}\leq\eta_{\max}<t<1, k=0,1,\ldots. There exists \epsilon>0 such that, if \left\|x_{0}-x^{*}\right\|\leq\epsilon and

\left\|\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}F\left(x_{k}\right)+\left\{I-\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}\right\}\left(\delta_{k}+\hat{r}_{k}\right)\right\|\leq\eta_{k}\left\|F\left(x_{k}\right)\right\|,

for k=0,1,\ldots, then the sequence of the (IPNM) iterates \left(x_{k}\right)_{k\geq 0} converges to x^{*}, the convergence being linear,

\left\|x_{k+1}-x^{*}\right\|_{*}\leq t\left\|x_{k}-x^{*}\right\|_{*},\quad k=0,1,\ldots

Proof. The (IPNM) can be viewed as an (INM) with

s_{k}=-\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}F\left(x_{k}\right)+\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}\left(\delta_{k}+\hat{r}_{k}\right),

\begin{aligned} F^{\prime}\left(x_{k}\right)s_{k}&=-\Delta_{k}s_{k}-F\left(x_{k}\right)+\delta_{k}+\hat{r}_{k}\\ &=-F\left(x_{k}\right)+\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}F\left(x_{k}\right)-\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}\left(\delta_{k}+\hat{r}_{k}\right)+\delta_{k}+\hat{r}_{k}\\ &=-F\left(x_{k}\right)+\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}F\left(x_{k}\right)+\left\{I-\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}\right\}\left(\delta_{k}+\hat{r}_{k}\right). \end{aligned}

³ For definitions and results concerning convergence orders, see Ref. 2, Chapter 9 and also Refs. 10, 25.

Denoting

r_{k}=\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}F\left(x_{k}\right)+\left\{I-\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}\right\}\left(\delta_{k}+\hat{r}_{k}\right), \qquad (3)

the conclusion follows from Theorem 2.1.
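The algebra of this proof can be checked numerically: for random data, the (INM) residual F'(x_k)s_k + F(x_k) of the (IPNM) step coincides with formula (3). A small Python verification, with random matrices standing in for F'(x_k) and F(x_k):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
Fp = rng.standard_normal((n, n))            # stands for F'(x_k)
Fx = rng.standard_normal(n)                 # stands for F(x_k)
Delta = 0.1 * rng.standard_normal((n, n))   # perturbation Delta_k
delta = 0.1 * rng.standard_normal(n)        # perturbation delta_k
rhat = 0.1 * rng.standard_normal(n)         # inner residual rhat_k

# (IPNM) step: [F'(x_k)+Delta_k] s_k = -F(x_k) + delta_k + rhat_k
s = np.linalg.solve(Fp + Delta, -Fx + delta + rhat)

# residual of the equivalent (INM) step
r_inm = Fp @ s + Fx

# residual given by formula (3)
M = Delta @ np.linalg.inv(Fp + Delta)
r_formula = M @ Fx + (np.eye(n) - M) @ (delta + rhat)

print(np.linalg.norm(r_inm - r_formula))    # ~ machine precision
```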

Corollary 2.1. Assume Conditions (C1)-(C4). There exists \epsilon>0 such that, if \left\|x_{0}-x^{*}\right\|\leq\epsilon and

\left\|\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}\right\|\leq q_{1}<1,\quad k=0,1,\ldots,
\left\|\delta_{k}\right\|+\left\|\hat{r}_{k}\right\|\leq\left[\eta_{k}/\left(1+q_{1}\right)\right]\left\|F\left(x_{k}\right)\right\|,
\text{where }\eta_{k}\leq q_{2}<1-q_{1},\quad k=0,1,\ldots,

then the sequence of the (IPNM) iterates \left(x_{k}\right)_{k\geq 0} converges to x^{*}. Moreover, the convergence is linear,

\left\|x_{k+1}-x^{*}\right\|_{*}\leq t\left\|x_{k}-x^{*}\right\|_{*},\quad k=0,1,\ldots,

where t=q_{1}+q_{2}.
Proof. The proof follows easily from the previous result by using the hypotheses.

Remark 2.1. The idea of reducing certain perturbed Newton methods to (INM) iterations, which is used in the proof of Theorem 2.2, can be found in the work of several authors; see for example Refs. 9-10 and 26. The inexact secant methods considered by us in Ref. 27 are in fact instances of the (IPNM) model, but it was Ref. 28 that inspired us to consider the (IPNM).

3. Convergence Orders of Inexact Perturbed Newton Methods

Stronger conditions imposed on the continuity of F^{\prime} at x^{*} offer higher convergence orders for the Newton method. Namely, if F^{\prime} is Hölder continuous at x^{*} with exponent p, p\in(0,1], i.e., if there exist \epsilon>0 and L\geq 0 such that

\left\|F^{\prime}(x)-F^{\prime}\left(x^{*}\right)\right\|\leq L\left\|x-x^{*}\right\|^{p},\quad\text{ for }\left\|x-x^{*}\right\|\leq\epsilon,

then the Newton method converges locally with q-order at least 1+p; see Ref. 2, Theorem 10.2.2 and Ref. 10, Theorem 4.4.

For the inexact Newton methods, Dembo, Eisenstat, and Steihaug proved the following result.

Theorem 3.1. See Ref. 1. Assume that Conditions (C1)-(C3) hold and that the inexact Newton iterates \left(x_{k}\right)_{k\geq 0} converge to x^{*}. Then, x_{k}\rightarrow x^{*} q-superlinearly if and only if

\left\|r_{k}\right\|=o\left(\left\|F\left(x_{k}\right)\right\|\right),\quad\text{ as }k\rightarrow\infty.

Moreover, if F^{\prime} is Hölder continuous at x^{*} with exponent p, p\in(0,1], then x_{k}\rightarrow x^{*} with q-order at least 1+p if and only if

\left\|r_{k}\right\|=\mathscr{O}\left(\left\|F\left(x_{k}\right)\right\|^{1+p}\right),\quad\text{ as }k\rightarrow\infty.

We obtain the following result for the inexact perturbed Newton methods.

Theorem 3.2. Assume that Conditions (C1)-(C4) hold and that the iterates \left(x_{k}\right)_{k\geq 0} given by the (IPNM) converge to x^{*}. Then, x_{k}\rightarrow x^{*} q-superlinearly if and only if

\left\|\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}F\left(x_{k}\right)+\left\{I-\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}\right\}\left(\delta_{k}+\hat{r}_{k}\right)\right\|=o\left(\left\|F\left(x_{k}\right)\right\|\right),\quad\text{ as }k\rightarrow\infty.

Moreover, if F^{\prime} is Hölder continuous at x^{*} with exponent p, then x_{k}\rightarrow x^{*} with q-order at least 1+p if and only if

\left\|\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}F\left(x_{k}\right)+\left\{I-\Delta_{k}\left[F^{\prime}\left(x_{k}\right)+\Delta_{k}\right]^{-1}\right\}\left(\delta_{k}+\hat{r}_{k}\right)\right\|=\mathscr{O}\left(\left\|F\left(x_{k}\right)\right\|^{1+p}\right),\quad\text{ as }k\rightarrow\infty.

Proof. The proof is obtained from the previous theorem by using (3).

In the following result, we characterize the convergence orders of the (IPNM) iterates in terms of the rate of convergence to zero of residuals and perturbations.

Corollary 3.1. Assume that:
(a) Conditions (C1)-(C4) hold;
(b) \Delta_{k}\rightarrow 0, \delta_{k}\rightarrow 0, \hat{r}_{k}\rightarrow 0, as k\rightarrow\infty;
(c) the sequence \left(x_{k}\right)_{k\geq 0} given by the (IPNM) converges to x^{*}.

In addition, if

\left\|\delta_{k}\right\|=o\left(\left\|F\left(x_{k}\right)\right\|\right),\quad\left\|\hat{r}_{k}\right\|=o\left(\left\|F\left(x_{k}\right)\right\|\right),\quad\text{ as }k\rightarrow\infty,

then x_{k}\rightarrow x^{*} q-superlinearly.

Under the same assumptions (a)-(c), if additionally F^{\prime} is Hölder continuous at x^{*} with exponent p, and if

\left\|\Delta_{k}\right\|=\mathscr{O}\left(\left\|F\left(x_{k}\right)\right\|^{p}\right),\quad\left\|\delta_{k}\right\|=\mathscr{O}\left(\left\|F\left(x_{k}\right)\right\|^{1+p}\right),\quad\left\|\hat{r}_{k}\right\|=\mathscr{O}\left(\left\|F\left(x_{k}\right)\right\|^{1+p}\right),\quad\text{ as }k\rightarrow\infty,

then x_{k}\rightarrow x^{*} with q-order at least 1+p.
The corresponding results for the r-convergence orders of the (IPNM) iterates may be stated in a similar manner, taking into account the existing results for the (INM); see Ref. 1, Theorem 3.4 and Corollary 3.5.
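The sufficient conditions of Corollary 3.1 are easy to observe experimentally. In the Python sketch below (p = 1, i.e., Lipschitz continuous derivative, on an illustrative test system of ours), the perturbations are scaled as ||Delta_k|| = O(||F(x_k)||) and ||delta_k||, ||rhat_k|| = O(||F(x_k)||^2), and the ratios e_{k+1}/e_k^2 of the errors stay bounded, as expected for q-order 2:

```python
import numpy as np

rng = np.random.default_rng(3)
F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
J = lambda x: np.array([[2.0*x[0], 2.0*x[1]], [1.0, -1.0]])
xstar = np.array([1.0, 1.0]) / np.sqrt(2.0)

x = xstar + np.array([1e-2, -2e-2])
errs = []
for _ in range(5):
    Fx = F(x)
    nF = np.linalg.norm(Fx)
    D = rng.standard_normal((2, 2)); D *= nF / np.linalg.norm(D, 2)     # ||Delta_k|| = ||F||
    d = rng.standard_normal(2);      d *= nF**2 / np.linalg.norm(d)     # ||delta_k|| = ||F||^2
    rh = rng.standard_normal(2);     rh *= nF**2 / np.linalg.norm(rh)   # ||rhat_k|| = ||F||^2
    x = x + np.linalg.solve(J(x) + D, -Fx + d + rh)
    errs.append(np.linalg.norm(x - xstar))
# bounded ratios e_{k+1}/e_k^2 indicate q-order 2 (stop before machine precision)
print([errs[k + 1] / errs[k]**2 for k in range(len(errs) - 1) if errs[k] > 1e-12])
```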

4. Applications to Krylov Solvers Based on Backward Error Minimization Properties

Consider the nonsymmetric linear system

Ax=b, \qquad (4)

with A\in\mathbb{R}^{N\times N} nonsingular and b\in\mathbb{R}^{N}. The Krylov methods for solving such a system when the dimension N is large are based on the Krylov subspaces, defined for any initial approximation x_{0}\in\mathbb{R}^{N} as

\mathscr{K}_{m}=\mathscr{K}_{m}\left(A,r_{0}\right)=\operatorname{span}\left\{r_{0},Ar_{0},\ldots,A^{m-1}r_{0}\right\},

where

r_{0}=b-Ax_{0}

is the initial residual and m\in\{1,\ldots,N\}.
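The Arnoldi process, which all three solvers analyzed in this section start from, builds an orthonormal basis of K_m(A, r_0) together with an upper Hessenberg matrix Hbar_m such that A V_m = V_{m+1} Hbar_m. A minimal Python sketch (modified Gram-Schmidt variant; the function name arnoldi and the test data are ours):

```python
import numpy as np

def arnoldi(A, r0, m):
    """Build V in R^{N x (m+1)} with orthonormal columns spanning K_{m+1}(A, r0)
    and Hbar in R^{(m+1) x m} such that A V[:, :m] = V Hbar."""
    N = len(r0)
    V = np.zeros((N, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):              # modified Gram-Schmidt orthogonalization
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0.0:              # happy breakdown: K_{j+1} is A-invariant
            return V[:, : j + 2], H[: j + 2, : j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

# quick check of the Arnoldi relation on random data
rng = np.random.default_rng(5)
A = rng.standard_normal((8, 8)); r0 = rng.standard_normal(8)
V, H = arnoldi(A, r0, 4)
print(np.allclose(A @ V[:, :H.shape[1]], V @ H))   # True
```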
The normwise backward error of an approximate solution \tilde{x} of (4) was introduced by Rigal and Gaches (Ref. 29) and is defined by

\Pi(\tilde{x})=\min\left\{\epsilon:\left(A+\Delta_{A}\right)\tilde{x}=b+\Delta_{b},\left\|\Delta_{A}\right\|_{F}\leq\epsilon\|E\|_{F},\left\|\Delta_{b}\right\|_{2}\leq\epsilon\|f\|_{2}\right\},

where the parameters E\in\mathbb{R}^{N\times N} and f\in\mathbb{R}^{N} are arbitrary. The value of \Pi(\tilde{x}) is

\Pi(\tilde{x})=\|b-A\tilde{x}\|_{2}/\left(\|E\|_{F}\cdot\|\tilde{x}\|_{2}+\|f\|_{2}\right),

and the minimum is attained by the backward errors

\Delta_{A}=\left[\|E\|_{F}\cdot\|\tilde{x}\|_{2}/\left(\|E\|_{F}\cdot\|\tilde{x}\|_{2}+\|f\|_{2}\right)\right](b-A\tilde{x})z^{t},\quad\text{with }z=\left(1/\|\tilde{x}\|_{2}^{2}\right)\tilde{x},
\Delta_{b}=-\left[\|f\|_{2}/\left(\|E\|_{F}\cdot\|\tilde{x}\|_{2}+\|f\|_{2}\right)\right](b-A\tilde{x}).
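These formulas can be verified directly in a few lines of Python; the data below are random, and we take the common choices E = A and f = b (our own choice of parameters, allowed since E and f are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 5
A = rng.standard_normal((N, N))
b = rng.standard_normal(N)
xt = np.linalg.solve(A, b) + 1e-3 * rng.standard_normal(N)   # approximate solution x~

E_norm = np.linalg.norm(A, 'fro')        # E = A
f_norm = np.linalg.norm(b)               # f = b
r = b - A @ xt
denom = E_norm * np.linalg.norm(xt) + f_norm
pi = np.linalg.norm(r) / denom           # normwise backward error Pi(x~)

# attaining perturbations from the text
z = xt / np.linalg.norm(xt)**2
DA = (E_norm * np.linalg.norm(xt) / denom) * np.outer(r, z)
Db = -(f_norm / denom) * r

print(np.linalg.norm((A + DA) @ xt - (b + Db)))   # ~ 0: x~ solves the perturbed system
print(np.linalg.norm(DA, 'fro') / E_norm, np.linalg.norm(Db) / f_norm, pi)   # all equal
```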

We shall analyze the behavior of the following three Krylov solvers when used for the linear systems arising in the Newton methods: Gmres, Gmback, Minpert. Each of these solvers is based on backward error minimization properties, as we shall see.

In order to use some existing notations, we prefer to denote hereafter by y^{*} the solution of the nonlinear system (1) and by y_{k} the iterates from the Newton methods, using x in the involved linear systems.
4.1. Newton-Gmres Method. Given an initial approximation x_{0}\in\mathbb{R}^{N}, the Gmres method of Saad and Schultz (Ref. 30) uses the Arnoldi process (see Ref. 31) to construct an orthonormal basis \left\{v_{1},\ldots,v_{m}\right\} in the Krylov subspace \mathscr{K}_{m}. The approximation x_{m}^{\mathrm{GM}}\in x_{0}+\mathscr{K}_{m} is then determined such that

\left\|b-Ax_{m}^{\mathrm{GM}}\right\|_{2}=\min_{x_{m}\in x_{0}+\mathscr{K}_{m}}\left\|b-Ax_{m}\right\|_{2}. \qquad (5)

Kasenally has noted in Ref. 32 that the minimizing property of x_{m}^{\mathrm{GM}} may be expressed in terms of the backward errors, namely,

\min_{x_{m}\in x_{0}+\mathscr{K}_{m}}\left\|b-Ax_{m}\right\|_{2}=\min_{x_{m}\in x_{0}+\mathscr{K}_{m}}\left\{\left\|\Delta_{b}\right\|_{2}:Ax_{m}=b-\Delta_{b}\right\};

i.e., x_{m}^{\mathrm{GM}} minimizes over x_{0}+\mathscr{K}_{m} the backward error \Delta_{b}, assuming \Delta_{A}=0.

The solution x_{m}^{\mathrm{GM}} is determined roughly in the following way (see also Ref. 33):
(i) Arnoldi Process. Determine V_{m}=\left[v_{1},\ldots,v_{m}\right]\in\mathbb{R}^{N\times m} and the upper Hessenberg matrix \bar{H}_{m}\in\mathbb{R}^{(m+1)\times m}.
(ii) Gmres Solution. Find the exact solution y_{m}^{\mathrm{GM}} of the following least squares problem in \mathbb{R}^{m}:

\min_{y_{m}\in\mathbb{R}^{m}}\left\|\beta e_{1}-\bar{H}_{m}y_{m}\right\|_{2},\quad\text{where }\beta=\left\|r_{0}\right\|_{2}.

Set x_{m}^{\mathrm{GM}}=x_{0}+V_{m}y_{m}^{\mathrm{GM}}.
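A compact Python sketch of these two steps; the arnoldi helper is the one sketched at the beginning of this section, repeated so the block is self-contained, and all names and test data are ours:

```python
import numpy as np

def arnoldi(A, r0, m):
    N = len(r0)
    V = np.zeros((N, m + 1)); H = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0.0:                  # happy breakdown
            return V[:, : j + 2], H[: j + 2, : j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

def gmres(A, b, x0, m):
    """Gmres sketch: x_m = x0 + V_m y_m, y_m solving min ||beta e1 - Hbar_m y||_2, cf. (5)."""
    r0 = b - A @ x0
    V, H = arnoldi(A, r0, m)
    mm = H.shape[1]
    rhs = np.zeros(H.shape[0]); rhs[0] = np.linalg.norm(r0)   # beta e1
    y, *_ = np.linalg.lstsq(H, rhs, rcond=None)
    return x0 + V[:, :mm] @ y

rng = np.random.default_rng(6)
A = rng.standard_normal((20, 20)) + 5.0 * np.eye(20)
b = rng.standard_normal(20)
for m in (2, 5, 10, 20):
    x = gmres(A, b, np.zeros(20), m)
    print(m, np.linalg.norm(b - A @ x))   # nonincreasing residual norms, cf. Prop. 4.1 below
```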

The Gmres method is used in an iterative fashion. Saad and Schultz proved that, at each step, the solution x_{m}^{\mathrm{GM}} is uniquely determined and that the algorithm breaks down in the Arnoldi method only if the exact solution has been reached. Moreover, the process terminates in at most N steps. In the results presented below, we shall assume, as is usual, that the Krylov methods do not continue after the exact solution has been determined.

First, we introduce some notations for the restarted Gmres iterations in order to express the relations which they satisfy.

Since only small values of m are attractive in practice, an upper bound \bar{m}\in\{1,\ldots,N-1\} is usually fixed for the subspace dimensions. If after \bar{m} steps the computed solution does not have a sufficiently small residual, the Gmres method is restarted, taking for the initial approximation the last computed solution. We denote by x_{m}^{\mathrm{GM}(0)}, m\in\{1,\ldots,\bar{m}\}, the first \bar{m} solutions, by x_{m}^{\mathrm{GM}(1)}, m\in\{1,\ldots,\bar{m}\}, the \bar{m} solutions from the first restart, and so on. The value of the initial approximation x_{0} will be clear from the context. For the nonrestarted version, we shall use the common notations x_{m}^{\mathrm{GM}}, while for a generic Gmres solution⁴, we shall simply write x^{\mathrm{GM}}; the corresponding notations for the kth correction from a Newton-Gmres method will be s_{k,m}^{\mathrm{GM}} [resp. s_{k}^{\mathrm{GM}}].

The choice x_{0}=0 in the Krylov solvers is a popular one when no better guess is known; see e.g. Refs. 26 and 34-35. With the above notation, the following result is obtained immediately from the properties of the Gmres solutions. Certain affirmations from this result are more or less explicitly stated in some papers dealing with Gmres; see for example Refs. 26 and 36-37.

Proposition 4.1. Consider the linear system (4) and the initial approximation x_{0}=0. Then, the following statements are true:
(i) For any m\in\{1,\ldots,N\}, the residual r_{m}^{\mathrm{GM}} of the Gmres solution satisfies

\left\|r_{m}^{\mathrm{GM}}\right\|_{2}\leq\|b\|_{2}.

Moreover, the inequality is strict if and only if the solution x_{m}^{\mathrm{GM}} is nonzero.
(ii) The residuals associated to the N successive solutions satisfy

0=\left\|r_{N}^{\mathrm{GM}}\right\|_{2}\leq\left\|r_{N-1}^{\mathrm{GM}}\right\|_{2}\leq\cdots\leq\left\|r_{1}^{\mathrm{GM}}\right\|_{2}\leq\|b\|_{2}.

The inequality between the norms of two consecutive residuals is strict if and only if the corresponding Gmres solutions are distinct.
(iii) For any fixed upper bound \bar{m}\in\{1,\ldots,N-1\}, the residuals of the restarted Gmres method satisfy

\cdots\leq\left\|r_{1}^{\mathrm{GM}(l+1)}\right\|_{2}\leq\left\|r_{\bar{m}}^{\mathrm{GM}(l)}\right\|_{2}\leq\cdots\leq\left\|r_{1}^{\mathrm{GM}(l)}\right\|_{2}\leq\left\|r_{\bar{m}}^{\mathrm{GM}(l-1)}\right\|_{2}\leq\cdots\leq\left\|r_{1}^{\mathrm{GM}(0)}\right\|_{2}\leq\|b\|_{2}.

⁴ In such a case, we assume that the initial approximation x_{0}\in\mathbb{R}^{N}, the upper bound \bar{m}\in\{1,\ldots,N-1\}, the number of (eventual) restarts l\geq 0, and the number of (final, if l\geq 1) iterations may be arbitrary.

The inequality between the norms of two consecutive residuals is strict if and only if the corresponding Gmres solutions are distinct; the residuals may eventually reach zero or may stagnate indefinitely, depending on the problem.

Considering the linear systems from the Newton method, we easily obtain the following proposition.

Proposition 4.2. Let y\in D be an element for which the derivative F^{\prime}(y) is nonsingular. If the Gmres method with x_{0}=0 is applied to the linear system F^{\prime}(y)s=-F(y), then for all m\in\{1,\ldots,N\}, the residual satisfies

\left\|r_{m}^{\mathrm{GM}}\right\|_{2}\leq\|F(y)\|_{2}.

Moreover, the above inequality is strict if and only if the correction s_{m}^{\mathrm{GM}} is nonzero.

We have chosen to state the result corresponding only to part (i) of Proposition 4.1. The other results are stated similarly.

Brown (Ref. 26) and Brown and Saad (Ref. 38) considered solving (1) by global minimization,

\min_{y\in\mathbb{R}^{N}}f(y)=\min_{y\in\mathbb{R}^{N}}F(y)^{t}F(y),

when F:\mathbb{R}^{N}\rightarrow\mathbb{R}^{N}. They obtained that any nonzero correction from a certain step of the Newton-Gmres method with x_{0}=0 in Gmres is a descent direction for the above minimization problem. Now, we are able to describe this positive behavior in terms of the distance to the solution.

Theorem 4.1. Assume that the mapping F satisfies Conditions (C1)-(C3), and consider a current approximation y_{c}\neq y^{*} for y^{*}. Let s^{\mathrm{GM}} be a nonzero approximate solution of the linear system F^{\prime}\left(y_{c}\right)s=-F\left(y_{c}\right) provided by Gmres, assuming the initial guess x_{0}=0. Let y_{+}=y_{c}+s^{\mathrm{GM}}, denote \eta=\left\|r^{\mathrm{GM}}\right\|_{2}/\left\|F\left(y_{c}\right)\right\|_{2}, and take t\in(\eta,1). If y_{c} is sufficiently close to y^{*}, then

\left\|y_{+}-y^{*}\right\|_{*}\leq t\left\|y_{c}-y^{*}\right\|_{*},

where \|y\|_{*}=\left\|F^{\prime}\left(y^{*}\right)y\right\|_{2}.
Proof. The result can be proved easily with slight modifications of the proof of Theorem 2.1, given in Ref. 1.

The above result says that, when using Gmres (with x_{0}=0) in the Newton method, any nonzero correction improves the current outer approximation, provided that the approximation is sufficiently good. Though this theorem guarantees a nonincreasing curve for the errors of the Newton-Gmres iterations starting from a sufficiently good approximation, the local convergence is not guaranteed. It is sufficient to notice that, considering a sequence obeying at each step the hypotheses of the above result, there may appear a situation when the set of forcing terms has 1 as an accumulation point, in which case Theorem 2.1 cannot be applied.

The following theoretical example shows that, when the dimensions of the Krylov subspaces are smaller than N, the outer iterations may stagnate at the initial approximation y_{0}\neq y^{*}, no matter how close to y^{*} we choose y_{0}.

Example 4.1. Consider

F:\mathbb{R}^{N}\rightarrow\mathbb{R}^{N},\quad F(y)=\left[y^{(N)},y^{(1)},y^{(2)},\ldots,y^{(N-1)}\right]^{t},\quad y=\left[y^{(1)},\ldots,y^{(N)}\right]^{t}\in\mathbb{R}^{N},

for which

F^{\prime}(y)\equiv\left[\begin{array}{cccc}0&&&1\\ 1&\ddots&&\\ &\ddots&\ddots&\\ &&1&0\end{array}\right]=\left[e_{2},e_{3},\ldots,e_{N},e_{1}\right]=A.

The Newton method for solving F(y)=0 should yield the unique solution y^{*}=0 in one iteration for any y_{0}\in\mathbb{R}^{N}, whenever the corresponding linear system is solved exactly. However, when the correction is determined approximately, the situation changes, and we may use the (INM) setting.

Taking y_{0}=-he_{N}, for some arbitrarily small h\neq 0, we must solve

F^{\prime}\left(y_{0}\right)s=-F\left(y_{0}\right),

i.e.,

Ax=b,\quad\text{with }b=he_{1}.

Applying m\leq N-1 steps of the Arnoldi process with x_{0}=0, we obtain V_{m}=\left[e_{1},\ldots,e_{m}\right]\in\mathbb{R}^{N\times m} and \bar{H}_{m}=\left[e_{2},\ldots,e_{m+1}\right]\in\mathbb{R}^{(m+1)\times m}.

The Gmres solution is given by (see Refs. 36 and 39)

x_{m}^{\mathrm{GM}}=x_{0}+V_{m}\left(\bar{H}_{m}^{t}\bar{H}_{m}\right)^{-1}\bar{H}_{m}^{t}V_{m+1}^{t}b,

such that

V_{m+1}^{t}b=he_{1},\quad\bar{H}_{m}^{t}he_{1}=0,

and so

x_{m}^{\mathrm{GM}}=s_{0,m}^{\mathrm{GM}}=0,\quad\text{for all }m=1,\ldots,N-1.

We also note that the restarted version of Gmres yields the same result whenever x_{0}=0 and \bar{m},m\leq N-1.

This theoretical example shows that the dimension of the Krylov subspaces may sometimes be crucial for the efficiency of the Newton-Gmres method, and that the use of the restarted version of Gmres cannot overcome the difficulties. The stagnation of the Gmres algorithm was first reported in Ref. 36 (where A was as above and b stood for e_{1}), but we are not aware of any extension to a (non)linear mapping in order to show such stagnation of the Newton-Gmres method.
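The stagnation can be reproduced numerically. The Python sketch below avoids the Arnoldi machinery by minimizing ||b - A x||_2 directly over the explicit Krylov basis {b, Ab, ..., A^{m-1}b} (the same subspace, since x_0 = 0); for every m <= N-1 the computed solution is 0 and the residual norm stays at |h|:

```python
import numpy as np

N, h = 8, 1e-3
A = np.zeros((N, N))
A[0, N - 1] = 1.0
A[np.arange(1, N), np.arange(0, N - 1)] = 1.0   # A = [e_2, ..., e_N, e_1]
b = np.zeros(N); b[0] = h                        # b = h e_1

for m in range(1, N + 1):
    K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(m)])
    c, *_ = np.linalg.lstsq(A @ K, b, rcond=None)    # min over K_m of ||b - A x||_2
    x = K @ c
    print(m, np.linalg.norm(x), np.linalg.norm(b - A @ x))
# for m = 1, ..., N-1: x = 0 and the residual stays at |h|; only m = N solves the system
```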
4.2. Newton-Gmback Method. The Gmback algorithm for solving the linear system (4) was introduced by Kasenally in Ref. 32. Given x_{0}\in\mathbb{R}^{N}, it computes a vector x_{m}^{\mathrm{GB}}\in x_{0}+\mathscr{K}_{m} which minimizes the backward error in the matrix A, assuming \Delta_{b}=0,

\min_{x_{m}\in x_{0}+\mathscr{K}_{m}}\left\|\Delta_{A}\right\|_{F},\quad\text{s.t. }\left(A-\Delta_{A}\right)x_{m}=b.

The following steps are performed for determining x_{m}^{\mathrm{GB}}.
(i) Arnoldi Process. Compute the matrices V_{m} and \bar{H}_{m}.
(ii) Gmback Solution. Execute Steps (a), (b), (c), (d) below.
(a) Let \beta=\left\|r_{0}\right\|_{2},

\hat{H}_{m}=\left[-\beta e_{1},\bar{H}_{m}\right]\in\mathbb{R}^{(m+1)\times(m+1)},\quad \hat{G}_{m}=\left[x_{0},V_{m}\right]\in\mathbb{R}^{N\times(m+1)},
P=\hat{H}_{m}^{t}\hat{H}_{m}\in\mathbb{R}^{(m+1)\times(m+1)},\quad Q=\hat{G}_{m}^{t}\hat{G}_{m}\in\mathbb{R}^{(m+1)\times(m+1)}.

(b) Determine an eigenvector u_{m+1} corresponding to the smallest eigenvalue \lambda_{m+1}^{\mathrm{GB}} of the generalized eigenproblem Pu=\lambda Qu.
(c) If the first component u_{m+1}^{(1)} is nonzero, compute the vector y_{m}^{\mathrm{GB}}\in\mathbb{R}^{m} by scaling u_{m+1} such that

\left[\begin{array}{c}1\\ y_{m}^{\mathrm{GB}}\end{array}\right]=\left[1/u_{m+1}^{(1)}\right]u_{m+1}.

(d) Set x_{m}^{\mathrm{GB}}=x_{0}+V_{m}y_{m}^{\mathrm{GB}}.
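A Python sketch of the Gmback steps above. The generalized eigenproblem is solved with scipy.linalg.eigh, which requires Q symmetric positive definite; this holds when x_0 lies outside K_m and the Arnoldi process does not break down (for x_0 = 0 a general solver would be needed). The arnoldi helper is the one sketched earlier in this section; all names and test data are ours, and the final print checks formula (6) below against ||Delta_A||_F = ||b - A x||_2 / ||x||_2, the minimal Frobenius norm of a perturbation with (A - Delta_A) x = b:

```python
import numpy as np
from scipy.linalg import eigh

def arnoldi(A, r0, m):
    N = len(r0)
    V = np.zeros((N, m + 1)); H = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]    # no breakdown handling in this sketch
    return V, H

def gmback(A, b, x0, m):
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    V, H = arnoldi(A, r0, m)
    Hhat = np.column_stack([-beta * np.eye(m + 1)[:, 0], H])
    Ghat = np.column_stack([x0, V[:, :m]])
    P = Hhat.T @ Hhat
    Q = Ghat.T @ Ghat                    # SPD when x0 is outside K_m
    lam, U = eigh(P, Q)                  # eigenvalues in ascending order
    u = U[:, 0]                          # eigenvector of lambda_{m+1}^{GB}
    if abs(u[0]) < 1e-14:
        raise RuntimeError("breakdown: eigenvector has zero first component")
    y = (u / u[0])[1:]
    return x0 + V[:, :m] @ y, lam[0]

rng = np.random.default_rng(7)
A = rng.standard_normal((12, 12)) + 4.0 * np.eye(12)
b = rng.standard_normal(12)
x, lam = gmback(A, b, rng.standard_normal(12), 5)
# formula (6): sqrt(lambda) equals the Frobenius norm of the minimal Delta_A for x
print(np.sqrt(max(lam, 0.0)), np.linalg.norm(b - A @ x) / np.linalg.norm(x))
```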

This algorithm may lead to two possible breakdowns, either in the Arnoldi method or in the scaling of u_{m+1}. As in the case of Gmres, the first breakdown is a happy breakdown, because the solution may be determined exactly using \bar{H}_{m} and V_{m}. The second breakdown appears when all the eigenvectors associated to \lambda_{m+1}^{\mathrm{GB}} have the first component zero, the inevitable divisions by zero leading to uncircumventible breakdowns. In such a case, either m is increased or the algorithm is restarted with a different initial approximation x_{0}. We shall assume in the following analysis that x_{m}^{\mathrm{GB}} exists.

Kasenally proved that, for any x_{0}\in\mathbb{R}^{N} and m\in\{1,\ldots,N\}, the backward error \Delta_{A,m}^{\mathrm{GB}} corresponding to the Gmback solution satisfies

\left\|\Delta_{A,m}^{\mathrm{GB}}\right\|_{F}=\sqrt{\lambda_{m+1}^{\mathrm{GB}}}. \qquad (6)

The Newton-Gmback iterates may be written in two equivalent ways, taking into account the properties of the Krylov solutions,

F^{\prime}\left(y_{k}\right)s_{k}^{\mathrm{GB}}=-F\left(y_{k}\right)+r_{k}^{\mathrm{GB}},
\left[F^{\prime}\left(y_{k}\right)-\Delta_{A_{k}}^{\mathrm{GB}}\right]s_{k}^{\mathrm{GB}}=-F\left(y_{k}\right),

k=0,1,\ldots, y_{0}\in D, where we considered

A_{k}=F^{\prime}\left(y_{k}\right)\quad\text{and}\quad b_{k}=-F\left(y_{k}\right).

There are three results which may be applied to characterize the high convergence orders of these sequences (the corresponding statements are left to the reader): Theorem 2.1 of Dembo, Eisenstat, and Steihaug; the results of Dennis and Moré for the quasi-Newton methods (see Refs. 17-18); and Theorem 2.2 for the (IPNM) iterates, in which we must take \delta_{k}=\hat{r}_{k}=0, k=0,1,\ldots. Naturally, these results must be equivalent, but here we do not analyze this aspect.

Concerning the sufficient conditions, the convergence orders of the Newton-Gmback method may be controlled by the computed eigenvalues \lambda_{k}^{\mathrm{GB}} from Gmback.

Theorem 4.2. Consider the sequence of Newton-Gmback iterates y_{k+1}=y_{k}+s_{k}^{\mathrm{GB}}, where s_{k}^{\mathrm{GB}} satisfies

\left[F^{\prime}\left(y_{k}\right)-\Delta_{A_{k}}^{\mathrm{GB}}\right]s_{k}^{\mathrm{GB}}=-F\left(y_{k}\right),\quad k=0,1,\ldots

Assume the following:
(a) Conditions (C1)-(C4) hold.
(b) The derivative F^{\prime} is Hölder continuous with exponent p at y^{*}.
(c) The sequence \left(y_{k}\right)_{k\geq 0} converges to y^{*}.

Moreover, if

\sqrt{\lambda_{k}^{\mathrm{GB}}}=\mathscr{O}\left(\left\|F\left(y_{k}\right)\right\|^{p}\right),\quad\text{ as }k\rightarrow\infty,

then the Newton-Gmback iterates converge with q-order at least 1+p.

Proof. The proof follows from Corollary 3.1, taking into account formula (6), the inequality \|Z\|_{2}\leq\|Z\|_{F}, true for all Z\in\mathbb{R}^{N\times N}, and the fact that all the norms are equivalent on a finite-dimensional normed space.

We now ask whether the curve of the Newton-Gmback errors is nonincreasing when starting with a sufficiently good approximation. In the following, we shall construct an example which shows that, unlike the Newton-Gmres case, this property is not generally shared by the Newton-Gmback method when Gmback is used in the nonrestarted version.

Example 4.2. Consider the system F(y)=0, with

F:\mathbb{R}^{N}\rightarrow\mathbb{R}^{N},\quad F(y)=\left[y^{(1)}+y^{(N)},y^{(1)},y^{(2)},\ldots,y^{(N-1)}\right]^{t},\quad y=\left[y^{(1)},\ldots,y^{(N)}\right]^{t}\in\mathbb{R}^{N},

having the unique solution y^{*}=0. For all y\in\mathbb{R}^{N}, the derivative of F is given by

F^{\prime}(y)\equiv\left[e_{1}+e_{2},e_{3},\ldots,e_{N},e_{1}\right]=A.

Taking y_{0}=-he_{N} with arbitrarily small h\neq 0, we must solve

F^{\prime}\left(y_{0}\right)s=-F\left(y_{0}\right),

or equivalently,

Ax=b,\quad\text{with }b=he_{1}.

Applying m\leq N-1 steps of the Arnoldi process with x_{0}=0, we obtain successively

V_{m}=\left[e_{1},\ldots,e_{m}\right]\in\mathbb{R}^{N\times m},
\bar{H}_{m}=\left[e_{1}+e_{2},e_{3},\ldots,e_{m+1}\right]\in\mathbb{R}^{(m+1)\times m},
P=\hat{H}_{m}^{t}\hat{H}_{m}=\left[h^{2}e_{1}-he_{2},2e_{2}-he_{1},e_{3},\ldots,e_{m+1}\right],
Q=\hat{G}_{m}^{t}\hat{G}_{m}=\left[0,e_{2},\ldots,e_{m+1}\right]\in\mathbb{R}^{(m+1)\times(m+1)}.

The eigenpair \left(u_{m+1},\lambda_{m+1}^{\mathrm{GB}}\right) of

Pu=\lambda Qu

is determined uniquely by

u_{m+1}=[1/h,1,0,\ldots,0]^{t},\quad\lambda_{m+1}^{\mathrm{GB}}=1.

It follows that

y_{m}^{\mathrm{GB}}=he_{1}\in\mathbb{R}^{m},

and the Gmback solution is

x_{m}^{\mathrm{GB}}=he_{1},

such that, when N\geq 3,

\left\|y_{1}-y^{*}\right\|_{2}=\sqrt{2}|h|>|h|=\left\|y_{0}-y^{*}\right\|_{2}.

4.3. Newton-Minpert Method. The Minpert method for solving (4) was introduced by Kasenally and Simoncini in Ref. 40. Given an initial approximation x_{0}\in\mathbb{R}^{N}, it computes an element x_{m}^{\mathrm{MP}}\in x_{0}+\mathscr{K}_{m} which minimizes the joint backward error,

\min_{x_{m}\in x_{0}+\mathscr{K}_{m}}\left\|\left[\Delta_{A},\Delta_{b}\right]\right\|_{F},\quad\text{s.t. }\left(A-\Delta_{A}\right)x_{m}=b+\Delta_{b}. \qquad (7)

In other words, x_{m}^{\mathrm{MP}} minimizes the distance from the original system to the nearest one that an approximation x_{m}\in x_{0}+\mathscr{K}_{m} actually satisfies. The above minimization problem may also be viewed as a total least squares problem in the Krylov subspace (see Ref. 40).

The algorithm is similar to Gmback, the only difference being given by the computation of the matrices from the eigenproblem

Pu=\lambda Qu.

The following is a sketch of the steps performed by Minpert.
(i) Arnoldi Process. Compute the matrices V_{m} and \bar{H}_{m}.
(ii) Minpert Solution.
(a) Let \beta=\left\|r_{0}\right\|_{2},

\hat{H}_{m}=\left[-\beta e_{1},\bar{H}_{m}\right]\in\mathbb{R}^{(m+1)\times(m+1)},\quad \hat{G}_{m}=\left[\begin{array}{ll}x_{0}&V_{m}\\ 1&0\end{array}\right]\in\mathbb{R}^{(N+1)\times(m+1)},
P=\hat{H}_{m}^{t}\hat{H}_{m}\in\mathbb{R}^{(m+1)\times(m+1)},\quad Q=\hat{G}_{m}^{t}\hat{G}_{m}\in\mathbb{R}^{(m+1)\times(m+1)}.

(b) Determine an eigenvector u_{m+1} corresponding to the smallest eigenvalue \lambda_{m+1}^{\mathrm{MP}} of the generalized eigenproblem Pu=\lambda Qu.
(c) If the first component u_{m+1}^{(1)} is nonzero, compute the vector y_{m}^{\mathrm{MP}}\in\mathbb{R}^{m} by scaling u_{m+1} such that

\left[\begin{array}{l}1\\ y_{m}^{\mathrm{MP}}\end{array}\right]=\left[1/u_{m+1}^{(1)}\right]u_{m+1}.

(d) Set x_{m}^{\mathrm{MP}}=x_{0}+V_{m}y_{m}^{\mathrm{MP}}.
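A Python sketch of Minpert for the case x_0 = 0 used throughout this section: then Q = I_{m+1} (see the proof of Proposition 4.3 below), and the generalized eigenproblem reduces to an ordinary symmetric one. The arnoldi helper is as before; all names and test data are ours, and the final print checks relation (9) below:

```python
import numpy as np

def arnoldi(A, r0, m):
    N = len(r0)
    V = np.zeros((N, m + 1)); H = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]    # no breakdown handling in this sketch
    return V, H

def minpert0(A, b, m):
    """Minpert sketch with x0 = 0: minimize ||[Delta_A, Delta_b]||_F over K_m, cf. (7)."""
    beta = np.linalg.norm(b)
    V, H = arnoldi(A, b, m)
    Hhat = np.column_stack([-beta * np.eye(m + 1)[:, 0], H])
    P = Hhat.T @ Hhat                    # here Q = I_{m+1}
    lam, U = np.linalg.eigh(P)
    u = U[:, 0]
    if abs(u[0]) < 1e-14:
        raise RuntimeError("breakdown: eigenvector has zero first component")
    y = (u / u[0])[1:]
    return V[:, :m] @ y, lam[0]

rng = np.random.default_rng(8)
A = rng.standard_normal((12, 12)) + 4.0 * np.eye(12)
b = rng.standard_normal(12)
x, lam = minpert0(A, b, 5)
# relation (9): ||r_m||_2 = sqrt(lambda) * ||[x^t, 1]^t||_2
print(np.linalg.norm(b - A @ x),
      np.sqrt(max(lam, 0.0)) * np.sqrt(np.linalg.norm(x)**2 + 1.0))
```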

The remarks concerning the breakdowns of the Gmback method hold also for Minpert.

Kasenally and Simoncini proved that, for any x_{0}\in\mathbb{R}^{N} and m\in\{1,\ldots,N\}, the following relations hold:

\left\|\left[\Delta_{A,m}^{\mathrm{MP}},\Delta_{b,m}^{\mathrm{MP}}\right]\right\|_{F}=\sqrt{\lambda_{m+1}^{\mathrm{MP}}}, \qquad (8)
\left\|r_{m}^{\mathrm{MP}}\right\|_{2}=\sqrt{\lambda_{m+1}^{\mathrm{MP}}}\left\|\left[\left(x_{m}^{\mathrm{MP}}\right)^{t},1\right]^{t}\right\|_{2}, \qquad (9)

where \Delta_{A,m}^{\mathrm{MP}}, \Delta_{b,m}^{\mathrm{MP}} and r_{m}^{\mathrm{MP}} represent the backward errors and the residual corresponding to the Minpert solution x_{m}^{\mathrm{MP}}.

We shall prove for Minpert a result somewhat similar to Proposition 4.1.

Proposition 4.3. Consider the linear system (4), the initial approximation x_{0}=0, and an arbitrary value m\in\{1,\ldots,N\}. If there exists a Minpert solution x_{m}^{\mathrm{MP}}, then its joint backward error satisfies

\left\|\left[\Delta_{A,m}^{\mathrm{MP}},\Delta_{b,m}^{\mathrm{MP}}\right]\right\|_{F}\leq\|b\|_{2}.

Proof. When x_{0}=0, as noticed in Ref. 40, we are led to a regular eigenproblem in the Minpert algorithm, since Q=I_{m+1}. The boundedness of the Rayleigh quotient then implies

\lambda_{m+1}^{\mathrm{MP}}=\min_{z\in\mathbb{R}^{m+1}}z^{t}Pz/z^{t}z\leq e_{1}^{t}Pe_{1}/e_{1}^{t}e_{1}=e_{1}^{t}\hat{H}_{m}^{t}\hat{H}_{m}e_{1}=\beta^{2}=\|b\|_{2}^{2},

so, by (8), the conclusion is immediate.

The other two affirmations similar to those from Proposition 4.1 may be stated correspondingly, with the single remark that, since the solution of the minimization problem (7) may not be uniquely determined, the inequality between two consecutive joint backward errors may not be strict even if the consecutive solutions x_{m}^{\mathrm{MP}} and x_{m+1}^{\mathrm{MP}} are distinct.

We also note that, when the initial approximation in Gmback is taken x_{0}=0, one cannot similarly use the vector e_{1} in the Fischer result [see, e.g., Ref. 41, Corollary VI.1.16, which says that \lambda_{m+1}^{\mathrm{GB}}=\min_{z\in\mathbb{R}^{m+1}}z^{t}Pz/z^{t}Qz] in order to bound \lambda_{m+1}^{\mathrm{GB}} by \beta^{2}, since in this case

e_{1}^{t}Qe_{1}=0;

i.e., e_{1} is an eigenvector corresponding to the infinite eigenvalue \lambda_{1}=+\infty. Such a result could not be expected to hold in general, because by Corollary 3.1 it would imply that, for an arbitrary mapping F satisfying Conditions (C1)-(C3) and with Lipschitz derivative, the Newton-Gmback iterations with x_{0}=0 in Gmback and obeying Condition (C4) converge locally with q-order 2 even for one-dimensional Krylov subspaces.

The convergence orders of the Newton-Minpert iterates may be characterized by (9) in terms of the computed eigenvalues \lambda_{k}^{\mathrm{MP}}.

Theorem 4.3. Assume that Conditions (C1)-(C3) hold, that the derivative F^{\prime} is Hölder continuous with exponent p at y^{*}, and that the sequence \left(y_{k}\right)_{k\geq 0} given by the Newton-Minpert iterations y_{k+1}=y_{k}+s_{k}^{\mathrm{MP}}, with

F^{\prime}\left(y_{k}\right)s_{k}^{\mathrm{MP}}=-F\left(y_{k}\right)+r_{k}^{\mathrm{MP}},\quad k=0,1,\ldots,\quad y_{0}\in D,

converges to y^{*}. Then, \left(y_{k}\right)_{k\geq 0} converges with q-order at least 1+p if and only if

\sqrt{\lambda_{k}^{\mathrm{MP}}}=\mathscr{O}\left(\left\|F\left(y_{k}\right)\right\|^{1+p}\right),\quad\text{ as }k\rightarrow\infty.

The direct application of Proposition 4.3 to the study of the monotonicity of the Newton-Minpert errors leads to a result similar to Theorem 4.1, but which requires some additional conditions.

Theorem 4.4. Assume that the mapping F satisfies Conditions (C1)-(C3), and consider a current approximation y_{c}\neq y^{*} for y^{*}. Let s^{\mathrm{MP}} be a nonzero approximate solution of the linear system F^{\prime}\left(y_{c}\right)s=-F\left(y_{c}\right) provided by Minpert with the initial guess x_{0}=0. Let

y_{+}=y_{c}+s^{\mathrm{MP}}.

If

\eta=\sqrt{\lambda^{\mathrm{MP}}}\left\|\left[\left(s^{\mathrm{MP}}\right)^{t},1\right]^{t}\right\|_{2}/\left\|F\left(y_{c}\right)\right\|_{2}<1,

and if y_{c} is sufficiently close to y^{*}, then

\left\|y_{+}-y^{*}\right\|_{*}\leq t\left\|y_{c}-y^{*}\right\|_{*},\quad\text{ for all }t\in(\eta,1),

where \|y\|_{*}=\left\|F^{\prime}\left(y^{*}\right)y\right\|_{2}.
The following example shows that the Newton-Minpert iterates may stagnate no matter how close to the solution y^{*} we choose y_{0}.

Example 4.3. Consider the mapping F from Example 4.1, which leads us again to solving the linear system Ax=b. The Minpert method with x_{0}=0 and m\in\{1,\ldots,N-1\} yields the diagonal matrix P\in\mathbb{R}^{(m+1)\times(m+1)}, with \operatorname{diag}(P)=\left[h^{2},1,\ldots,1\right] and Q=I_{m+1}, such that, for any h with 0<|h|<1, we get

\lambda_{m+1}^{\mathrm{MP}}=h^{2}\quad\text{and}\quad u_{m+1}=e_{1}.

Then, the unique choice y_{m}^{\mathrm{MP}}=0 follows, and so s_{0,m}^{\mathrm{MP}}=0.
Remark 4.1. This example also shows that, for the linear systems Ax=b above, the Minpert method behaves identically to Gmres; i.e., for x_{0}=0 and \bar{m},m\in\{1,\ldots,N-1\}, it cannot improve the accuracy of the approximate solution, even in the restarted version. The systems with the same matrix A, but with b=he_{1}, |h|>1, constitute examples in which the Minpert method for a linear system leads to uncircumventible breakdowns. The stagnations and the uncircumventible breakdowns of Minpert were theoretically known to be possible, but we had not encountered concrete examples before.

We end the theoretical considerations by applying a result which relates the Gmres and Minpert approximations to the Newton iterations. However, we do not tackle this problem here in its general setting, nor do we analyze the conditions required in this result.

Kasenally and Simoncini proved that the difference between the Gmres and Minpert solutions may be bounded under certain circumstances.

Theorem 4.5. See Ref. 40. Consider the same arbitrary elements $m\in\{1,\ldots,N\}$ and $x_{0}\in\mathbb{R}^{N}$ in the Gmres and Minpert methods for solving the linear system (4). Denote by $\sigma_{m}^{2}$ the smallest eigenvalue of $\bar{H}_{m}^{t}\bar{H}_{m}$, i.e., the smallest squared singular value of $\bar{H}_{m}$, and assume that $\sigma_{m}^{2}\neq\lambda_{m+1}^{\mathrm{MP}}$. Then,

\left\|x_{m}^{\mathrm{MP}}-x_{m}^{\mathrm{GM}}\right\|_{2}\leq\left[\lambda_{m+1}^{\mathrm{MP}}/\left(\sigma_{m}^{2}-\lambda_{m+1}^{\mathrm{MP}}\right)\right]\left\|x_{m}^{\mathrm{GM}}\right\|_{2}.

An auxiliary result proved by these authors shows that the eigenvalues determined at step $m$ interlace in the following way:

\lambda_{i}^{\mathrm{MP}}\geq\sigma_{i}^{2}\geq\lambda_{i+1}^{\mathrm{MP}},\quad\text{ for }i=1,\ldots,m.

The bound from the above theorem is large whenever $\sigma_{m}^{2}$ is close to $\lambda_{m+1}^{\mathrm{MP}}$, and the result cannot be applied when $\sigma_{m}^{2}=\lambda_{m+1}^{\mathrm{MP}}$; such a situation arises, for example, when $\lambda_{m+1}^{\mathrm{MP}}$ is a repeated eigenvalue.
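To make these quantities concrete, here is a rough, self-contained Python sketch of Gmres and Minpert over the same Krylov subspace, for $x_{0}=0$ and no restarts. Minpert is not part of standard libraries; the construction below, via the smallest singular pair of $[\bar{H}_{m},\beta e_{1}]$, follows the backward-error characterization used in this section and is an illustration under simplifying assumptions, not the algorithm of Ref. 40 in full generality:

```python
import numpy as np

def arnoldi(A, b, m):
    """Arnoldi process for K_m(A, b) with x0 = 0; returns V (N x (m+1)),
    Hbar ((m+1) x m), and beta = ||b||_2. No breakdown handling (a sketch)."""
    N = len(b)
    V = np.zeros((N, m + 1))
    Hbar = np.zeros((m + 1, m))
    beta = np.linalg.norm(b)
    V[:, 0] = b / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):                  # modified Gram-Schmidt
            Hbar[i, j] = V[:, i] @ w
            w = w - Hbar[i, j] * V[:, i]
        Hbar[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / Hbar[j + 1, j]
    return V, Hbar, beta

def gmres_minpert(A, b, m):
    """Gmres and Minpert corrections over the same Krylov subspace (x0 = 0),
    together with lambda_{m+1}^MP and sigma_m^2 from Theorem 4.5."""
    V, Hbar, beta = arnoldi(A, b, m)
    be1 = np.zeros(m + 1)
    be1[0] = beta
    y_gm = np.linalg.lstsq(Hbar, be1, rcond=None)[0]   # min ||beta e1 - Hbar y||_2
    C = np.hstack([Hbar, be1[:, None]])                # C = [Hbar, beta e1]
    _, sigma, Vt = np.linalg.svd(C)
    u = Vt[-1]                         # right singular vector of sigma_min(C)
    lam_mp = sigma[-1] ** 2            # smallest eigenvalue of C^t C
    y_mp = -u[:m] / u[m]               # breakdown if u[m] = 0 (cf. Example 4.3)
    sigma_m2 = np.linalg.svd(Hbar, compute_uv=False)[-1] ** 2
    return V[:, :m] @ y_gm, V[:, :m] @ y_mp, lam_mp, sigma_m2

# Check the bound of Theorem 4.5 on a well-conditioned random system.
rng = np.random.default_rng(0)
N, m = 50, 10
A = np.eye(N) + 0.1 * rng.standard_normal((N, N))
b = rng.standard_normal(N)
x_gm, x_mp, lam_mp, sig2 = gmres_minpert(A, b, m)
lhs = np.linalg.norm(x_mp - x_gm)
rhs = lam_mp / (sig2 - lam_mp) * np.linalg.norm(x_gm)
print(lhs, rhs)   # expect lhs <= rhs when sigma_m^2 != lambda_{m+1}^MP
```

On such a well-conditioned example, the printed left-hand side should not exceed the right-hand side, in agreement with Theorem 4.5 and the interlacing property above.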

Theorem 4.5 allows us to determine theoretical circumstances in which Gmres and Minpert have the same asymptotic behavior when used in a converging Newton method.

Theorem 4.6. Assume that Conditions (C1)-(C3) hold and that the sequence $\left(y_{k}^{\mathrm{GM}}\right)_{k\geq 0}$ given by the Newton-Gmres method converges to $y^{*}$, where the elements of the Gmres algorithm are arbitrary at each outer step. Denote by $s_{k}^{\mathrm{GM}}$ the corrections obtained from the linear systems

F^{\prime}\left(y_{k}^{\mathrm{GM}}\right)s=-F\left(y_{k}^{\mathrm{GM}}\right),

and consider the approximate solutions $s_{k}^{\mathrm{MP}}$ of these linear systems obtained with Minpert, which is assumed to use the same initial approximations, the same subspace dimensions, and the same number of restarts as Gmres. If $\sigma_{k}^{2}\neq\lambda_{k}^{\mathrm{MP}}$, for $k=0,1,\ldots$, and $\lambda_{k}^{\mathrm{MP}}=o\left(\sigma_{k}^{2}\right)$, as $k\rightarrow\infty$, then Gmres and Minpert yield asymptotically the same normalized corrections:

\left[1/\left\|s_{k}^{\mathrm{GM}}\right\|_{2}\right]s_{k}^{\mathrm{GM}}-\left[1/\left\|s_{k}^{\mathrm{MP}}\right\|_{2}\right]s_{k}^{\mathrm{MP}}\rightarrow 0,\quad\text{ as }k\rightarrow\infty.

The same result holds with the roles of Gmres and Minpert interchanged.
Proof. The hypotheses of the theorem imply that

\lambda_{k}^{\mathrm{MP}}\rightarrow 0,\quad\text{ as }k\rightarrow\infty.

For the nonrestarted version of the Krylov solvers, the assertion is a straightforward application of Theorem 4.5, while for the restarted version, the conclusion is obtained by an inductive argument on the number of restarts at each outer step.

5. Numerical Examples

We shall consider two test problems from Refs. 34 and 42, which provide nonlinear systems in $\mathbb{R}^{N}$, and apply to them the Newton-Krylov methods studied above. The Krylov solvers are considered in the nonrestarted version, and the initial guess is taken as 0 in both the inner and the outer iterations. We are interested in the behavior of the magnitude of the backward errors of these methods and in verifying the theoretical results; we are therefore not aiming to implement efficient methods for solving the problems under consideration.
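The outer iteration just described can be summarized by the following minimal sketch (dense matrices and illustrative names; the mapping F, its Jacobian J, and inner_solve are assumed user-supplied, the latter being any of the three Krylov solvers, e.g., a wrapper around the gmres_minpert routine sketched after Theorem 4.5):

```python
import numpy as np

def newton_krylov(F, J, y0, inner_solve, m, maxit=150, tol=1e-12):
    """Outer Newton iteration for F(y) = 0; each linearized system
    F'(y_k) s = -F(y_k) is solved approximately by a Krylov method
    started from the initial guess x0 = 0 (nonrestarted version)."""
    y = np.array(y0, dtype=float)
    for k in range(maxit):
        Fk = F(y)
        if np.linalg.norm(Fk) <= tol:
            break
        Jk = J(y)
        s = inner_solve(Jk, -Fk, m)     # Gmres, Gmback, or Minpert correction
        r = Jk @ s + Fk                 # residual r_k of the linear system
        print(k, np.linalg.norm(Fk), np.linalg.norm(r))  # quantities of Tables 1-5
        y = y + s
    return y

# For instance, the Gmres correction of the earlier sketch would be plugged in as
#   inner_solve = lambda A, b, m: gmres_minpert(A, b, m)[0]
```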
5.1. Bratu Problem. Consider the nonlinear partial differential equation

-\Delta u+\alpha u_{x}+\lambda e^{u}=f

over the unit square of $\mathbb{R}^{2}$, with Dirichlet boundary conditions. As mentioned in Refs. 34 and 42, this is a standard problem, a simplified form of which is known as the Bratu problem (see Ref. 43). We have discretized it on a uniform mesh, using 5-point finite differences for the Laplacian and central finite differences for the first-order term, obtaining a system of nonlinear equations of size $N=(n-2)^{2}$, where $n$ is the number of mesh points in each direction. As in Ref. 34, we took $f$ such that the solution of the discretized problem is the constant unity. We have considered

\alpha=10,\quad\lambda=1,\quad N=1024,\quad m=10.
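A minimal sketch of such a discretization is given below, assuming homogeneous values for the padded boundary unknowns and choosing $f$ so that the constant-one interior grid function solves the discrete system exactly; the exact setup used in the experiments may differ in details:

```python
import numpy as np

def bratu_residual(u, n, alpha, lam, f):
    """Discrete residual of -Laplace(u) + alpha*u_x + lam*exp(u) = f on the
    unit square: 5-point differences for the Laplacian, central differences
    for u_x, and zero values for the padded boundary unknowns (a sketch)."""
    M = n - 2                        # interior points per direction, N = M*M
    h = 1.0 / (n - 1)
    U = np.zeros((M + 2, M + 2))     # pad with the (homogeneous) boundary values
    U[1:-1, 1:-1] = u.reshape(M, M)
    lap = (4 * U[1:-1, 1:-1] - U[:-2, 1:-1] - U[2:, 1:-1]
           - U[1:-1, :-2] - U[1:-1, 2:]) / h**2          # -Laplace(u)
    ux = (U[2:, 1:-1] - U[:-2, 1:-1]) / (2 * h)          # central difference
    return (lap + alpha * ux + lam * np.exp(U[1:-1, 1:-1]) - f).ravel()

n, alpha, lam = 34, 10.0, 1.0        # N = (n-2)^2 = 1024, as in the experiments
M = n - 2
# Choose f so that the constant-one grid function solves the discrete system.
ones = np.ones(M * M)
f = bratu_residual(ones, n, alpha, lam, np.zeros((M, M))).reshape(M, M)
print(np.linalg.norm(bratu_residual(ones, n, alpha, lam, f)))   # exactly 0
```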

The runs were made on a PC, using Matlab Version 4.0. The symbol $\|\cdot\|$ denotes either the Euclidean norm (for vectors) or the Frobenius norm (for matrices), and $e_{k}$, $F_{k}$ stand for $y^{*}-y_{k}$ and $F\left(y_{k}\right)$, respectively.

Table 1 contains the results obtained by using the Newton-Gmres method. We have also considered the corrections obtained with Minpert at each step, denoting by $a_{k}$ the norm of the difference $\left(1/\left\|s_{k}^{\mathrm{GM}}\right\|\right)s_{k}^{\mathrm{GM}}-\left(1/\left\|s_{k}^{\mathrm{MP}}\right\|\right)s_{k}^{\mathrm{MP}}$; the corrections $s_{k}^{\mathrm{MP}}$ were computed only for this comparison. It can be seen that the normalized corrections agree more closely as the iterates approach the solution.

Table 2 contains the results obtained by using the Newton-Gmback method. We notice the rather constant magnitudes of $\Delta_{A_{k}}^{\mathrm{GB}}$, which do not approach zero even when the iterates are close to the solution.

In Table 3, we have denoted

b_{k}=\left\|s_{k,10}^{\mathrm{GM}}-s_{k,10}^{\mathrm{MP}}\right\|_{2}/\left\|s_{k,10}^{\mathrm{GM}}\right\|_{2},\quad c_{k}=\lambda_{k,10}^{\mathrm{MP}}/\left(\sigma_{k,10}^{2}-\lambda_{k,10}^{\mathrm{MP}}\right).
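Given the step-$k$ Krylov outputs (for instance those of the gmres_minpert sketch after Theorem 4.5, with the hypothetical names s_gm, s_mp, lam_mp, sigma2), these quantities would be formed as:

```python
import numpy as np

def bk_ck(s_gm, s_mp, lam_mp, sigma2):
    """b_k = ||s^GM - s^MP||_2 / ||s^GM||_2, c_k = lambda^MP / (sigma^2 - lambda^MP)."""
    b = np.linalg.norm(s_gm - s_mp) / np.linalg.norm(s_gm)
    c = lam_mp / (sigma2 - lam_mp)
    return b, c
```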

The bounds $c_{k}$ seem to be tight here for $b_{k}$. As soon as the iterations approach the solution, the inequalities from Theorem 4.5 cease to hold numerically; we believe that this is due to the different types of errors which have appeared in computing the eigenpairs, the Krylov approximations, and the elements $b_{k}$, $c_{k}$.

Table 1. Newton-Gmres applied to the Bratu problem.

| $k$ | $\|e_k\|$ | $\|F_k\|^2$ | $\|r_k\|$ | $a_k$ |
|---|---|---|---|---|
| 0 | 3e+1 | 1e+2 | 9e-1 | 4e-2 |
| 1 | 2e+1 | 9e-1 | 5e-1 | 3e-2 |
| 2 | 1e+1 | 3e-1 | 4e-1 | 2e-2 |
| 3 | 1e+1 | 1e-1 | 2e-1 | 9e-3 |
| 4 | 6e+0 | 8e-2 | 3e-2 | 9e-5 |
| 5 | 1e-1 | 1e-3 | 6e-3 | 2e-4 |
| 6 | 5e-2 | 4e-5 | 2e-3 | 5e-5 |
| 7 | 2e-2 | 5e-6 | 6e-4 | 2e-6 |
| 8 | 1e-2 | 4e-7 | 3e-4 | 2e-6 |
| 9 | 1e-2 | 1e-7 | 1e-4 | 2e-7 |
| 10 | 1e-3 | 2e-8 | 1e-5 | 4e-10 |
| 11 | 3e-4 | 1e-10 | 8e-6 | 1e-9 |
| 12 | 2e-4 | 7e-11 | 2e-6 | 4e-11 |
| 13 | 3e-5 | 8e-12 | 7e-7 | 4e-12 |
| 14 | 1e-5 | 5e-13 | 4e-7 | 3e-12 |
| 15 | 1e-5 | 2e-13 | 2e-7 | 3e-13 |
| 16 | 1e-6 | 4e-14 | 2e-8 | 1e-14 |
| 17 | 5e-7 | 6e-16 | 1e-8 | 5e-14 |
| 18 | 4e-7 | 1e-16 | 6e-9 | 3e-15 |
| 19 | 8e-8 | 4e-17 | 5e-10 | 2e-15 |
| 20 | 9e-9 | 2e-19 | 2e-10 | 4e-14 |
| 21 | 6e-9 | 5e-20 | 9e-11 | 2e-16 |
Table 2. Newton-Gmback applied to the Bratu problem.

| $k$ | $\|e_k\|$ | $\|F_k\|$ | $\|\Delta_{A_k}^{\mathrm{GB}}\|$ | $\|r_k\|$ |
|---|---|---|---|---|
| 0 | 3e+1 | 1e+1 | 6e-2 | 1e+0 |
| 1 | 2e+1 | 1e+0 | 5e-2 | 7e-1 |
| 2 | 1e+1 | 7e-1 | 6e-2 | 5e-1 |
| 3 | 9e+0 | 5e-1 | 6e-2 | 3e-1 |
| 4 | 4e+0 | 3e-1 | 2e-2 | 7e-2 |
| 5 | 1e+0 | 7e-2 | 4e-2 | 6e-2 |
| 6 | 5e-1 | 6e-2 | 2e-2 | 1e-2 |
| 7 | 2e-1 | 1e-2 | 1e-2 | 4e-3 |
| 8 | 1e-2 | 4e-3 | 5e-2 | 5e-4 |
| 9 | 4e-3 | 5e-4 | 4e-2 | 9e-5 |
| 10 | 2e-3 | 9e-5 | 1e-2 | 2e-5 |
| 11 | 1e-4 | 2e-5 | 4e-2 | 4e-6 |
| 12 | 8e-5 | 4e-6 | 5e-2 | 3e-6 |
| 13 | 3e-5 | 3e-6 | 1e-2 | 5e-7 |
| 14 | 8e-6 | 5e-7 | 6e-2 | 3e-7 |
| 15 | 4e-6 | 3e-7 | 2e-2 | 8e-8 |
| 16 | 1e-6 | 8e-8 | 5e-2 | 5e-8 |
| 17 | 5e-7 | 5e-8 | 2e-2 | 1e-8 |
| 18 | 2e-7 | 1e-8 | 3e-2 | 6e-9 |
| 19 | 3e-8 | 6e-9 | 2e-2 | 7e-10 |
| 20 | 1e-8 | 7e-10 | 2e-2 | 3e-10 |
| 21 | 1e-9 | 3e-10 | 1e-2 | 1e-11 |
Table 3. Newton-Minpert applied to the Bratu problem.

| $k$ | $\|e_k\|$ | $\|F_k\|$ | $\sqrt{\lambda_{k,10}^{\mathrm{MP}}}$ | $\|F_k\|^2$ | $\|r_k\|$ | $b_k$ | $c_k$ |
|---|---|---|---|---|---|---|---|
| 0 | 3e+1 | 1e+1 | 6e-2 | 1e+2 | 1e+0 | 8.3594e-2 | 1.0364e-1 |
| 1 | 2e+1 | 1e+0 | 5e-2 | 1e+0 | 7e-1 | 6.6518e-1 | 6.7180e-1 |
| 2 | 1e+1 | 7e-1 | 6e-2 | 5e-1 | 5e-1 | 7.3033e-1 | 7.3995e-1 |
| 3 | 9e+0 | 5e-1 | 6e-2 | 2e-1 | 3e-1 | 6.1259e-1 | 6.2053e-1 |
| 4 | 4e+0 | 3e-1 | 2e-2 | 1e-1 | 7e-2 | 4.4297e-2 | 4.4403e-2 |
| 5 | 1e+0 | 7e-2 | 3e-2 | 5e-3 | 5e-2 | 3.3261e-1 | 3.3456e-1 |
| 6 | 8e-1 | 5e-2 | 1e-2 | 2e-3 | 1e-2 | 3.0693e-2 | 3.0808e-2 |
| 7 | 3e-1 | 1e-2 | 5e-3 | 2e-4 | 5e-3 | 1.5905e-2 | 1.6004e-2 |
| 8 | 6e-2 | 5e-3 | 1e-3 | 3e-5 | 1e-3 | 2.0734e-4 | 2.0883e-4 |
| 9 | 2e-2 | 1e-3 | 7e-4 | 2e-6 | 7e-4 | 1.1074e-4 | 1.1233e-4 |
| 10 | 1e-2 | 7e-4 | 1e-4 | 5e-7 | 1e-4 | 1.0523e-5 | 1.0526e-5 |
| 11 | 2e-3 | 1e-4 | 7e-5 | 3e-8 | 7e-5 | 5.2230e-7 | 5.3930e-7 |
| 12 | 1e-3 | 7e-5 | 6e-6 | 5e-9 | 6e-6 | 2.6605e-8 | 2.6605e-8 |
| 13 | 4e-5 | 6e-6 | 1e-6 | 4e-11 | 1e-6 | 3.1452e-11 | 3.2821e-11 |
| 14 | 2e-5 | 1e-6 | 6e-7 | 1e-12 | 6e-7 | 1.6474e-10 | 1.6433e-10 |
| 15 | 9e-6 | 6e-7 | 1e-7 | 3e-13 | 1e-7 | 4.9880e-12 | 5.0888e-12 |
| 16 | 2e-6 | 1e-7 | 8e-8 | 2e-14 | 8e-8 | 1.6094e-12 | 1.6504e-12 |
| 17 | 1e-6 | 8e-8 | 1e-8 | 6e-15 | 1e-8 | 1.6952e-13 | 8.3076e-14 |
| 18 | 1e-7 | 1e-8 | 4e-9 | 1e-16 | 4e-9 | 1.2156e-14 | 1.7097e-15 |
| 19 | 1e-7 | 4e-9 | 1e-9 | 2e-17 | 1e-9 | 4.7003e-13 | 9.6014e-16 |
| 20 | 5e-9 | 1e-9 | 2e-11 | 1e-18 | 2e-11 | 2.2673e-14 | 2.2768e-20 |

5.2. Driven Cavity Flow Problem. This is a classical problem in incompressible fluid flow. It has the following equations in the stream function-vorticity formulation:

\nu\Delta\omega+\psi_{y}\omega_{x}-\psi_{x}\omega_{y}=0,\quad\text{ in }\Omega, (10)
-\Delta\psi=\omega,\quad\text{ in }\Omega, (11)
\psi=0,\quad\text{ on }\partial\Omega,
(\partial\psi/\partial n)(x,y)\big|_{\partial\Omega}=\begin{cases}1,&\text{ if }y=1,\\ 0,&\text{ if }0\leq y<1,\end{cases}

where $\Omega$ denotes again the interior of the unit square of $\mathbb{R}^{2}$, and the viscosity $\nu$ is the reciprocal of the Reynolds number $Re$. In terms of $\psi$ alone, (10) and (11) are replaced by

\nu\Delta^{2}\psi+\psi_{y}(\Delta\psi)_{x}-\psi_{x}(\Delta\psi)_{y}=0,\quad\text{ in }\Omega.

This equation was again discretized on a uniform mesh. We have considered $n=25$, choosing the boundary conditions to be incorporated in the nonlinear system. We have used the symbolic facilities offered by Maple V Release 3.0 in order to determine the discretized equations. The Reynolds number was taken small ($Re=10$), and the subspace dimensions of the Krylov methods were taken large ($m=50$), since the nonrestarted versions of these methods were not efficient for this problem.

Tables 4 and 5 contain the results of the three Newton-Krylov methods applied to this problem. We notice again that, close to the solution, the computed elements $b_{k}$ and $c_{k}$ do not satisfy the inequalities $b_{k}\leq c_{k}$.

Table 4. Newton-Gmres (left) and Newton-Gmback (right) applied to the driven cavity problem.

| $k$ | $\|F_k\|$ (GM) | $\|F_k\|^2$ | $\|r_k\|$ | $a_k$ | $\|F_k\|$ (GB) | $\|\Delta_{A_k}^{\mathrm{GB}}\|$ | $\|F_k\|^2$ | $\|r_k\|$ |
|---|---|---|---|---|---|---|---|---|
| 0 | 4e+0 | 2e+1 | 9e-1 | 2e-2 | 4e+0 | 6e-2 | 2e+1 | 9e-1 |
| 1 | 8e+0 | 7e+1 | 1e+0 | 7e-2 | 9e+0 | 3e-1 | 9e+1 | 1e+0 |
| 2 | 4e+0 | 2e+1 | 8e-1 | 4e-2 | 6e+0 | 3e-1 | 3e+1 | 1e+0 |
| 3 | 1e+0 | 3e+0 | 6e-1 | 4e-2 | 1e+0 | 3e-1 | 3e+0 | 9e-1 |
| 4 | 7e-1 | 5e-1 | 5e-1 | 1e-2 | 9e-1 | 2e-1 | 8e-1 | 8e-1 |
| ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ |
| 131 | 9e-5 | 8e-9 | 8e-5 | 6e-9 | 4e-5 | 6e-1 | 1e-9 | 2e-5 |
| 132 | 8e-5 | 7e-9 | 8e-5 | 7e-9 | 2e-5 | 4e-1 | 5e-10 | 3e-5 |
| 133 | 8e-5 | 6e-9 | 7e-5 | 5e-9 | 3e-5 | 6e-1 | 1e-9 | 2e-5 |
| 134 | 7e-5 | 6e-9 | 7e-5 | 5e-9 | 2e-5 | 4e-1 | 4e-10 | 3e-5 |
| 135 | 7e-5 | 5e-9 | 6e-5 | 4e-9 | 3e-5 | 6e-1 | 1e-9 | 1e-5 |
Table 5. Newton-Minpert applied to the driven cavity problem.

| $k$ | $\|F_k\|$ | $\sqrt{\lambda_{k,50}^{\mathrm{MP}}}$ | $\|F_k\|^2$ | $\|r_k\|$ | $b_k$ | $c_k$ |
|---|---|---|---|---|---|---|
| 0 | 4e+0 | 6e-2 | 2e+1 | 9e-1 | 7.0262e-2 | 7.8592e-2 |
| 1 | 9e+0 | 3e-1 | 9e+1 | 1e+0 | 3.3821e-1 | 3.5336e-1 |
| 2 | 6e+0 | 3e-1 | 3e+1 | 1e+0 | 8.0633e-1 | 8.2988e-1 |
| 3 | 1e+0 | 2e-1 | 3e+0 | 8e-1 | 1.1452e+0 | 1.1887e+0 |
| 4 | 8e-1 | 2e-1 | 7e-1 | 7e-1 | 1.6037e+0 | 1.6438e+0 |
| ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ | ⋯ |
| 131 | 1e-5 | 1e-5 | 2e-10 | 1e-5 | 4.0286e-8 | 4.0292e-8 |
| 132 | 1e-5 | 1e-5 | 2e-10 | 1e-5 | 3.2414e-9 | 3.2489e-9 |
| 133 | 1e-5 | 1e-5 | 2e-10 | 1e-5 | 3.6246e-8 | 3.6241e-8 |
| 134 | 1e-5 | 1e-5 | 1e-10 | 1e-5 | 2.4452e-9 | 2.4576e-9 |
| 135 | 1e-5 | 1e-5 | 1e-10 | 1e-5 | 2.1203e-8 | 2.1201e-8 |

In Fig. 1, we have plotted on a semilog scale the evolution of $\left\|F_{k}\right\|$ for the three methods over the first 150 steps. Notice the monotone behavior of Newton-Gmres and Newton-Minpert starting from certain steps, and also the nonmonotone behavior of Newton-Gmback.

Fig. 1. Newton-Gmres (dash-dot line), Newton-Gmback (solid line), and Newton-Minpert (dotted line) applied to the driven cavity problem.

6. Conclusions

The local superlinear convergence and the local convergence with $q$-orders $1+p$, $p\in(0,1]$, of the Newton methods were characterized in a more natural setting, which assumes that the Jacobians are perturbed, that the evaluations of $F$ are performed approximately, and that the resulting linear systems are solved inexactly. The results of Dembo, Eisenstat, and Steihaug allowed us to extend their local convergence analysis to this setting.

The sufficient conditions for local convergence with different convergence orders show that the perturbations in the Jacobians may be allowed to have much greater magnitudes than those in the function evaluations and than the magnitudes of the residuals; this may be explained by the fact that the right-hand sides $-F\left(x_{k}\right)$ of the linear systems from the Newton method tend to zero, while the matrices $F^{\prime}\left(x_{k}\right)$ tend to $F^{\prime}\left(x^{*}\right)$, assumed here to be nonsingular. It follows that the methods for solving the linear systems from the Newton method must be analyzed in this respect, and also that special care must be taken when the function evaluations may be affected by significant errors. The greater sensitivity of the right-hand sides $-F\left(x_{k}\right)$ has been known for a long time (see, for example, Refs. 11 and 21 and the references therein), so that the evaluation of these vectors in double or extended precision is already standard.

The existing results for the (INM) iterates allowed us to show that the Newton-Gmres iterates have a monotonicity property. We were able to prove that the Newton-Minpert iterates share this property only under some additional conditions, whereas a concrete example showed that the Newton-Gmback iterates do not generally share it. The convergence orders of the Newton-Gmback and Newton-Minpert methods can be controlled in terms of the magnitude of the backward errors of the approximate steps. The theoretical results were confirmed by the numerical examples.

Many of the existing results for different Newton-type methods may be reconsidered in the (IPNM) setting. For example, the local convergence orders of some finite-difference Newton-Krylov methods studied by Brown (Ref. 26) may now be analyzed in the (IPNM) setting. On the other hand, new approaches can be developed as well; for instance, we mention the local convergence and acceleration techniques for the (IPNM) in the case of a singular Jacobian at the solution. Other results and directions of research are included in our PhD Thesis (Ref. 44).

References

1. Dembo, R. S., Eisenstat, S. C., and Steihaug, T., Inexact Newton Methods, SIAM Journal on Numerical Analysis, Vol. 19, pp. 400-408, 1982.

2. Ortega, J. M., and Rheinboldt, W. C., Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, NY, 1970.

3. Kantorovich, L. V., and Akilov, G. P., Functional Analysis, Editura Ştiinţifică şi Enciclopedică, Bucharest, Romania, 1986 (in Romanian).

4. Păvăloiu, I., Introduction to the Theory of Approximating the Solutions of Equations, Editura Dacia, Cluj-Napoca, Romania, 1976 (in Romanian).

5. Măruşter, Şt., Numerical Methods for Solving Nonlinear Equations, Editura Tehnică, Bucharest, Romania, 1981 (in Romanian).

6. Dennis, J. E., Jr., and Schnabel, R. B., Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Series in Computational Mathematics, Prentice-Hall, Englewood Cliffs, New Jersey, 1983.

7. Potra, F. A., and Pták, V., Nondiscrete Induction and Iterative Processes, Pitman, London, England, 1984.

8. Argyros, I., and Szidarovszky, F., The Theory and Applications of Iteration Methods, CRC Press, Boca Raton, Florida, 1993.

9. Kelley, C. T., Iterative Methods for Linear and Nonlinear Equations, SIAM, Philadelphia, Pennsylvania, 1995.

10. Rheinboldt, W. C., Methods for Solving Systems of Nonlinear Equations, SIAM, Philadelphia, Pennsylvania, 1998.

11. Dennis, J. E., Jr., and Walker, H. F., Inaccuracy in Quasi-Newton Methods: Local Improvement Theorems, Mathematical Programming Study, Vol. 22, pp. 70-85, 1984.

12. Potra, F. A., and Pták, V., Sharp Error Bounds for Newton's Process, Numerische Mathematik, Vol. 34, pp. 63-72, 1980.

13. Deuflhard, P., and Potra, F. A., Asymptotic Mesh Independence of Newton-Galerkin Methods via a Refined Mysovskii Theorem, SIAM Journal on Numerical Analysis, Vol. 29, pp. 1395-1412, 1992.

14. Argyros, I., Concerning the Convergence of Inexact Newton Methods, Journal of Computational and Applied Mathematics, Vol. 79, pp. 235-247, 1997.

15. Argyros, I., On a New Newton-Mysovskii-Type Theorem with Applications to Inexact Newton-Like Methods and Their Discretization, IMA Journal of Numerical Analysis, Vol. 18, pp. 37-56, 1998.

16. Sherman, A. H., On Newton-Iterative Methods for the Solution of Systems of Nonlinear Equations, SIAM Journal on Numerical Analysis, Vol. 15, pp. 755-771, 1978.

17. Dennis, J. E., Jr., and Moré, J. J., A Characterization of Superlinear Convergence and Its Application to Quasi-Newton Methods, Mathematics of Computation, Vol. 28, pp. 549-560, 1974.

18. Dennis, J. E., Jr., and Moré, J. J., Quasi-Newton Methods, Motivation and Theory, SIAM Review, Vol. 19, pp. 46-89, 1977.

19. Eisenstat, S. C., and Walker, H. F., Choosing the Forcing Terms in an Inexact Newton Method, SIAM Journal on Scientific Computing, Vol. 17, pp. 16-32, 1996.

20. Eisenstat, S. C., and Walker, H. F., Globally Convergent Inexact Newton Methods, SIAM Journal on Optimization, Vol. 4, pp. 393-422, 1994.

21. Ypma, T. J., The Effect of Rounding Errors on Newton-Like Methods, IMA Journal of Numerical Analysis, Vol. 3, pp. 109-118, 1983.

22. Martinez, H. J., Parada, Z., and Tapia, R. A., On the Characterization of Q-Superlinear Convergence of Quasi-Newton Interior-Point Methods for Nonlinear Programming, Boletín de la Sociedad Matemática Mexicana, Vol. 1, pp. 137-148, 1995.

23. Ypma, T. J., Local Convergence of Inexact Newton Methods, SIAM Journal on Numerical Analysis, Vol. 21, pp. 583-590, 1983.

24. Cores, D., and Tapia, R. A., Perturbation Lemma for the Newton Method with Application to the SQP Newton Method, Journal of Optimization Theory and Applications, Vol. 97, pp. 271-280, 1998.

25. Potra, F. A., On Q-Order and R-Order of Convergence, Journal of Optimization Theory and Applications, Vol. 63, pp. 415-431, 1989.

26. Brown, P. N., A Local Convergence Theory for Combined Inexact-Newton/Finite-Difference Projection Methods, SIAM Journal on Numerical Analysis, Vol. 24, pp. 407-434, 1987.

27. Cătinaş, E., A Note on Inexact Secant Methods, Revue d'Analyse Numérique et de Théorie de l'Approximation, Vol. 25, pp. 33-41, 1996.

28. Królikowska, M., Nonlinear Mappings with an Almost Sparse Jacobian Matrix, Annales Universitatis Mariae Curie-Skłodowska, Lublin, Poland, Vol. 49A, pp. 145-157, 1995.

29. Rigal, J. L., and Gaches, J., On the Compatibility of a Given Solution with the Data of a Linear System, Journal of the Association for Computing Machinery, Vol. 14, pp. 543-548, 1967.

30. Saad, Y., and Schultz, M. H., GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems, SIAM Journal on Scientific and Statistical Computing, Vol. 7, pp. 856-869, 1986.

31. Saad, Y., Krylov Subspace Methods for Solving Large Unsymmetric Linear Systems, Mathematics of Computation, Vol. 37, pp. 105-126, 1981.

32. Kasenally, E. M., GMBACK: A Generalized Minimum Backward Error Algorithm for Nonsymmetric Linear Systems, SIAM Journal on Scientific Computing, Vol. 16, pp. 698-719, 1995.

33. Saad, Y., Iterative Methods for Sparse Linear Systems, PWS Publishing Company, Boston, Massachusetts, 1996.

34. Brown, P. N., and Saad, Y., Hybrid Krylov Methods for Nonlinear Systems of Equations, SIAM Journal on Scientific and Statistical Computing, Vol. 11, pp. 450-481, 1990.

35. Turner, K., and Walker, H. F., Efficient High Accuracy Solutions with GMRES(m), SIAM Journal on Scientific and Statistical Computing, Vol. 13, pp. 815-825, 1992.

36. Brown, P. N., and Hindmarsh, A. C., Reduced Storage Matrix Methods in Stiff ODE Systems, Applied Mathematics and Computation, Vol. 31, pp. 40-91, 1989.

37. Greenbaum, A., Pták, V., and Strakoš, Z., Any Nonincreasing Convergence Curve is Possible for GMRES, SIAM Journal on Matrix Analysis and Applications, Vol. 17, pp. 465-469, 1996.

38. Brown, P. N., and Saad, Y., Convergence Theory of Nonlinear Newton-Krylov Algorithms, SIAM Journal on Optimization, Vol. 4, pp. 297-330, 1994.

39. Brown, P. N., A Theoretical Comparison of the Arnoldi and GMRES Algorithms, SIAM Journal on Scientific and Statistical Computing, Vol. 12, pp. 58-78, 1991.

40. Kasenally, E. M., and Simoncini, V., Analysis of a Minimum Perturbation Algorithm for Nonsymmetric Linear Systems, SIAM Journal on Numerical Analysis, Vol. 34, pp. 48-66, 1997.

41. Stewart, G. W., and Sun, J. G., Matrix Perturbation Theory, Academic Press, New York, NY, 1990.

42. Moré, J. J., A Collection of Nonlinear Model Problems, Computational Solutions of Nonlinear Systems of Equations, Lectures in Applied Mathematics, Edited by E. L. Allgower and K. Georg, American Mathematical Society, Providence, Rhode Island, Vol. 26, pp. 723-762, 1990.

43. Glowinski, R., Keller, H. B., and Reinhart, L., Continuation-Conjugate Gradient Methods for the Least-Squares Solution of Nonlinear Boundary-Value Problems, SIAM Journal on Scientific and Statistical Computing, Vol. 6, pp. 793-832, 1985.

44. Cătinaş, E., Newton and Newton-Krylov Methods for Solving Nonlinear Systems in $\mathbb{R}^{n}$, PhD Thesis, Babeş-Bolyai University of Cluj-Napoca, Cluj-Napoca, Romania, 1999.
