
Relationship between the inexact Newton method and the continuous analogy of Newton’s method

T. Zhanlav, O. Chuluunbaatar§, G. Ankhbayar

January 11, 2012

School of Mathematics and Computer Science, National University of Mongolia, Ulan-Bator, Mongolia, e-mail: tzhanlav@yahoo.com, chuka@jinr.ru, anxaa_w@yahoo.com.

§Joint Institute for Nuclear Research, Dubna, 141980 Moscow region, Russia.

In this paper we propose two new strategies to determine the forcing terms that allow one to improve the efficiency and robustness of the inexact Newton method. The choices are based on the relationship between the inexact Newton method and the continuous analogy of Newton’s method. With the new forcing terms, the inexact Newton method is locally Q-superlinearly and quadratically convergent. Numerical results are presented to support the effectiveness of the new forcing terms.

MSC. 97N40

Keywords. inexact Newton method, continuous analogy of Newton’s method

1 Introduction

Consider a nonlinear system

F(x)=0,\eqno(1)

where $F : D \subset \mathbb{R}^n \to \mathbb{R}^n$ is a continuously differentiable nonlinear mapping. Among the many methods for solving the nonlinear equations (1), the Newton method is perhaps the most elementary, popular and important [1, 2, 3, 4, 5, 6]. One of the advantages of the method is its local quadratic convergence. However, it is computationally expensive, particularly when the size of the problem is very large, because the Newton equations

F(x_k) + F'(x_k)s_k = 0 \eqno(2)

should be solved at each iteration step. To reduce the computational cost of the Newton method, Dembo, Eisenstat and Steihaug [7] proposed the inexact Newton (IN) method,

F'(x_k)\bar{s}_k = -F(x_k) + r_k,
x_{k+1} = x_k + \bar{s}_k, \quad k = 0, 1, \dots, \quad x_0 \in D. \eqno(3)

The terms $r_k \in \mathbb{R}^n$ are the residuals of the approximate solutions $\bar{s}_k$ of the Newton equation (2); that is, the Newton equation (2) is solved inexactly to obtain a step $\bar{s}_k$ such that

\|r_k\| = \|F(x_k) + F'(x_k)\bar{s}_k\| \leq \eta_k \|F(x_k)\|, \eqno(4)

where $\eta_k \in [0, 1)$ is the forcing term. In each iteration step of the IN method, a real number $\eta_k$ is chosen first, and then an IN step $\bar{s}_k$ is obtained by solving the Newton equations approximately with an efficient iterative solver for systems of linear equations. The forcing terms play an important role both in reducing the residuals of the Newton equations and in increasing the accuracy of the method. In particular, if $\eta_k = 0$ for all $k$, then the IN method reduces to the Newton method. The IN method, like the Newton method, is locally convergent.
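To make the interplay between the forcing term and the inner linear solve concrete, the following minimal sketch implements the IN iteration (3)–(4) in Python/NumPy. The inner solver here is a simple steepest-descent iteration on the normal equations, chosen only to keep the sketch self-contained; it is not the solver used in this paper (the experiments in Section 4 use GMRES), and the names `inexact_solve` and `eta_rule` are illustrative, not from the paper.

```python
import numpy as np

def inexact_solve(J, f, eta, max_iter=1000):
    # Approximately solve J s = -f until ||J s + f|| <= eta * ||f||.
    # Steepest descent on 0.5 * ||J s + f||^2 keeps the sketch dependency-free;
    # a Krylov solver such as GMRES would be used in practice.
    s = np.zeros_like(f)
    tol = eta * np.linalg.norm(f)
    for _ in range(max_iter):
        r = J @ s + f                     # residual r_k of the Newton equation (2)
        if np.linalg.norm(r) <= tol:      # inexactness condition (4)
            break
        g = J.T @ r                       # gradient of the least-squares functional
        Jg = J @ g
        s -= (g @ g) / (Jg @ Jg) * g      # exact line search along -g
    return s

def inexact_newton(F, J, x0, eta_rule, eps=1e-12, max_iter=100):
    # Inexact Newton iteration (3): x_{k+1} = x_k + s_k, with s_k from (4).
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        fx = F(x)
        if np.linalg.norm(fx) <= eps:
            break
        eta = eta_rule(k, np.linalg.norm(fx))  # forcing term eta_k in [0, 1)
        x = x + inexact_solve(J(x), fx, eta)
    return x
```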

Theorem 1.1

[7]. Assume that the IN iterates converge to $x^*$. Then the convergence is superlinear if and only if

\|r_k\| = o(\|F(x_k)\|) \quad \text{as} \quad k \to \infty. \eqno(5)
Theorem 1.2

[7]. Given $\eta_k \leq \eta < t < 1$, $k = 0, 1, \dots$, there exists $\varepsilon > 0$ such that for any initial approximation $x_0$ with $\|x_0 - x^*\| \leq \varepsilon$, the sequence of IN iterates $\{x_k\}$ satisfying (4) converges to $x^*$. Moreover, the convergence is linear in the sense that

\|x_{k+1} - x^*\|_* \leq t \|x_k - x^*\|_*, \quad k = 0, 1, \dots, \eqno(6)

where $\|y\|_* = \|F'(x^*)y\|$.

Theorem 1.3

[7]. Assume that $F : \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable, $x^* \in \mathbb{R}^n$ is such that $F(x^*) = 0$, and $F'(x^*)$ is nonsingular. If the sequence $\{x_k\}$ generated by the IN iterates converges to $x^*$, then
(1) $\{x_k\}$ converges to $x^*$ superlinearly when $\eta_k \to 0$;
(2) $\{x_k\}$ converges to $x^*$ quadratically if $\eta_k = O(\|F(x_k)\|)$ and $F'$ is Lipschitz continuous at $x^*$.

In [8] the condition

r_k = F(x_k) + F'(x_k)\bar{s}_k = O(\|F(x_k)\|^2) \quad \text{as} \quad k \to \infty \eqno(7)

is proposed which characterizes the quadratic convergence of the IN iterations.

In [8] it is shown that the IN, inexact perturbed and quasi-Newton methods are equivalent models. At present there are several strategies for choosing the forcing terms [9, 10, 11].

2 The continuous analogy of Newton’s method (CANM)

One of the modifications of Newton's method is the well-known CANM, or damped Newton method [1, 4, 5, 6]

F'(x_k)v_k = -F(x_k), \quad x_{k+1} = x_k + \tau_k v_k, \quad k = 0, 1, \dots, \eqno(8)

where $\tau_k > 0$ is an iteration parameter. A suitable choice of this parameter allows one to speed up the convergence and to enlarge the convergence domain. If $\tau_k \equiv 1$ for all $k$, then the CANM reduces to Newton's method. There exists a close relationship between the CANM and the IN iterates. Indeed, the CANM (8) can be considered as the IN iteration (3) with the residual

r_k = F'(x_k)\bar{s}_k + F(x_k) = (1 - \tau_k)F(x_k). \eqno(9)

From here we get

\|r_k\| = \eta_k \|F(x_k)\| \eqno(10)

with

\eta_k = |1 - \tau_k|. \eqno(11)
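The identities (9)–(11) are easy to verify numerically. The following small check (a sketch using an arbitrary two-dimensional test map of our own choosing, not from the paper) confirms that the damped Newton step leaves exactly the residual $(1 - \tau_k)F(x_k)$:

```python
import numpy as np

# A simple 2D test map and its Jacobian (illustrative only).
F = lambda x: np.array([x[0]**2 + x[1] - 3.0, x[0] - x[1]**2])
J = lambda x: np.array([[2.0 * x[0], 1.0],
                        [1.0, -2.0 * x[1]]])

x, tau = np.array([2.0, 0.5]), 0.7
v = np.linalg.solve(J(x), -F(x))   # exact Newton direction, eq. (8)
s = tau * v                        # damped (CANM) step
r = J(x) @ s + F(x)                # residual of the Newton equation (2)
print(np.allclose(r, (1.0 - tau) * F(x)))   # True: identity (9)
print(np.isclose(np.linalg.norm(r),
                 abs(1.0 - tau) * np.linalg.norm(F(x))))  # True: (10)-(11)
```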

We have the following local and semilocal convergence results:

Theorem 2.1

There exists $\varepsilon > 0$ such that for any initial approximation $x_0$ with $\|x_0 - x^*\| \leq \varepsilon$, the sequence generated by (8) with parameter $\tau_k \in (0, 2)$ converges to $x^*$.

Proof. As mentioned above, the CANM is equivalent to the IN iterates with residual (9), which satisfies inequality (4) with forcing term $\eta_k \in [0, 1)$ under the condition $\tau_k \in (0, 2)$. Then by Theorem 1.2 the sequence $\{x_k\}$ generated by (3) or by (8) converges to $x^*$. $\square$

Let $D_0$ be a convex set with $D_0 \subset D$, and suppose $(F'(x))^{-1}$ exists for all $x \in D_0$.

Theorem 2.2

We assume that

  • (i) $\|F''(x)\| \leq M$, $x \in D_0$,

  • (ii) $\|F'(x_0)^{-1}\| \leq \beta$,

  • (iii) $\|F'(x_0)^{-1}F(x_0)\| \leq \eta$, $a_0 = M\beta\eta$,

  • (iv) $M\|F'(x_k)^{-1}\|\,\|F'(x_k)^{-1}F(x_k)\| \leq a_k < 2$, $k = 0, 1, \dots$,

and

0 < \tau_k < \frac{-1 + \sqrt{1 + 4a_k}}{a_k}. \eqno(12)

Then the sequence $\{x_k\}$ defined by (8) and starting at $x_0 \in D_0$ converges to a solution $x^*$ of (1).

Proof. The Taylor expansion gives

F(x_{k+1}) = (1 - \tau_k)F(x_k) + \frac{F''(\xi_k)}{2}\tau_k^2\left(F'(x_k)^{-1}F(x_k)\right)^2, \eqno(13)

where $\xi_k = \theta x_k + (1 - \theta)x_{k+1}$, $\theta \in (0, 1)$. Using assumption (iv), from (13) it follows that

\|F(x_{k+1})\| < \left(|1 - \tau_k| + \frac{a_k}{2}\tau_k^2\right)\|F(x_k)\| < \|F(x_k)\|

under condition (12), i.e., $\{\|F(x_k)\|\}$ is a decreasing sequence provided the iteration parameter $\tau_k$ is chosen in the interval $\left(0, \frac{-1 + \sqrt{1 + 4a_k}}{a_k}\right)$. $\square$

Remark 2.1

It is easy to show that

1 < \frac{-1 + \sqrt{1 + 4a_k}}{a_k} < 2, \quad \text{when} \quad 0 < a_k < 2.

This means that $\tau_k \in (0, 2)$. $\square$
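A quick numerical check of this bound (an illustrative sketch, not part of the original argument) confirms the claim on a fine grid of $a_k$ values:

```python
import numpy as np

# Check Remark 2.1: 1 < (-1 + sqrt(1 + 4a)) / a < 2 for all 0 < a < 2.
a = np.linspace(1e-6, 2.0 - 1e-6, 100000)
bound = (-1.0 + np.sqrt(1.0 + 4.0 * a)) / a
print(bound.min() > 1.0, bound.max() < 2.0)   # True True
```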

Remark 2.2

Assume that the sequence $\{x_k\}$ generated by the CANM converges to $x^*$. Then

  • $\{x_k\}$ converges to $x^*$ superlinearly when $\tau_k \to 1$,

  • $\{x_k\}$ converges to $x^*$ quadratically if

|1 - \tau_k| = O(\|F(x_k)\|) \quad \text{or} \quad |1 - \tau_k| = O(\|F(x_{k-1})\|), \eqno(14)

because of Theorem 1.3. $\square$

3 Some choices of iteration parameters

From (13) it follows that

\frac{\big|\,\|F(x_k)\| - |1 - \tau_{k-1}|\,\|F(x_{k-1})\|\,\big|}{\|F(x_{k-1})\|} = O(\|F(x_{k-1})\|)

and

\frac{\big|\,\|F(x_k)\| - |1 - \tau_{k-1}|\,\|F(x_{k-1})\|\,\big|}{\|F(x_k)\|} = O(\|F(x_{k-1})\|).

Then, by Remark 2.2, one can choose $\tau_k$ such that

|1 - \tau_k| = \frac{\big|\,\|F(x_k)\| - |1 - \tau_{k-1}|\,\|F(x_{k-1})\|\,\big|}{\|F(x_{k-1})\|} \eqno(16a)

or

|1 - \tau_k| = \frac{\big|\,\|F(x_k)\| - |1 - \tau_{k-1}|\,\|F(x_{k-1})\|\,\big|}{\|F(x_k)\|}, \eqno(16b)

which yields the quadratic convergence of the CANM. The formulas (16) can be rewritten in terms of the forcing term $\eta_k$ as

\eta_k = \frac{|\alpha_k \eta_{k-1} - 1|}{\alpha_k} \eqno(17a)

and respectively

\eta_k = |\alpha_k \eta_{k-1} - 1|, \eqno(17b)

where

\alpha_k = \frac{\|F(x_{k-1})\|}{\|F(x_k)\|}. \eqno(18)

Assume that

\|F(x_k)\| \leq \eta_{k-1}\|F(x_{k-1})\|, \quad 0 \leq \eta_{k-1} < 1. \eqno(19)

Then $\alpha_k > 1$ and $\alpha_k \eta_{k-1} \geq 1$. As a consequence, the minimum of the possible choices (17) is

\eta_k = \frac{\alpha_k \eta_{k-1} - 1}{\alpha_k}.

If the inequality (19) does not hold, then (17) gives us $\eta_k = 1 - \eta_{k-1}\alpha_k$. Thus we have

\eta_k = \begin{cases} 1 - \eta_{k-1}\alpha_k, & \text{when} \ \eta_{k-1}\alpha_k < 1, \\[4pt] \dfrac{\eta_{k-1}\alpha_k - 1}{\alpha_k}, & \text{when} \ \eta_{k-1}\alpha_k \geq 1. \end{cases} \eqno(20)

The second choice in (20) allows us to decrease $\eta_k$, i.e., $0 < \eta_k < \eta_{k-1}$, while the first choice in (20) implies only that $0 < \eta_k < 1$. In both cases we have $0 < \eta_k < 1$. According to (4), the inequality (19) may hold, and thereby the second choice in (20) allows us to decrease $\eta_k$, i.e., $\eta_k \to 0$ as $k \to \infty$. In terms of $\tau_k$ we have the following choice

|1 - \tau_k| = \eta_k. \eqno(21)

From this it follows that if $0 < \eta_k < 1$ and $\eta_k \to 0$ as $k \to \infty$, then $0 < \tau_k < 2$ and $\tau_k \to 1$ as $k \to \infty$. Thus, the choice of the iteration parameter given by (20), (21) can be used in the CANM.
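In code, the update (18)–(21) amounts to a few lines. The sketch below (function names are ours, not the paper's) maps the previous forcing term and the last two residual norms to $\eta_k$, and derives an admissible $\tau_k$ from (21):

```python
def forcing_term_20(eta_prev, normF_prev, normF):
    # Strategy (20): eta_k from eta_{k-1}, ||F(x_{k-1})|| and ||F(x_k)||.
    alpha = normF_prev / normF                # alpha_k, eq. (18)
    if eta_prev * alpha < 1.0:
        return 1.0 - eta_prev * alpha         # first case of (20)
    return (eta_prev * alpha - 1.0) / alpha   # second case of (20)

def tau_from_eta(eta):
    # Any tau_k with |1 - tau_k| = eta_k satisfies (21); taking
    # tau_k = 1 + eta_k (or 1 - eta_k) keeps tau_k in (0, 2).
    return 1.0 + eta
```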

In [6] another choice for the CANM was proposed:

\tau_k = \frac{2}{1 + \sqrt{1 + 2b\|F(x_k)\|}}. \eqno(22)

According to (21), the formula (22) in terms of $\eta_k$ reads as

\eta_k = \frac{\sqrt{1 + 2b\|F(x_k)\|} - 1}{\sqrt{1 + 2b\|F(x_k)\|} + 1}, \eqno(23)

which can be used in the IN method as a forcing term. From (20) we see that the inexact Newton method with the forcing term given by (20) is locally Q-superlinearly convergent, while the IN method with the forcing term given by (23) converges quadratically because $\eta_k = O(\|F(x_k)\|)$ (see Theorem 1.3).
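Strategy (23) is even simpler to implement; a minimal sketch, with the parameter $b$ exposed as in Table 2 (where $b = 0.1$):

```python
import math

def forcing_term_23(normF, b=0.1):
    # Strategy (23): eta_k = O(||F(x_k)||) as the residual norm tends to zero,
    # which gives quadratic convergence by Theorem 1.3.
    s = math.sqrt(1.0 + 2.0 * b * normF)
    return (s - 1.0) / (s + 1.0)
```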

4 Numerical experiments

In this section we present numerical examples to demonstrate the efficiency of the new strategies for choosing the forcing terms. We compare the new strategies with some known ones in terms of their numerical behavior.

We use three test problems that are typical systems of nonlinear equations in the literature, each with its own name and a standard initial guess, denoted $x_s$. The problems are listed as follows [11]:

Problem 4.1 (Generalized function of Rosenbrock)

f_1(x) = -4c(x_2 - x_1^2)x_1 - 2(1 - x_1),
f_i(x) = 2c(x_i - x_{i-1}^2) - 4c(x_{i+1} - x_i^2)x_i - 2(1 - x_i), \quad i = 2, 3, \dots, n-1,
f_n(x) = 2c(x_n - x_{n-1}^2),

with $c = 2$ and $x_s = (1.2, 1.2, \dots, 1.2)^T$.

Problem 4.2 (Tridiagonal system)

f_1(x) = 4(x_1 - x_2^2),
f_i(x) = 8x_i(x_i^2 - x_{i-1}) - 2(1 - x_i) + 4(x_i - x_{i+1}^2), \quad i = 2, 3, \dots, n-1,
f_n(x) = 8x_n(x_n^2 - x_{n-1}) - 2(1 - x_n),

with $x_s = (12, 12, \dots, 12)^T$.

Problem 4.3 (Five-diagonal system)

f_1(x) = 4(x_1 - x_2^2) + x_2 - x_3^2,
f_2(x) = 8x_2(x_2^2 - x_1) - 2(1 - x_2) + 4(x_2 - x_3^2) + x_3 - x_4^2,
f_i(x) = 8x_i(x_i^2 - x_{i-1}) - 2(1 - x_i) + 4(x_i - x_{i+1}^2) + x_{i-1}^2 - x_{i-2} + x_{i+1} - x_{i+2}^2, \quad i = 3, \dots, n-2,
f_{n-1}(x) = 8x_{n-1}(x_{n-1}^2 - x_{n-2}) - 2(1 - x_{n-1}) + 4(x_{n-1} - x_n^2) + x_{n-2}^2 - x_{n-3},
f_n(x) = 8x_n(x_n^2 - x_{n-1}) - 2(1 - x_n) + x_{n-1}^2 - x_{n-2},

with $x_s = (2, 2, \dots, 2)^T$.
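For reference, the residual of Problem 4.1 can be coded directly from the formulas above; this vectorized NumPy sketch (with the minus signs restored from the standard test set in [11]) also checks that $e = (1, 1, \dots, 1)^T$ is a root. Problems 4.2 and 4.3 follow the same pattern.

```python
import numpy as np

def rosenbrock_general(x, c=2.0):
    # Residual F(x) of Problem 4.1 (generalized function of Rosenbrock).
    f = np.empty_like(x)
    f[0] = -4.0 * c * (x[1] - x[0]**2) * x[0] - 2.0 * (1.0 - x[0])
    f[1:-1] = (2.0 * c * (x[1:-1] - x[:-2]**2)
               - 4.0 * c * (x[2:] - x[1:-1]**2) * x[1:-1]
               - 2.0 * (1.0 - x[1:-1]))
    f[-1] = 2.0 * c * (x[-1] - x[-2]**2)
    return f

print(np.allclose(rosenbrock_general(np.ones(100)), 0.0))   # True: e is a solution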

Table 1 Results for problems 4.1–4.3.

                      GF of Rosenbrock           Tridiagonal system         Five-diagonal system
Choice           x_s     3x_s    -3x_s       x_s     2x_s    -2x_s       x_s     2x_s    -2x_s
EW1       NI       6      17       18         24      23       27         41      19       16
EW1       GI      60     165      122        193     203      251        386     121      125
EW1       CT    1.91    5.66     2.19       6.21    7.15     7.81      12.88    5.11     3.17
EW2       NI       6      16       35         31      40       43         19      47       17
EW2       GI      85     110      129        147     206      167         94     156       85
EW2       CT    2.47    3.91     3.81       5.85    7.21     6.25       3.66    5.21     2.21
APR       NI       9      11       21         12      20       46         15      17       16
APR       GI      81      93      116         88     128      216         97      98       95
APR       CT    2.66    3.13     4.94       3.25    4.35     8.28       4.03    2.58     2.02
Str (20)  NI      10      19       35         33      31       42         33      32       19
Str (20)  GI      81     105      140        141     128      145        128     116       88
Str (20)  CT    2.50    3.88     5.22       5.37    5.17     6.16       6.38    3.30     3.00
Str (23)  NI       6      14       31         19      28       30         25      17       12
Str (23)  GI      64      87      115        102     135      127        126      91       70
Str (23)  CT    1.94    3.19     4.65       3.69    4.96     3.34       5.78    2.11     1.91

We test the case $n = 100$. Besides the standard initial guess $x_s$, we also test other initial guesses $x_0 = \pm j x_s$, $j = 2, 3$. It is easy to see that $e = (1, 1, \dots, 1)^T$ is a solution to each of the above three problems. For convenience in reporting the results for the different forcing terms, the following notations are used in Table 1:

EW1 : the first strategy given by Eisenstat and Walker [10],

EW2 : the second strategy given by Eisenstat and Walker [10],

APR : the strategy based on the actual reduction and the predicted reduction, given in [11],

Str (20) : the strategy given by (20),

Str (23) : the strategy given by (23),

NI : denotes the total number of nonlinear iterations,

GI : denotes the total number of GMRES iterations,

CT : denotes $10^2 \times$ (CPU time).

The norm in our tests is the Euclidean norm $\|\cdot\|_2$, and the stopping criterion is

\|F(x_k)\| \leq \varepsilon = 10^{-12}.

The initial forcing term is $\eta_0 = 0.5$, and the additional parameters of $\eta_k$ are chosen following [11] for all the above strategies: EW1, EW2, APR, (20) and (23).

The comparison of GMRES with different forcing terms was made by CPU time. From Table 1 we see that GMRES with the forcing term given by (23) performs better than with the other forcing terms.

From Table 2 we see that the forcing term given by (23) decreases more rapidly than the APR forcing term, which is halved at each iteration. As a result, the residual norm $\|F(x_k)\|$ in the first case decreases as rapidly as the forcing term.

Table 2 Iteration comparison between the APR and the new forcing terms on the Rosenbrock system with initial guess $x_s$, tolerance $\varepsilon = 10^{-14}$ and parameter $b = 0.1$ in strategy (23).

              Str (23)                            APR
 k      GI    ||F(x_k)||    eta_k          GI    ||F(x_k)||    eta_k
 0       1    17.5015       5.0000e-01      1    17.5015       5.0000e-01
 1       3    4.4680        1.5828e-01      2    4.4680        2.5000e-01
 3       9    4.9646e-01    2.3662e-02      4    1.1195        1.2500e-01
 4      11    1.0066e-01    4.9831e-03      8    1.0501e-01    6.250e-02
 5      18    5.4711e-04    2.7354e-05      8    7.589e-02     6.250e-02
 6      27    1.5473e-07    7.7363e-09      9    3.6494e-02    3.1250e-02
 7      24    1.4223e-15    5.2340e-13     11    1.0827e-04    1.5625e-02
 8       -    -             -              12    8.2737e-07    7.8125e-03
 9       -    -             -              13    2.7747e-09    3.9063e-03
10       -    -             -              13    4.1806e-12    1.9531e-03
11       -    -             -              13    1.1678e-14    9.7656e-04

Acknowledgement

O. Chuluunbaatar acknowledges financial support from the RFBR grant No. 11-01-00523 and from the theme 09-6-1060-2005/2013 “Mathematical support of experimental and theoretical studies conducted by JINR”.

This work was partially supported by the Foundation for Science and Technology of the Ministry of Education, Culture and Science (Mongolia).

Bibliography

[1] J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, Third edition, Springer, 2002.

[2] E. Zeidler, Nonlinear Functional Analysis and its Applications, Part 1: Fixed-Point Theorems, Springer-Verlag, 1986.

[3] L.V. Kantorovich, Functional analysis and applied mathematics, Uspekhi Mat. Nauk, 3:6(28), pp. 89–185, 1948 (in Russian).

[4] M.K. Gavurin, Nonlinear functional equations and continuous analogues of iterative methods, Izv. Vyssh. Uchebn. Zaved. Mat., 5(6), pp. 18–31, 1958.

[5] T. Zhanlav and I.V. Puzynin, The convergence of iterations based on the continuous analogy of Newton's method, Comput. Maths. Math. Phys., 32, pp. 846–856, 1992 (in Russian).

[6] T. Zhanlav and O. Chuluunbaatar, A local and semilocal convergence of the continuous analogy of Newton's method for solving nonlinear equations, Bulletin of PFUR, Series Mathematics, Information Sciences, Physics, 1, pp. 34–43, 2012.

[7] R.S. Dembo, S.C. Eisenstat and T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal., 19, pp. 400–408, 1982.

[8] E. Cătinaş, The inexact, inexact perturbed, and quasi-Newton methods are equivalent models, Math. Comp., 74, pp. 291–301, 2004.

[9] X.-C. Cai, W.D. Gropp, D.E. Keyes and M.D. Tidriri, Newton-Krylov-Schwarz methods in CFD, Proceedings of the International Workshop on Numerical Methods for the Navier-Stokes Equations, Vieweg, Braunschweig, pp. 17–30, 1995.

[10] S.C. Eisenstat and H.F. Walker, Choosing the forcing terms in an inexact Newton method, SIAM J. Sci. Comput., 17, pp. 16–32, 1996.

[11] H.-B. An, Z.-Y. Mo and X.-P. Liu, A choice of forcing terms in inexact Newton method, J. Comput. Appl. Math., 200, pp. 47–60, 2007.