FINDING GOOD STARTING POINTS FOR SOLVING EQUATIONS BY NEWTON’S METHOD

We study the problem of finding good starting points for the semilocal convergence of Newton's method to a locally unique solution of an operator equation in a Banach space setting. Using a weakened version of the Newton–Kantorovich theorem, we show that the procedure suggested by Kung [6] can be improved, in the sense that the number of Newton steps required to compute a good starting point can be significantly reduced (at the same computational cost as required in the Newton–Kantorovich theorem [3], [5]).


INTRODUCTION
In this study we are concerned with the problem of approximating a locally unique solution x* of the equation

(1.1) F(x) = 0,

where F is a Fréchet-differentiable operator defined on an open convex subset D of a Banach space X with values in a Banach space Y.

A large number of problems in applied mathematics and engineering are solved by finding the solutions of certain equations. For example, dynamic systems are mathematically modeled by difference or differential equations, and their solutions usually represent the states of the systems. For the sake of simplicity, assume that a time-invariant system is driven by the equation ẋ = Q(x) (for some suitable operator Q), where x is the state. Then the equilibrium states are determined by solving equation (1.1). Similar equations arise in the case of discrete systems. The unknowns of engineering equations can be functions (difference, differential, and integral equations), vectors (systems of linear or nonlinear algebraic equations), or real or complex numbers (single algebraic equations with single unknowns). Except in special cases, the most commonly used solution methods are iterative: starting from one or several initial approximations, a sequence is constructed that converges to a solution of the equation. Iteration methods are also applied to optimization problems; in such cases the iteration sequences converge to an optimal solution of the problem at hand. Since all of these methods have the same recursive structure, they can be introduced and discussed in a general framework.
The most popular method for generating a sequence approximating x* is undoubtedly Newton's method:

(1.2) x_{n+1} = x_n − F′(x_n)^{−1} F(x_n) (n ≥ 0), x_0 ∈ D.

Here F′(x) denotes the Fréchet derivative of F, which is a bounded linear operator from X into Y. A survey of local and semilocal convergence results for Newton's method (1.2) can be found in [1], [3], [5] and the references there.
In particular, the famous Newton–Kantorovich theorem guarantees the quadratic convergence of method (1.2) if the initial guess x_0 is "close enough" to the solution x* [5] (see (2.11)).
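As a concrete finite-dimensional illustration of iteration (1.2) (a minimal sketch, not part of the Banach-space development above; the test system and starting point are illustrative choices):

```python
import numpy as np

def newton(F, J, x0, tol=1e-12, max_iter=50):
    """Newton's method (1.2): x_{n+1} = x_n - F'(x_n)^{-1} F(x_n)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Solve F'(x_n) s = F(x_n) instead of forming the inverse explicitly.
        step = np.linalg.solve(J(x), F(x))
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

# Example: F(x, y) = (x^2 + y^2 - 1, x - y), with solution (1/sqrt(2), 1/sqrt(2)).
F = lambda v: np.array([v[0]**2 + v[1]**2 - 1.0, v[0] - v[1]])
J = lambda v: np.array([[2.0 * v[0], 2.0 * v[1]], [1.0, -1.0]])
root = newton(F, J, [1.0, 0.5])
```

Starting sufficiently close to the solution, the step sizes decrease quadratically, as the Newton–Kantorovich theorem predicts.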
However, we recently showed that the Newton–Kantorovich hypothesis (2.11) can always be replaced by the weaker condition (2.5) (at the same computational cost) [1, p. 387, Case 3, for δ = δ_0]. In particular, using the algorithm proposed by H.T. Kung [6] (see also [7]), we show that the number of steps required to compute a good starting point x_0 (to be made precise later) can be significantly reduced.

FINDING GOOD STARTING POINTS FOR NEWTON'S METHOD (1.2)
We recently showed the following semilocal convergence theorem for Newton's method (1.2), which essentially states the following [1]: if conditions (2.1)–(2.8) hold, then the sequence {x_n} (n ≥ 0) generated by Newton's method (1.2) is well defined, remains in U(x_0, r) for all n ≥ 0, and converges quadratically to a unique solution x* ∈ U(x_0, r) of equation F(x) = 0. Moreover, the error estimate (2.9) holds.

If equality holds in (2.10), then the result stated above reduces to the famous Newton–Kantorovich theorem, and (2.5) to the Newton–Kantorovich hypothesis (2.11) [5]. If strict inequality holds in (2.10), then (2.5) is weaker than the Newton–Kantorovich hypothesis (2.11). Moreover, the error bounds on the distances ‖x_{n+1} − x_n‖, ‖x_n − x*‖ (n ≥ 0) are finer, and the information on the location of the solution is more precise [1]. In [1], examples were also given showing that (2.11) is violated while (2.5) holds. Furthermore, as the following example demonstrates, the ratio K/K_0 can be arbitrarily large: consider the function defined by (2.12), where c_i, i = 0, 1, 2, 3, are given parameters. Using (2.3), (2.4) and (2.12), we can easily see that for c_3 large and c_2 sufficiently small, K/K_0 can be arbitrarily large.
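The constants K and K_0 of (2.3), (2.4) are not reproduced above. Assuming, as is standard in this literature, that K is a Lipschitz constant for F′ on D and K_0 a center-Lipschitz constant at x_0, the inequality K_0 ≤ K always holds, and the gap can be observed numerically. A small hypothetical scalar check (the function exp and the interval are illustrative choices, not the example (2.12)):

```python
import numpy as np

# Hypothetical scalar illustration: f(x) = exp(x) on D = [0, 1], x0 = 0,
# so f'(x) = exp(x).
# K  : Lipschitz constant of f' on D:     |f'(x) - f'(y)|  <= K  |x - y|
# K0 : center-Lipschitz constant at x0:   |f'(x) - f'(x0)| <= K0 |x - x0|
fp = np.exp
x0 = 0.0
xs = np.linspace(1e-6, 1.0, 400)

# K0: largest observed ratio against the center x0 (analytically e - 1)
K0 = np.max(np.abs(fp(xs) - fp(x0)) / np.abs(xs - x0))

# K: largest observed ratio over all pairs of grid points (analytically e)
num = np.abs(fp(xs)[:, None] - fp(xs)[None, :])
den = np.abs(xs[:, None] - xs[None, :]) + np.eye(xs.size)  # avoid 0/0 on diagonal
K = np.max(num / den)

print(K0, K)  # K0 < K: a condition stated in terms of K0 is weaker
```

Here K0 ≈ e − 1 while K ≈ e, so a hypothesis formulated with K_0 is strictly weaker than one formulated with K.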
That is, (2.5) may hold while (2.11) fails. Note also that the computational cost of obtaining the pair (K_0, K) is the same as that of obtaining K alone, since in practice evaluating K requires finding K_0. Hence all results using (2.11) instead of (2.5) can now be revisited to obtain more information. That is exactly what we do here. In particular, motivated by the elegant work of H.T. Kung [6] on good starting points for Newton's method, we show how to improve on his results by using our theorem stated above instead of the Newton–Kantorovich theorem.

Definition 1. We say that x_0 is a good starting point for approximating x* by Newton's method (or a good starting point, for short) if conditions (2.1)–(2.8) hold.
Note that the existence of a good starting point implies the existence of a solution x* of equation F(x) = 0 in U(x_0, aξ_0).
We provide the following theorem and algorithm, which improve the corresponding ones given in [6, Theorem 4.1] for obtaining good starting points.
Proof. Simply use L instead of K in the proof of Theorem 4.1 in [6, p. 11], including the algorithm there, which is essentially repeated here with some modifications.

Algorithm A. The goal of this algorithm is to produce a good starting point (under the conditions of Theorem 1) for approximating x*.
(3) Set λ_i ← (1/2 − δ)/h_i.
(4) (It is shown in the proof that x_i is a good starting point for approximating a zero, denoted by x_{i+1}, of F_i.) Apply Newton's method to F_i, starting from x_i, to find x_{i+1}.
(5) (Assume that the exact x_{i+1} is found.) Set η_{i+1} ← F(x_{i+1}).
(6) Set i ← i + 1, and return to Step 2.
That completes the proof of Theorem 1.
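The operators F_i and Steps (1)–(2) of Algorithm A are not reproduced above. One plausible scalar reading of the search phase, under the assumption that F_i(x) = F(x) − (1 − λ_i)F(x_i) (a continuation-type choice consistent with Step (3); the function, region, and constant K below are illustrative, not the author's worked example):

```python
def good_start(f, fp, x0, K, delta=0.1, tol=1e-12, max_rounds=200):
    """Search phase: a scalar sketch in the spirit of Algorithm A.

    Repeatedly solves the damped problems F_i(x) = f(x) - (1 - lam_i) f(x_i)
    until the Kantorovich-type quantity h_i = K * beta_i * eta_i drops to
    1/2 - delta, at which point x_i is a good starting point for f itself.
    """
    x = float(x0)
    for _ in range(max_rounds):
        beta = 1.0 / abs(fp(x))   # bound on |f'(x_i)^{-1}|
        eta = beta * abs(f(x))    # |f'(x_i)^{-1} f(x_i)|
        h = K * beta * eta
        if h <= 0.5 - delta:
            return x              # good starting point found
        lam = (0.5 - delta) / h   # Step (3): lam_i = (1/2 - delta)/h_i
        target = (1.0 - lam) * f(x)  # the zero of F_i solves f(x) = target
        # Step (4): Newton's method applied to F_i, started from x_i
        y = x
        for _ in range(100):
            step = (f(y) - target) / fp(y)
            y -= step
            if abs(step) < tol:
                break
        x = y                     # Steps (5)-(6): update and repeat
    return x

# Hypothetical use: f(x) = x^3 - 8 on [2, 5], where K = 30 bounds |f''|.
x_good = good_start(lambda x: x**3 - 8, lambda x: 3 * x**2, x0=5.0, K=30.0)
```

Each round shrinks the residual of f by the factor (1 − λ_i) while keeping the intermediate Newton runs inside their guaranteed convergence regions, so the iterates drift toward the zero until the semilocal condition is met at x_good.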
Remark 2. As already noted in [6], Theorem 1 is trivial in the scalar case (f : R → R), since the mean value theorem can be used. Some of the assumptions of Theorem 1 can be weakened. Avila, for example, in [4, Theorem 4.3] used, instead of (2.14), a more complicated condition involving β, K, and η_0. However, the idea of his algorithm is basically different from that of Algorithm A. Note also that if K_0 = K, then our Theorem 1 reduces to Theorem 4.1 in [6]. We now modify Algorithm A to make it work in Banach spaces without necessarily assuming that the exact zero x_{i+1} of F_i can be found using Newton's method (1.2).
Theorem 2. Under the hypotheses of Theorem 1, a good starting point for approximating the solution x* of equation F(x) = 0 can be obtained in N(δ, K_0, K) Newton steps, where δ is any number in (0, 1/2), N(δ, K_0, K) is given by (2.15), I(δ, K_0, K) is the smallest integer i such that (2.16) holds, and J(δ, K_0, K) is the smallest integer j such that (2.17) holds.

Proof. Use the estimates above and the following algorithm.
Algorithm B.
Apply Newton's method to F_i, starting from x_i, to find an approximation x̃_{i+1} to a zero x_{i+1} of F_i satisfying the required accuracy condition. That completes the proof of Theorem 2.
Remark 3. As noted in [6], δ should not be chosen to minimize the complexity of Algorithm B alone. Instead, δ should be chosen to minimize the complexity of the combined algorithm:
(1) Search phase: perform Algorithm B.
(2) Iteration phase: perform Newton's method starting from the point obtained by Algorithm B.
An upper bound on the complexity of the iteration phase is the time needed to carry out T(δ, K_0, K, ε) Newton steps, where T(δ, K_0, K, ε) is the smallest integer k such that (2.20) holds. Note also that if K_0 = K, our Theorem 2 reduces to Theorem 4.2 in [6, p. 15].
Hence we have shown the following result.

Theorem 3. Under the hypotheses of Theorem 2, the time needed to find a solution x* of equation F(x) = 0 inside a ball of radius ε is bounded above by the time needed to carry out R(δ, K_0, K, ε) Newton steps, where

(2.21) R(δ, K_0, K, ε) = N(δ, K_0, K) + T(δ, K_0, K, ε),

with N(δ, K_0, K) and T(δ, K_0, K, ε) given by (2.15) and (2.20), respectively.
Remark 4. If K_0 = K, Theorem 3 reduces to Theorem 4.3 in [6, p. 20]. In order to compare our results with the corresponding ones in [6], we computed the values of R(δ, K_0, K, ε) for F satisfying the conditions of Theorem 4.3 in [6] and of Theorem 3 above, with

(2.22) βη_0 ≤ .4r,
The following table gives the results for ε = 10^{−6} r. By I we mean I(δ_0, K, K); by I_{αK} we mean I(δ_0, K_0, K) with K_0 = αK, α ∈ [0, 1]; similarly for J, N and T. The columns J_{.9K}, J_{.5K}, J_{0K}, being identical to J, have been omitted. The columns I, J, N, T, R were given in [6], where K rather than K_0 was used in (2.3).
The table was produced using (2.15)–(2.23); its columns are h_0, δ_0, I, J, N, T, R, I_{.9K}, N_{.9K}, R_{.9K}, I_{.5K}, N_{.5K}, R_{.5K}, I_{0K}, N_{0K}, R_{0K}.

Remark 5. It follows from the table that our results (see columns I_{.9K} and after) significantly improve the corresponding ones in [6], at the same computational cost. Suppose, for example, that h_0 = 9 and δ = .159. Kung found that the search phase can be done in 255 Newton steps and the iteration phase in 5 Newton steps; that is, a root can be located inside a ball of radius 10^{−6} r using 260 Newton steps. However, for K_0 = .9K, K_0 = .5K and K_0 = 0, the corresponding numbers of Newton steps are 245, 179 and 104, respectively, which constitutes a significant improvement. At the end of his paper, Kung asked whether the number of Newton steps used by this procedure is close to the minimum. It is now clear from our approach that the answer is, in general, no.
Finally, Kung posed the following open question: suppose that the conditions of the Newton–Kantorovich theorem hold; is Newton's method optimal, or close to optimal, in terms of the number of function and derivative evaluations required to approximate the solution x* of equation F(x) = 0 to within a given tolerance ε?
Clearly, according to our approach, the answer is no (see also Remark 1).
Remark 6. The results obtained here further improve the corresponding ones given by us in [2]. There, we used L̄ = (K_0 + K)/2 instead of the L appearing in (2.6). However, we have L ≤ L̄ for all K_0 ≥ 0 and K ≥ 0.
If L = L̄, then the results obtained here coincide with the corresponding ones in [1] (which in turn improved the ones in [4], [6], [7]). Otherwise (i.e., if L < L̄), the results in this study improve our results in [2].