A generalization of the Newton method

Abstract

Let \(X\) be a Banach space, \(Y\) a normed space, \(G:X\rightarrow Y\) a nonlinear operator, and \(G\left( x\right) =0\) a nonlinear equation. We denote by \(F:X^{2}\rightarrow Y\) a nonlinear operator whose restriction to the diagonal of \(X^{2}\) coincides with \(G\). We first prove a Taylor type formula for operators of two variables. Next we consider the following two-step Newton type method: \[F\left( x_{n},x_{n-1}\right) +F_{x}^{\prime}\left( x_{n},x_{n-1}\right) \left( x_{n+1}-x_{n}\right) +F_{y}^{\prime}\left( x_{n},x_{n-1}\right) \left( x_{n}-x_{n-1}\right)=0.\] We study the convergence of the resulting sequence to the solution.

Authors

Ion Păvăloiu
(Tiberiu Popoviciu Institute of Numerical Analysis)

Title

Original title (in French)

Une généralisation de methode de Newton

English translation of the title

A generalization of the Newton method

Keywords

Taylor polynomial with two variables; two-step Newton type method

PDF

Scanned paper.

Cite this paper as:

I. Păvăloiu, Une généralisation de methode de Newton, Mathematica, 20(43) (1978) no. 1, pp. 45-52 (in French).

About this paper

Journal

Mathematica

Publisher Name

DOI

Not available yet.

Print ISSN

Not available yet.

Online ISSN

Not available yet.

References

[1] Kantorovici, L.V., Functionalnîi analiz i prikladnaia matematika, UMN, 28, 89-185 (1948).

[2] Păvăloiu, I., Sur les procédés itératifs à un ordre élevé de convergence, Mathematica, 12(35), 2, 309-324 (1970).

[3] Weinischke, J. H., Über eine Klasse von Iterationsverfahren, Numerische Mathematik, 6, 395-404 (1964).

Paper (preprint) in HTML form

A generalization of the Newton method

by
ION PAVALOIU
(Cluj-Napoca)
In this paper we study a Newton-type method for solving operator equations. To construct this method we need the generalized Taylor formula for mappings of two variables.

Let \(X\) and \(Y\) be two normed linear spaces. We denote by \(X^{2}=X\times X\) the Cartesian product of \(X\) with itself and consider an open set \(U\subset X^{2}\). Suppose that the elements \((x_{0},y_{0})\) and \((x_{0}+h,y_{0}+k)\) belong to the set \(U\), where \((h,k)\in X^{2}\). Let \(f:U\to Y\) be a mapping defined on \(U\) with values in \(Y\).

THEOREM 1. If the mapping \(f:U\to Y\) admits continuous partial derivatives, in the Fréchet sense, up to order 2 inclusive at each point \((x,y)\in U\), then the following inequality holds:
\[\begin{aligned} \left\| f(x_{0}+h,y_{0}+k)-f(x_{0},y_{0})-f_{x}^{\prime}(x_{0},y_{0})h-f_{y}^{\prime}(x_{0},y_{0})k\right\| \le {} & \frac{1}{2}\sup_{\substack{0\le\theta\le 1\\ 0\le\mu\le 1}}\left\| f_{x^{2}}^{\prime\prime}(x_{0}+\theta h,y_{0}+\mu k)\right\| \cdot\|h\|^{2} \\ & +\frac{1}{2}\sup_{\substack{0\le\theta\le 1\\ 0\le\mu\le 1}}\left\| f_{y^{2}}^{\prime\prime}(x_{0}+\theta h,y_{0}+\mu k)\right\| \cdot\|k\|^{2} \\ & +\sup_{\substack{0\le\theta\le 1\\ 0\le\mu\le 1}}\left\| f_{xy}^{\prime\prime}(x_{0}+\theta h,y_{0}+\mu k)\right\| \cdot\|h\|\cdot\|k\|, \end{aligned}\tag{1}\]
where \(f_{x}^{\prime}\), \(f_{y}^{\prime}\), \(f_{x^{2}}^{\prime\prime}\), \(f_{xy}^{\prime\prime}\), \(f_{y^{2}}^{\prime\prime}\) denote the partial derivatives of order 1 and 2 of the mapping \(f\) with respect to the indicated variables.

Proof. We give a proof analogous to the one given in [1] for Taylor's formula for mappings of a single variable.

We denote by \(y\) the expression
\[y=f(x_{0}+h,y_{0}+k)-f(x_{0},y_{0})-f_{x}^{\prime}(x_{0},y_{0})h-f_{y}^{\prime}(x_{0},y_{0})k.\tag{2}\]
Let \(T:Y\to\mathbb{R}\) be a continuous linear functional defined on \(Y\) with values in \(\mathbb{R}\), which has the following properties:
\[\|T\|=1,\qquad Ty=\|y\|.\tag{3}\]
(The existence of such a functional is ensured by the Hahn-Banach theorem.)

Now we consider the function \(\varphi:\mathbb{R}\times\mathbb{R}\to\mathbb{R}\) given by the following equality:
\[\varphi(\alpha,\beta)=T\left(f(x_{0}+\alpha h,y_{0}+\beta k)\right).\tag{4}\]
For the partial derivatives of the function \(\varphi\) we have the following formulas:
\[\begin{aligned} \varphi_{\alpha}^{\prime}(\alpha,\beta) &=T\left[f_{x}^{\prime}(x_{0}+\alpha h,y_{0}+\beta k)h\right];\\ \varphi_{\beta}^{\prime}(\alpha,\beta) &=T\left[f_{y}^{\prime}(x_{0}+\alpha h,y_{0}+\beta k)k\right];\\ \varphi_{\alpha^{2}}^{\prime\prime}(\alpha,\beta) &=T\left[f_{x^{2}}^{\prime\prime}(x_{0}+\alpha h,y_{0}+\beta k)h^{2}\right];\\ \varphi_{\alpha\beta}^{\prime\prime}(\alpha,\beta) &=T\left[f_{xy}^{\prime\prime}(x_{0}+\alpha h,y_{0}+\beta k)hk\right];\\ \varphi_{\beta^{2}}^{\prime\prime}(\alpha,\beta) &=T\left[f_{y^{2}}^{\prime\prime}(x_{0}+\alpha h,y_{0}+\beta k)k^{2}\right]. \end{aligned}\tag{5}\]
The values of the function \(\varphi\) and of its first-order partial derivatives at the point \((0,0)\) are given by the following formulas:
\[\begin{aligned} \varphi(0,0) &=T\left[f(x_{0},y_{0})\right],\\ \varphi_{\alpha}^{\prime}(0,0) &=T\left[f_{x}^{\prime}(x_{0},y_{0})h\right],\\ \varphi_{\beta}^{\prime}(0,0) &=T\left[f_{y}^{\prime}(x_{0},y_{0})k\right]. \end{aligned}\tag{6}\]
For the function \(\varphi\) we have:
\[\varphi(1,1)-\varphi(0,0)-\varphi_{\alpha}^{\prime}(0,0)-\varphi_{\beta}^{\prime}(0,0)=\frac{1}{2}\left[\varphi_{\alpha^{2}}^{\prime\prime}(\theta,\mu)+2\varphi_{\alpha\beta}^{\prime\prime}(\theta,\mu)+\varphi_{\beta^{2}}^{\prime\prime}(\theta,\mu)\right],\tag{7}\]
where \(0\le\theta\le 1\), \(0\le\mu\le 1\).

Taking into account formulas (2)-(7) we have:
\[\begin{aligned} \|y\| =Ty &=\varphi(1,1)-\varphi(0,0)-\varphi_{\alpha}^{\prime}(0,0)-\varphi_{\beta}^{\prime}(0,0)\\ &=\frac{1}{2}T\left[f_{x^{2}}^{\prime\prime}(x_{0}+\theta h,y_{0}+\mu k)h^{2}+2f_{xy}^{\prime\prime}(x_{0}+\theta h,y_{0}+\mu k)hk+f_{y^{2}}^{\prime\prime}(x_{0}+\theta h,y_{0}+\mu k)k^{2}\right]\\ &\le\frac{1}{2}\|T\|\Big[\sup\left\| f_{x^{2}}^{\prime\prime}(x_{0}+\theta h,y_{0}+\mu k)\right\| \cdot\|h\|^{2}\\ &\qquad+2\sup\left\| f_{xy}^{\prime\prime}(x_{0}+\theta h,y_{0}+\mu k)\right\| \cdot\|h\|\cdot\|k\|\\ &\qquad+\sup\left\| f_{y^{2}}^{\prime\prime}(x_{0}+\theta h,y_{0}+\mu k)\right\| \cdot\|k\|^{2}\Big], \end{aligned}\tag{8}\]
where \(0\le\theta\le 1\), \(0\le\mu\le 1\) and each supremum is taken over these ranges of \(\theta\) and \(\mu\).
From (2), (3) and (8) the inequality in the statement of the theorem follows.
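As a quick numerical sanity check of inequality (1), the sketch below evaluates both sides for a scalar function of two real variables. The test function \(f(x,y)=\sin x\cos y\), the sample points and the step sizes are illustrative assumptions, not taken from the paper; since all second-order partial derivatives of this \(f\) are bounded by 1 in absolute value, the suprema in (1) are replaced by the upper bound 1.

```python
import itertools
import math


def f(x, y):
    # Illustrative test function (an assumption, not from the paper).
    return math.sin(x) * math.cos(y)


def fx(x, y):
    # Partial derivative of f with respect to x.
    return math.cos(x) * math.cos(y)


def fy(x, y):
    # Partial derivative of f with respect to y.
    return -math.sin(x) * math.sin(y)


def taylor_remainder(x0, y0, h, k):
    """Left-hand side of (1): |f(x0+h, y0+k) - f(x0, y0) - f_x h - f_y k|."""
    return abs(f(x0 + h, y0 + k) - f(x0, y0) - fx(x0, y0) * h - fy(x0, y0) * k)


def bound(h, k, m_xx=1.0, m_yy=1.0, m_xy=1.0):
    """Right-hand side of (1) with each supremum replaced by an upper bound."""
    return 0.5 * m_xx * h**2 + 0.5 * m_yy * k**2 + m_xy * abs(h) * abs(k)


points = [(-1.0, 0.5), (0.0, 0.0), (0.7, -1.2), (2.0, 3.0)]
steps = [-0.8, -0.3, 0.25, 0.6]

# Pairs (left-hand side, right-hand side) for every base point and step pair.
checks = [
    (taylor_remainder(x0, y0, h, k), bound(h, k))
    for (x0, y0), h, k in itertools.product(points, steps, steps)
]
all_ok = all(lhs <= rhs + 1e-12 for lhs, rhs in checks)
print(all_ok)
```

Using the global bound 1 instead of the exact suprema makes the right-hand side larger, so this only confirms a weakened form of (1) on the sampled points.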
In what follows we consider the equation
\[G(x)=\theta,\tag{9}\]
where \(G:X\to Y\) and \(\theta\) is the neutral element of the space \(Y\). We assume that \(X\) is a Banach space. We denote by \(F:X^{2}\to Y\) a mapping and we assume that the restriction of \(F\) to the set \(D\) coincides with \(G\), where \(D=\{(x,y)\in X^{2}:x=y\}\); that is, \(F(x,x)=G(x)\) for every \(x\in X\).

Let \((x_{n},x_{n-1})\in X^{2}\) be an arbitrary element of the space \(X^{2}\). We assume that the mapping \(F\) admits partial derivatives in the Fréchet sense at the point \((x_{n},x_{n-1})\in X^{2}\) and we consider the linear equation
\[F(x_{n},x_{n-1})+F_{x}^{\prime}(x_{n},x_{n-1})(x_{n+1}-x_{n})+F_{y}^{\prime}(x_{n},x_{n-1})(x_{n}-x_{n-1})=\theta,\tag{10}\]
where the unknown element is \(x_{n+1}\).
If we assume that the mapping \(F_{x}^{\prime}(x_{n},x_{n-1})\in L(X,Y)\) is invertible (by \(L(X,Y)\) we denote the set of continuous linear mappings defined on \(X\) with values in \(Y\)), then the solution \(x_{n+1}\) of equation (10) has the following representation:
\[x_{n+1}=x_{n}-\Gamma_{n}\left[F(x_{n},x_{n-1})+F_{y}^{\prime}(x_{n},x_{n-1})(x_{n}-x_{n-1})\right],\tag{11}\]
where \(\Gamma_{n}=\left[F_{x}^{\prime}(x_{n},x_{n-1})\right]^{-1}\). Using method (11) for each \(n=0,1,\dots\), we obtain a sequence \((x_{n})_{n=0}^{\infty}\) of approximations for the solution of equation (9).

Indeed, if the sequence \((x_{n})_{n=0}^{\infty}\) is convergent, and if \(\bar{x}=\lim x_{n}\), then it is obvious that \(F(\bar{x},\bar{x})=G(\bar{x})=\theta\), that is to say, \(\bar{x}\) satisfies equation (9).
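The iteration (11) can be sketched in a few lines for the scalar case \(X=Y=\mathbb{R}\), where \(\Gamma_{n}\) is simply division by \(F_{x}^{\prime}(x_{n},x_{n-1})\). The example below is a minimal sketch under stated assumptions: the function name `two_step_newton`, the test equation \(G(x)=x^{2}-2\), the extension \(F(x,y)=x^{2}-2+\tfrac{1}{2}(x-y)^{2}\) (which satisfies \(F(x,x)=G(x)\)) and the starting values are all hypothetical choices made for illustration, not taken from the paper.

```python
import math


def two_step_newton(F, Fx, Fy, x0, x1, tol=1e-12, max_iter=50):
    """Scalar version of iteration (11); returns the list of iterates."""
    xs = [x0, x1]
    for _ in range(max_iter):
        x_prev, x_cur = xs[-2], xs[-1]
        # Bracket of (11): F(x_n, x_{n-1}) + F'_y(x_n, x_{n-1}) (x_n - x_{n-1}).
        residual = F(x_cur, x_prev) + Fy(x_cur, x_prev) * (x_cur - x_prev)
        # In one dimension, applying Gamma_n is division by F'_x(x_n, x_{n-1}).
        x_next = x_cur - residual / Fx(x_cur, x_prev)
        xs.append(x_next)
        if abs(x_next - x_cur) < tol:
            break
    return xs


# Hypothetical example: G(x) = x^2 - 2, extended off the diagonal by
# F(x, y) = x^2 - 2 + (x - y)^2 / 2, so that F(x, x) = G(x).
def F(x, y):
    return x * x - 2.0 + 0.5 * (x - y) ** 2


def Fx(x, y):
    return 2.0 * x + (x - y)


def Fy(x, y):
    return -(x - y)


iterates = two_step_newton(F, Fx, Fy, x0=1.0, x1=1.5)
root = iterates[-1]
print(root, abs(root - math.sqrt(2.0)))
```

The extension \(F\) is chosen so that \(F_{y}^{\prime}\) vanishes on the diagonal; an extension with \(\left|F_{y}^{\prime}/F_{x}^{\prime}\right|\) close to 1 near the solution would not be expected to converge.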

To study the convergence of the proposed method, we consider the following system of inequalities:
\[\delta_{n+1}\le a\delta_{n}^{2}+b\,\delta_{n}\delta_{n-1}+c\,\delta_{n-1}^{2}+d\,\delta_{n},\qquad n=1,2,\dots,\tag{12}\]
where \(a\ge 0\), \(b\ge 0\), \(c\ge 0\), \(d\ge 0\), \(\delta_{0}\ge 0\) and \(\delta_{1}\ge 0\).

Concerning the system (12) we prove the following theorem.

THEOREM 2. If the elements of the sequence \((\delta_{n})_{n=0}^{\infty}\) and the real numbers \(a\), \(b\), \(c\) and \(d\) satisfy the following conditions:
(i) \(\delta_{n}\ge 0\) for each \(n=0,1,\dots\);
(ii) the elements of the sequence \((\delta_{n})_{n=0}^{\infty}\) verify the inequalities (12);
(iii) \(\delta_{0}=k\), \(\delta_{1}=kt_{1}\), where \(k>0\) is a constant independent of \(n\), and \(t_{1}\) is the positive solution of the equation
\[(ak-1)t^{2}+(d+bk)t+ck=0;\tag{13}\]
(iv) the constants \(a\), \(b\), \(c\), \(d\) and \(k\) satisfy the following inequality:
\[k(a+b+c)+d<1;\]
then:
(j) the elements of the sequence \((\delta_{n})_{n=0}^{\infty}\) satisfy the inequalities \(\delta_{n}\le kt_{1}^{n}\), \(n=0,1,\dots\);
(jj) \(\lim_{n\to\infty}\delta_{n}=0\);
(jjj) the series \(\sum_{i=0}^{\infty}\delta_{i}\) is convergent.
Proof. From (iv) we deduce that equation (13) has a root \(t_{1}\) with \(0\le t_{1}<1\). Indeed, let \(\varphi(t)=(ak-1)t^{2}+(d+bk)t+ck\); then \(\varphi(0)=ck\ge 0\) and \(\varphi(1)=k(a+b+c)+d-1<0\), that is to say, the equation considered admits a root \(t_{1}\) which satisfies the inequality \(0\le t_{1}<1\).

To prove property (j) we proceed by induction.
For \(n=0\) and \(n=1\) the inequalities of property (j) hold by hypothesis. Suppose that property (j) holds for \(n=2,3,\dots,s\) and let us show that it holds for \(n=s+1\). Indeed, from \(0\le t_{1}<1\) the following inequalities result:
\[\delta_{i}\le kt_{1}\le k,\qquad i=2,3,\dots,s.\tag{14}\]
We now write inequalities (12) in the following form:
\[\delta_{n+1}\le(a\delta_{n}+d)\delta_{n}+(b\delta_{n}+c\delta_{n-1})\delta_{n-1},\qquad n=1,2,\dots,\tag{15}\]
or
\[\delta_{n+1}\le u_{n}\delta_{n}+v_{n}\delta_{n-1},\qquad n=1,2,\dots,\tag{16}\]
where \(u_{n}=a\delta_{n}+d\) and \(v_{n}=b\delta_{n}+c\delta_{n-1}\), \(n=1,2,\dots\)

From (14), taking into account the definitions of \(u_{n}\) and \(v_{n}\), we deduce
\[u_{n}\le akt_{1}+d\quad\text{and}\quad v_{n}\le bkt_{1}+ck,\qquad n=2,3,\dots,s.\]
From (16), taking into account the above inequalities and the fact that \(t_{1}\) satisfies (13), we deduce
\[\begin{aligned} \delta_{s+1} &\le(akt_{1}+d)kt_{1}^{s}+(bkt_{1}+ck)kt_{1}^{s-1}\\ &=kt_{1}^{s-1}\left(akt_{1}^{2}+(bk+d)t_{1}+ck\right)=kt_{1}^{s+1}, \end{aligned}\]
that is to say \(\delta_{s+1}\le kt_{1}^{s+1}\), which proves (j). Since \(0\le t_{1}<1\), properties (jj) and (jjj) follow from (j) by comparison with the geometric series \(\sum kt_{1}^{n}\).
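The conclusions of Theorem 2 can be illustrated numerically. The sketch below builds the extremal sequence obtained by taking equality in (12), starting from \(\delta_{0}=k\) and \(\delta_{1}=kt_{1}\), and compares it with the majorant \(kt_{1}^{n}\). The function names and the constants \(a=0.5\), \(b=0.3\), \(c=0.2\), \(d=0.1\), \(k=0.5\) are assumptions chosen only so that hypothesis (iv) holds.

```python
import math


def positive_root(a, b, c, d, k):
    """Root t1 in [0, 1) of equation (13): (a k - 1) t^2 + (d + b k) t + c k = 0.

    Under hypothesis (iv) the leading coefficient a*k - 1 is negative.
    """
    A, B, C = a * k - 1.0, d + b * k, c * k
    return (-B - math.sqrt(B * B - 4.0 * A * C)) / (2.0 * A)


def extremal_sequence(a, b, c, d, k, n_terms):
    """Sequence with delta_0 = k, delta_1 = k t1 and equality in (12)."""
    t1 = positive_root(a, b, c, d, k)
    deltas = [k, k * t1]
    while len(deltas) < n_terms:
        dn, dm = deltas[-1], deltas[-2]
        deltas.append(a * dn * dn + b * dn * dm + c * dm * dm + d * dn)
    return t1, deltas


# Illustrative constants (assumptions); they satisfy k (a + b + c) + d < 1.
a, b, c, d, k = 0.5, 0.3, 0.2, 0.1, 0.5
hypothesis_iv = k * (a + b + c) + d < 1.0

t1, deltas = extremal_sequence(a, b, c, d, k, n_terms=40)

# Property (j): delta_n <= k t1^n (small relative slack for rounding).
majorant_ok = all(dn <= k * t1**n * (1.0 + 1e-9) for n, dn in enumerate(deltas))
# Property (jjj): partial sums stay below the geometric-series bound k / (1 - t1).
series_ok = sum(deltas) <= k / (1.0 - t1)
print(t1, majorant_ok, series_ok)
```

The relative slack in the comparison is needed because \(\delta_{2}\) equals \(kt_{1}^{2}\) exactly in exact arithmetic, so rounding alone could otherwise flip the test.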
Now we can prove the following theorem:
THEOREM 3. If the mapping \(F\), the initial elements \(x_{0},x_{1}\in X\) and the real number \(\gamma>0\) have the following properties:
(i) the mapping \(F\) is differentiable in the Fréchet sense, up to order 2 inclusive, with respect to \(x\) and \(y\) at each point of the set \(S=\operatorname{Int}(\bar{S})\), where \(\bar{S}=\{x\in X:\|x-x_{0}\|\le\gamma\}\);
(ii) for each point \((x,y)\in S\times S\), the mapping \(F_{x}^{\prime}(x,y)\) is invertible and the operator \(\left[F_{x}^{\prime}(x,y)\right]^{-1}\) is bounded on \(S\times S\), that is to say \(\left\|\left[F_{x}^{\prime}(x,y)\right]^{-1}\right\|\le\alpha\), where \(\alpha\) is a real and positive constant;
(iii) the mapping \(\left[F_{x}^{\prime}(x,y)\right]^{-1}F_{y}^{\prime}(x,y)\) is uniformly bounded on \(S\times S\), that is to say \(\left\|\left[F_{x}^{\prime}(x,y)\right]^{-1}F_{y}^{\prime}(x,y)\right\|\le\beta\), where \(\beta\) is a real and positive constant;
(iv) the following inequalities hold:
\[\sup_{(x,y)\in S\times S}\left\|F_{x^{2}}^{\prime\prime}(x,y)\right\|\le 2p;\qquad\sup_{(x,y)\in S\times S}\left\|F_{xy}^{\prime\prime}(x,y)\right\|\le q\qquad\text{and}\qquad\sup_{(x,y)\in S\times S}\left\|F_{y^{2}}^{\prime\prime}(x,y)\right\|\le 2r,\]
where \(p\), \(q\) and \(r\) are real and non-negative constants;
(v) \(\|x_{1}-x_{0}\|\le k\), \(\|x_{2}-x_{1}\|\le kt_{1}\), where \(k\) is a real and positive constant which does not depend on \(n\), \(x_{2}\) is given by (11) for \(n=1\), and \(t_{1}\) is the positive root of the equation
\[(\alpha pk-1)t^{2}+(\beta+\alpha qk)t+\alpha rk=0;\]
(vi) the following inequalities hold:
\[\alpha k(p+q+r)+\beta<1\tag{17}\]
and
\[\gamma\ge k/(1-t_{1}),\tag{18}\]
then equation (9) and procedure (11) have the following properties:
(j) the sequence \((x_{n})_{n=0}^{\infty}\) is convergent and if \(\bar{x}=\lim_{n\to\infty}x_{n}\), then \(\bar{x}\) is a solution of equation (9);
(jj) the following inequalities hold:
\[\|x_{n}-\bar{x}\|\le kt_{1}^{n}/(1-t_{1}),\]
for each \(n=0,1,\dots\)
Proof. By hypothesis we have \(x_{1}\in S\) and \(x_{2}\in S\). Suppose the following properties hold:
a) \(\delta_{i}=\|x_{i+1}-x_{i}\|\le kt_{1}^{i}\), where \(i=0,1,\dots,n-1\);
b) \(x_{i}\in S\), where \(i=0,1,\dots,n\).

From the hypotheses of the theorem it follows that properties a) and b) hold for \(i=0\) and \(i=1\). Let us now show that these properties hold for \(i=n\), respectively \(i=n+1\).

Indeed, since \(x_{s+1}\) satisfies (10) with \(n=s\), we deduce from Theorem 1 and hypothesis (iv) that
\[\begin{aligned} \|F(x_{s+1},x_{s})\| &=\left\|F(x_{s+1},x_{s})-F(x_{s},x_{s-1})-F_{x}^{\prime}(x_{s},x_{s-1})(x_{s+1}-x_{s})-F_{y}^{\prime}(x_{s},x_{s-1})(x_{s}-x_{s-1})\right\|\\ &\le p\|x_{s+1}-x_{s}\|^{2}+q\|x_{s+1}-x_{s}\|\cdot\|x_{s}-x_{s-1}\|+r\|x_{s}-x_{s-1}\|^{2}. \end{aligned}\tag{19}\]
Now, taking into account (11), (ii) and (iii), we have:
\[\|x_{s+1}-x_{s}\|\le\alpha\|F(x_{s},x_{s-1})\|+\beta\|x_{s}-x_{s-1}\|,\qquad s=1,2,\dots,n.\tag{20}\]
Let \(\rho_{s}=\|F(x_{s+1},x_{s})\|\) and \(\delta_{s}=\|x_{s+1}-x_{s}\|\) for \(s=0,1,\dots\) Then from (19) and (20) we deduce
\[\rho_{s}\le p\delta_{s}^{2}+q\,\delta_{s}\delta_{s-1}+r\,\delta_{s-1}^{2},\qquad s=1,2,\dots,n-1,\]
\[\delta_{s}\le\alpha\rho_{s-1}+\beta\delta_{s-1},\qquad s=1,2,\dots,n,\tag{21}\]
from which it results
\[\delta_{s+1}\le\alpha p\,\delta_{s}^{2}+\alpha q\,\delta_{s}\delta_{s-1}+\alpha r\,\delta_{s-1}^{2}+\beta\delta_{s}.\tag{22}\]

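Inequalities (22) have the form (12) with \(a=\alpha p\), \(b=\alpha q\), \(c=\alpha r\) and \(d=\beta\). From (22), (17) and taking into account Theorem 2 we deduce
\[\delta_{n}=\|x_{n+1}-x_{n}\|\le kt_{1}^{n},\tag{23}\]
that is to say, property a) holds for the index \(n\).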
For b) we have, by (18):
\[\|x_{n+1}-x_{0}\|\le\sum_{i=0}^{n}\|x_{i+1}-x_{i}\|=\sum_{i=0}^{n}\delta_{i}\le k\sum_{i=0}^{n}t_{1}^{i}<\frac{k}{1-t_{1}}\le\gamma,\]
so \(x_{n+1}\in S\).
Let us now show that the sequence \((x_{n})_{n=0}^{\infty}\) is convergent. Indeed, from (23) we deduce
\[\|x_{n+m}-x_{n}\|\le\sum_{i=n}^{n+m-1}\|x_{i+1}-x_{i}\|=\sum_{i=n}^{n+m-1}\delta_{i}\le kt_{1}^{n}\left(1+t_{1}+\dots+t_{1}^{m-1}\right)\le kt_{1}^{n}/(1-t_{1})\tag{24}\]
for each \(m=1,2,\dots\); \(n=1,2,\dots\)
From the above inequality, taking into account that \(t_{1}<1\), it follows that \((x_{n})_{n=0}^{\infty}\) is a Cauchy sequence; since \(X\) is a Banach space, the sequence is convergent.

If we pass to the limit as \(m\to\infty\) in inequality (24) and write \(\bar{x}=\lim x_{n}\), we have
\[\|\bar{x}-x_{n}\|\le kt_{1}^{n}/(1-t_{1}).\tag{25}\]

Because \(F\) is differentiable in the Fréchet sense, it follows that \(F\) is continuous; then, passing to the limit in (11) as \(n\to\infty\), we have \(\theta=F(\bar{x},\bar{x})=G(\bar{x})\), that is to say, \(\bar{x}\) is a solution of equation (9).
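The hypotheses of Theorem 3 and the a priori estimate \(\|x_{n}-\bar{x}\|\le kt_{1}^{n}/(1-t_{1})\) can be checked on a scalar example. Everything specific in the sketch below is an assumption made for illustration: the equation \(G(x)=x^{2}-2\), the extension \(F(x,y)=x^{2}-2+\tfrac{1}{2}(x-y)^{2}\), the starting values, the radius of \(\bar{S}\) (called `radius`), the value of \(k\), and the constants \(\alpha,\beta,p,q,r\), which were worked out by hand for this particular \(F\). The smallness condition is checked in the form \(\alpha k(p+q+r)+\beta<1\), which is what is needed to apply Theorem 2 with \(a=\alpha p\), \(b=\alpha q\), \(c=\alpha r\), \(d=\beta\).

```python
import math


# Hypothetical example: G(x) = x^2 - 2, F(x, y) = x^2 - 2 + (x - y)^2 / 2.
def F(x, y):
    return x * x - 2.0 + 0.5 * (x - y) ** 2


def Fx(x, y):
    return 3.0 * x - y


def Fy(x, y):
    return y - x


x0, x1, radius = 1.4, 1.42, 0.1          # S is the interval (x0 - radius, x0 + radius)
lo, hi = x0 - radius, x0 + radius

# Constants for this F on S x S (hand-derived assumptions):
# F'_x = 3x - y >= 3*lo - hi > 0, |F'_y| = |x - y| <= 2*radius,
# F''_xx = 3, F''_xy = -1, F''_yy = 1.
alpha = 1.0 / (3.0 * lo - hi)
beta = 2.0 * radius * alpha
p, q, r = 1.5, 1.0, 0.5
k = 0.05

# Positive root t1 of (alpha p k - 1) t^2 + (beta + alpha q k) t + alpha r k = 0.
A, B, C = alpha * p * k - 1.0, beta + alpha * q * k, alpha * r * k
t1 = (-B - math.sqrt(B * B - 4.0 * A * C)) / (2.0 * A)

# Iteration (11), scalar case, producing x_2, ..., x_8.
xs = [x0, x1]
for _ in range(7):
    x_prev, x_cur = xs[-2], xs[-1]
    residual = F(x_cur, x_prev) + Fy(x_cur, x_prev) * (x_cur - x_prev)
    xs.append(x_cur - residual / Fx(x_cur, x_prev))

hypotheses_ok = (
    abs(x1 - x0) <= k
    and abs(xs[2] - xs[1]) <= k * t1
    and alpha * k * (p + q + r) + beta < 1.0
    and radius >= k / (1.0 - t1)
)

# Compare the actual errors with the a priori bound k t1^n / (1 - t1).
solution = math.sqrt(2.0)
errors = [abs(x - solution) for x in xs]
bounds = [k * t1**n / (1.0 - t1) for n in range(len(xs))]
bound_ok = all(e <= b for e, b in zip(errors, bounds))
print(hypotheses_ok, bound_ok)
```

Only the first few iterates are compared, because for large \(n\) the bound \(kt_{1}^{n}/(1-t_{1})\) drops below floating-point resolution while the computed error cannot.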

BIBLIOGRAPHY

[1] Kantorovici, L.V., Functionalnîi analiz i prikladnaia matematika, UMN, 28, 89-185 (1948).
[2] Păvăloiu, I., Sur les procédés itératifs à un ordre élevé de convergence, Mathematica, 12(35), 2, 309-324 (1970).
[3] Weinischke, J. H., Über eine Klasse von Iterationsverfahren, Numerische Mathematik, 6, 395-404 (1964).
