Optimal algorithms concerning the solving of equations by interpolation

Abstract

In this paper we approach two aspects concerning the optimality problems arising from the consideration of the iterative methods for approximating the solutions of equations by inverse interpolation.

The first aspect concerns the construction of some algorithms having optimal convergence orders, while the second addresses the optimal complexity of calculus concerning the inverse interpolation iterative methods.

Authors

Ion Păvăloiu
Tiberiu Popoviciu Institute of Numerical Analysis

Keywords

nonlinear equations in \(\mathbb{R}\); inverse interpolation; convergence order; efficiency index; optimal iterative methods


Cite this paper as:

I. Păvăloiu, Optimal algorithms concerning the solving of equations by interpolation, Research on Theory of Allure, Approximation, Convexity and Optimization, Ed. Srima, Cluj-Napoca (1999), pp. 222-248, ISBN 973-98551-4-3.

About this paper

Journal

Research on Theory of Allure, Approximation, Convexity and Optimization

Publisher Name

Editura Srima

DOI

Not available yet.

Print ISBN

973-98551-4-3


References


Paper (preprint) in HTML form

Optimal Algorithms Concerning the Solving of Equations by Interpolation

SÉMINAIRE DE LA THÉORIE DE LA MEILLEURE APPROXIMATION,

CONVEXITÉ ET PROGRAMMATION MATHÉMATIQUE

CLUJ - NAPOCA, 1999, pp. 222–248



Ion Păvăloiu
(Cluj - Napoca)

1. Introduction

It is well known that the most usual methods for approximating a solution of a nonlinear equation in \(\mathbb{R}\) (Newton's method, Chebyshev's method, the chord method and different generalizations of these) are obtained in a unitary manner by Lagrange-Hermite-type inverse interpolation.

The inverse interpolatory polynomials, by a proper choice of the nodes, also lead to Aitken-Steffensen-type methods.

In this paper we approach two aspects concerning the optimality problems arising from the consideration of the iterative methods for approximating the solutions of equations by inverse interpolation. The first aspect concerns the construction of some algorithms having optimal convergence orders, while the second addresses the optimal complexity of calculus concerning the inverse interpolation iterative methods.

We adopt the efficiency index (see [6]) as a measure of the complexity of the iterative methods.

This paper represents a synthesis of the results obtained by us in the papers [3], [4], [7], [10], [11].

We shall begin by presenting some definitions and results (some of them are known) concerning the convergence order and the efficiency index of an iterative method. We briefly present then the inverse interpolatory methods and the iterative methods generated by them. We consider different classes of interpolatory methods determining for each class the methods having the optimal convergence order. Finally, we determine the methods having the optimal efficiency indexes.

2. Convergence orders and efficiency indexes

Denote \(I=[a,b]\), \(a,b\in\mathbb{R}\), \(a<b\), and consider the equation

(1) f(x)=0

where \(f:I\to\mathbb{R}\) is given. We shall assume for simplicity in the following that the above equation has a unique solution \(\bar{x}\in I\). Let \(g:I\to I\) be a function having a unique fixed point, and let that point be \(\bar{x}\).

For approximating the solution \(\bar{x}\) we shall consider the elements of the sequence \((x_p)_{p\geq 0}\) generated by the iterations

(2) \(x_{s+1}=g(x_s),\quad s=0,1,\ldots,\quad x_0\in I.\)

More generally, if \(G:I^k\to I\) is a function of \(k\) variables whose restriction to the diagonal of \(I^k\) coincides with \(g\), i.e.

\(G(x,x,\ldots,x)=g(x),\quad x\in I,\)

then we may consider the iterations

(3) \(x_{s+k}=G(x_s,x_{s+1},\ldots,x_{s+k-1}),\quad s=0,1,\ldots,\quad x_0,\ldots,x_{k-1}\in I.\)

The convergence orders of the sequences \((x_p)_{p\geq 0}\) generated by (2) and (3) depend on some properties of the functions \(f\), \(g\), resp. \(G\).

The amount of time needed by a computer to obtain a convenient approximation depends both on the convergence order of \((x_p)_{p\geq 0}\) and on the number of elementary operations that must be performed at each iteration step in (2) or (3). The convergence order of the methods of the form (2) and (3) may be determined exactly under some circumstances, but the number of elementary operations needed at each iteration step may be hard or even impossible to evaluate. A simplification of this problem may be obtained (see [6]) by taking into account the number of function evaluations needed at each iteration step.

It is obvious that this criterion may be, at first sight, contested, since some functions may be simpler and others more complicated from the computational viewpoint.

This inconvenience does not affect our viewpoint on optimal efficiency, because it refers to classes of iterative methods which are applied for solving an equation in which the functions are well determined by the form of equation (1) and by \(g\), resp. \(G\).

Let \((x_p)_{p\geq 0}\) be an arbitrary sequence which together with \(f\) and \(g\) satisfies

  • i.

    \(x_s\in I\) and \(g(x_s)\in I\) for \(s=0,1,\ldots\);

  • ii.

    the sequence \((x_p)_{p\geq 0}\) converges and \(\lim x_p=\lim g(x_p)=\bar{x}\);

  • iii.

    \(f\) is differentiable at \(\bar{x}\);

  • iv.

    for any \(x,y\in I\) we have \(0<|[x,y;f]|\leq m\) for some \(m\in\mathbb{R}\), \(m>0\), where \([x,y;f]\) denotes the first order divided difference of \(f\) on the nodes \(x\) and \(y\).

Definition 2.1.

The sequence \((x_p)_{p\geq 0}\) has the convergence order \(\omega\), \(\omega\geq 1\), with respect to \(g\) if there exists the limit

(4) \(\alpha=\lim_{p\to\infty}\dfrac{\ln|g(x_p)-\bar{x}|}{\ln|x_p-\bar{x}|}\)

and \(\alpha=\omega\).
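The limit (4) can be observed numerically. The following sketch (an illustration, not part of the paper; the test function \(f(x)=x^2-2\) and the Newton map are assumptions) estimates the convergence order of the Newton iteration \(g(x)=x-f(x)/f'(x)\), using high-precision arithmetic so that the logarithmic ratios remain resolvable; they approach \(\omega=2\).

```python
from decimal import Decimal, getcontext

getcontext().prec = 250  # high precision: the errors shrink doubly exponentially

def g(x):
    # Newton map g(x) = x - f(x)/f'(x) for f(x) = x^2 - 2
    return (x + Decimal(2) / x) / 2

root = Decimal(2).sqrt()
x = Decimal("1.5")
errs = []
for _ in range(6):
    x = g(x)
    errs.append(abs(x - root))

# ratio (4): ln|g(x_p) - xbar| / ln|x_p - xbar|
orders = [float(errs[i + 1].ln() / errs[i].ln()) for i in range(len(errs) - 1)]
print(orders)  # the last ratios are close to 2
```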

For a unitary treatment of the convergence orders of the studied methods we shall prove the following lemmas.

Lemma 2.1.

If the sequence \((x_p)_{p\geq 0}\) and the functions \(f\) and \(g\) satisfy properties i–iv, then a necessary and sufficient condition for this sequence to have the convergence order \(\omega\), \(\omega\geq 1\), is that there exists

(5) \(\beta=\lim_{p\to\infty}\dfrac{\ln|f(g(x_p))|}{\ln|f(x_p)|}\)

and \(\beta=\omega\).

Proof.

Assuming true one of the relations (4) and (5) and taking into account hypotheses i–iv, we get

\[
\lim\frac{\ln|g(x_p)-\bar{x}|}{\ln|x_p-\bar{x}|}
=\lim\frac{\ln|f(g(x_p))|-\ln|[g(x_p),\bar{x};f]|}{\ln|f(x_p)|-\ln|[x_p,\bar{x};f]|}
=\lim\frac{\ln|f(g(x_p))|}{\ln|f(x_p)|}\cdot
\frac{1-\dfrac{\ln|[g(x_p),\bar{x};f]|}{\ln|f(g(x_p))|}}{1-\dfrac{\ln|[x_p,\bar{x};f]|}{\ln|f(x_p)|}}
=\lim\frac{\ln|f(g(x_p))|}{\ln|f(x_p)|}.\qquad\square
\]

Lemma 2.2.

Assume that \((u_p)_{p\geq 0}\) is a sequence of real positive numbers satisfying the following properties:

  • i.

    the sequence \((u_p)_{p\geq 0}\) is convergent and \(\lim u_p=0\);

  • ii.

    there exist the real nonnegative numbers \(\alpha_1,\alpha_2,\ldots,\alpha_{n+1}\) and a sequence \((c_p)_{p\geq 0}\) with \(c_s>0\), \(s=0,1,\ldots\), and \(0<\inf\{c_p\}\leq\sup\{c_p\}\leq m\), which together with the elements of the sequence \((u_p)_{p\geq 0}\) satisfy

    (6) \(u_{s+n+1}=c_s\,u_s^{\alpha_1}u_{s+1}^{\alpha_2}\cdots u_{s+n}^{\alpha_{n+1}},\quad s=0,1,\ldots;\)

  • iii.

    there exists \(\lim_{p\to\infty}\dfrac{\ln u_{p+1}}{\ln u_p}=\omega>0\).

    Then \(\omega\) is the positive root of the equation

    (7) \(t^{n+1}-\alpha_{n+1}t^{n}-\alpha_{n}t^{n-1}-\cdots-\alpha_{2}t-\alpha_{1}=0.\)
Proof.

By (6) we obtain

(8) \(\lim_{s\to\infty}\dfrac{\ln u_{n+s+1}}{\ln u_{n+s}}=\lim_{s\to\infty}\dfrac{\ln c_s}{\ln u_{n+s}}+\sum_{i=0}^{n}\alpha_{i+1}\lim_{s\to\infty}\dfrac{\ln u_{s+i}}{\ln u_{s+n}}.\)

The hypotheses imply

\(\lim_{s\to\infty}\dfrac{\ln c_s}{\ln u_{n+s}}=0\)

and

\(\lim_{s\to\infty}\dfrac{\ln u_{s+i}}{\ln u_{s+n}}=\dfrac{1}{\omega^{n-i}},\quad i=\overline{0,n},\)

whence, by (8) we get

\(\omega=\sum_{i=0}^{n}\alpha_{i+1}\dfrac{1}{\omega^{n-i}},\)

i.e.,

(9) \(\omega^{n+1}-\sum_{i=0}^{n}\alpha_{i+1}\omega^{i}=0.\)

We now turn our attention to equations of the form (9).

Let \(a_1,a_2,\ldots,a_{n+1}\in\mathbb{R}\), \(a_i\geq 0\), \(i=\overline{1,n+1}\).

We shall assume that the numbers \(a_i\), \(i=1,\ldots,n+1\), are ordered:

(10) \(a_{n+1}\geq a_n\geq\cdots\geq a_2>a_1\)

and satisfy

(11) \(a_1+a_2+\cdots+a_{n+1}>1.\)

Consider the equations

(12) \(P(t)=t^{n+1}-a_{n+1}t^{n}-a_{n}t^{n-1}-\cdots-a_{2}t-a_{1}=0,\)
(13) \(Q(t)=t^{n+1}-a_{1}t^{n}-a_{2}t^{n-1}-\cdots-a_{n}t-a_{n+1}=0,\)
(14) \(R(t)=t^{n+1}-a_{i_1}t^{n}-a_{i_2}t^{n-1}-\cdots-a_{i_n}t-a_{i_{n+1}}=0,\)

where \((i_1,i_2,\ldots,i_{n+1})\) is an arbitrary permutation of \((1,2,\ldots,n+1)\).

Lemma 2.3.

If \(a_i\), \(i=\overline{1,n+1}\), satisfy condition (11), then any equation of form (14) has a unique root larger than 1. Moreover, if relations (10) are satisfied and if we denote by \(a\), \(b\), \(c\) the positive roots of (12), (13) resp. (14), then

(15) \(1<b\leq c\leq a,\)

i.e., equation (12) has the largest root.

Proof.

Consider the \((n+1)!\) equations of the form (14) and denote by \(s\) the largest natural number for which \(a_{i_s}\neq 0\).

We have \(a_{i_{s+1}}=a_{i_{s+2}}=\cdots=a_{i_{n+1}}=0\). Consider the function \(\Psi(t)=R(t)/t^{n-s+1}\). It can be seen by (11) that \(\Psi(1)=1-a_{i_1}-a_{i_2}-\cdots-a_{i_s}<0\), and \(\lim_{t\to\infty}\Psi(t)=+\infty\). It follows that equation (14) has a unique root larger than 1. The first part of the lemma is proved. In order to prove inequality (15) it suffices to show that \(R(b)\leq 0\) and \(R(a)\geq 0\). Indeed,

\[
\begin{aligned}
R(b)&=R(b)-Q(b)=(a_1-a_{i_1})b^{n}+(a_2-a_{i_2})b^{n-1}+\cdots+(a_n-a_{i_n})b+a_{n+1}-a_{i_{n+1}}\\
&=(b-1)\big[(a_1-a_{i_1})b^{n-1}+(a_1+a_2-a_{i_1}-a_{i_2})b^{n-2}+\cdots\\
&\qquad+(a_1+a_2+\cdots+a_{n-1}-a_{i_1}-a_{i_2}-\cdots-a_{i_{n-1}})b\\
&\qquad+a_1+a_2+\cdots+a_n-a_{i_1}-a_{i_2}-\cdots-a_{i_n}\big]\leq 0,
\end{aligned}
\]

since from (10) follow the inequalities

\(a_1+a_2+\cdots+a_s-a_{i_1}-a_{i_2}-\cdots-a_{i_s}\leq 0,\quad s=1,2,\ldots,n,\)

and \(b>1\). The fact that \(R(a)\geq 0\) is shown in an analogous manner. ∎
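The ordering (15) is easy to check numerically. The sketch below (illustrative values, not from the paper) computes by bisection the roots larger than 1 of \(P\) and \(Q\) for a sample set \(a_1\leq a_2\leq\cdots\leq a_{n+1}\) whose sum exceeds 1.

```python
def positive_root(coeffs):
    """Unique root > 1 of t^(n+1) - c[0]*t^n - c[1]*t^(n-1) - ... - c[n] = 0,
    found by bisection; coeffs lists the coefficients of the descending powers."""
    n1 = len(coeffs)

    def R(t):
        return t ** n1 - sum(c * t ** (n1 - 1 - i) for i, c in enumerate(coeffs))

    lo, hi = 1.0, max(coeffs) + 2.0  # R(1) < 0 by (11), R(hi) > 0
    for _ in range(200):
        mid = (lo + hi) / 2
        if R(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

a = [0.2, 0.5, 0.9, 1.4]          # sample a_1 <= ... <= a_{n+1}, sum = 3.0 > 1
root_P = positive_root(a[::-1])   # P: the largest a_i attached to the highest power
root_Q = positive_root(a)         # Q: reversed attachment
print(root_P, root_Q)             # root_P is the larger one, as (15) states
```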

Lemma 2.4.

Let \(p_1,p_2,\ldots,p_{n+1}\) and \(\alpha_1,\alpha_2,\ldots,\alpha_{n+1}\), where \(p_i\geq 1\), \(\alpha_i\geq 1\), \(i=\overline{1,n+1}\), be two sets of real numbers satisfying

(16) \(p_1\geq p_2\geq\cdots\geq p_{n+1},\qquad \alpha_1\leq\alpha_2\leq\cdots\leq\alpha_{n+1}.\)

Then, among all the numbers of the form

(17) \(\alpha=\alpha_{j_1}p_{k_1}+\alpha_{j_2}p_{k_1}p_{k_2}+\cdots+\alpha_{j_{n+1}}p_{k_1}p_{k_2}\cdots p_{k_{n+1}},\)

where \((j_1,j_2,\ldots,j_{n+1})\) and \((k_1,k_2,\ldots,k_{n+1})\) are arbitrary permutations of \((1,2,\ldots,n+1)\), the largest such number is

(18) \(\alpha_{\max}=\alpha_1p_1+\alpha_2p_1p_2+\cdots+\alpha_{n+1}p_1p_2\cdots p_{n+1}.\)
Proof.

From the first set of inequalities (16) it follows that the inequality

(19) \(\alpha_{j_1}p_{k_1}+\alpha_{j_2}p_{k_1}p_{k_2}+\cdots+\alpha_{j_{n+1}}p_{k_1}p_{k_2}\cdots p_{k_{n+1}}\leq \alpha_{j_1}p_{1}+\alpha_{j_2}p_{1}p_{2}+\cdots+\alpha_{j_{n+1}}p_{1}p_{2}\cdots p_{n+1}\)

holds for any two permutations \((j_1,j_2,\ldots,j_{n+1})\) and \((k_1,k_2,\ldots,k_{n+1})\) of \((1,2,\ldots,n+1)\).

Let us denote

(20) \(b_i=p_1p_2\cdots p_i,\quad i=1,2,\ldots,n+1.\)

In order to prove the inequality

(21) \(\alpha_{j_1}b_1+\alpha_{j_2}b_2+\cdots+\alpha_{j_{n+1}}b_{n+1}\leq \alpha_1b_1+\alpha_2b_2+\cdots+\alpha_{n+1}b_{n+1}\)

for every permutation \((j_1,j_2,\ldots,j_{n+1})\), we shall proceed by induction. For \(n=0\) the inequality (21) is obvious, since \(n+1=1\) and hence \(\alpha_{j_1}=\alpha_1\). Suppose now that the inequality is true for \(n\) pairs of numbers \((\alpha_1,b_1),\ldots,(\alpha_n,b_n)\), namely

(22) \(\alpha_{j_1}b_1+\alpha_{j_2}b_2+\cdots+\alpha_{j_n}b_n\leq\alpha_1b_1+\cdots+\alpha_nb_n,\)

where \(\alpha_1\leq\alpha_2\leq\cdots\leq\alpha_n\) and \(b_1\leq b_2\leq\cdots\leq b_n\). Using the inequalities \(b_1\leq b_2\leq\cdots\leq b_n\leq b_{n+1}\) and \(\alpha_1\leq\alpha_2\leq\cdots\leq\alpha_n\leq\alpha_{n+1}\), as well as the induction hypothesis (22), and assuming that \(j_1=i\), \(1\leq i\leq n\), we have

\[
\begin{aligned}
\alpha_{j_1}b_1+\alpha_{j_2}b_2+\cdots+\alpha_{j_{n+1}}b_{n+1}&=b_1(\alpha_{j_1}+\alpha_{j_2}+\cdots+\alpha_{j_{n+1}})+(b_2-b_1)\alpha_{j_2}+(b_3-b_1)\alpha_{j_3}+\cdots+(b_{n+1}-b_1)\alpha_{j_{n+1}}\\
&\leq b_1(\alpha_1+\alpha_2+\cdots+\alpha_{n+1})+(b_2-b_1)\alpha_1+(b_3-b_1)\alpha_2+\cdots\\
&\qquad+(b_i-b_1)\alpha_{i-1}+(b_{i+1}-b_1)\alpha_{i+1}+\cdots+(b_{n+1}-b_1)\alpha_{n+1}\\
&\leq b_1(\alpha_1+\alpha_2+\cdots+\alpha_{n+1})+(b_2-b_1)\alpha_2+(b_3-b_1)\alpha_3+\cdots\\
&\qquad+(b_i-b_1)\alpha_i+(b_{i+1}-b_1)\alpha_{i+1}+\cdots+(b_{n+1}-b_1)\alpha_{n+1}\\
&=b_1\alpha_1+b_2\alpha_2+\cdots+b_{n+1}\alpha_{n+1}.\qquad\square
\end{aligned}
\]

We turn back our attention to the equation

(23) \(t^{n+1}-a_{n+1}t^{n}-a_{n}t^{n-1}-\cdots-a_{2}t-a_{1}=0\)

and we assume that \(a_i\geq 1\), \(a_i\in\mathbb{N}\), \(i=\overline{1,n+1}\), and \(\sum_{i=1}^{n+1}a_i=m+1\), \(m\in\mathbb{N}\). Denote by \(\delta_{n+1}\) the positive root of the above equation. The following result holds.

Lemma 2.5.

[7] The positive solution \(\delta_{n+1}\) of equation (23) verifies the relations:

(24) \((m+1)^{\frac{m+1}{(n+1)(m+1)-\sum_{i=1}^{n+1}(i-1)a_i}}\leq\delta_{n+1}\leq 1+\max_{1\leq i\leq n+1}\{a_i\},\quad n=1,2,\ldots\)
Proof.

Let

(25) \(\alpha=(m+1)^{\frac{m+1}{(n+1)(m+1)-\sum_{i=1}^{n+1}(i-1)a_i}}.\)

It is sufficient to prove that \(P_{n+1}(\alpha)\leq 0\), where \(P_{n+1}(t)=t^{n+1}-a_{n+1}t^{n}-\cdots-a_{2}t-a_{1}\).

We shall use for this the inequality between the weighted arithmetic and geometric means, i.e.

\(\dfrac{\sum_{i=1}^{n+1}\alpha_i p_i}{\sum_{i=1}^{n+1}p_i}\geq\Big(\prod_{i=1}^{n+1}\alpha_i^{p_i}\Big)^{\frac{1}{\sum_{i=1}^{n+1}p_i}},\quad \alpha_i>0,\ p_i\geq 0,\ i=\overline{1,n+1},\ \sum_{i=1}^{n+1}p_i>0.\)

Using this inequality we obtain

\[
\begin{aligned}
P_{n+1}(\alpha)&=\alpha^{n+1}-\sum_{i=1}^{n+1}a_i\alpha^{i-1}
=\alpha^{n+1}-\Big(\sum_{i=1}^{n+1}a_i\Big)\dfrac{\sum_{i=1}^{n+1}a_i\alpha^{i-1}}{\sum_{i=1}^{n+1}a_i}\\
&\leq\alpha^{n+1}-\Big(\sum_{i=1}^{n+1}a_i\Big)\Big(\prod_{i=1}^{n+1}\alpha^{(i-1)a_i}\Big)^{\frac{1}{\sum_{i=1}^{n+1}a_i}}
=\alpha^{n+1}-(m+1)\,\alpha^{\frac{\sum_{i=1}^{n+1}(i-1)a_i}{m+1}}\\
&=\alpha^{\frac{\sum_{i=1}^{n+1}(i-1)a_i}{m+1}}\Big[\alpha^{(n+1)-\frac{\sum_{i=1}^{n+1}(i-1)a_i}{m+1}}-(m+1)\Big]=0,
\end{aligned}
\]

i.e. \(P_{n+1}(\alpha)\leq 0\).

Remark 2.1.

It can be easily seen that the number \(\alpha\) given by (25) can be expressed using \(P'_{n+1}(1)\):

\(\alpha=(m+1)^{\frac{m+1}{m(n+1)+P'_{n+1}(1)}}.\)

The second part of relation (24) follows easily from the inequality \(P_{n+1}(a)>0\), where \(a=1+\max_{1\leq i\leq n+1}\{a_i\}\).
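Both bounds in (24) are easy to check numerically. The sketch below (with sample multiplicities, not taken from the paper) computes \(\delta_{n+1}\) by bisection and compares it with the two bounds.

```python
a = [1, 2, 2, 3]   # sample multiplicities a_i >= 1; here n+1 = 4, m+1 = sum(a) = 8
n1, m1 = len(a), sum(a)

# (23): P(t) = t^(n+1) - a_{n+1} t^n - ... - a_2 t - a_1 (a_1 is the constant term)
P = lambda t: t ** n1 - sum(ai * t ** i for i, ai in enumerate(a))

lo, hi = 1.0, 1.0 + max(a)  # upper bound of (24): delta <= 1 + max a_i
for _ in range(100):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if P(mid) < 0 else (lo, mid)
delta = (lo + hi) / 2

S = sum(i * ai for i, ai in enumerate(a))   # = sum of (i-1) a_i in 1-based indexing
lower = m1 ** (m1 / (n1 * m1 - S))          # left-hand side of (24)
print(lower, delta, 1 + max(a))
```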

Some more specific results concerning the bounds for the root δn+1 of equation (23) can be obtained in the case

(26) \(a_1=a_2=\cdots=a_{n+1}=q,\quad q\geq 1.\)

More precisely, denoting by \(\gamma_{n+1}(q)\) the positive root of the equation

(27) \(t^{n+1}-qt^{n}-qt^{n-1}-\cdots-qt-q=0,\)

then the following relations hold (see [15]):

  • a)

    \(\gamma_n(q)<\gamma_{n+1}(q),\quad n=1,2,\ldots;\)

  • b)

    \(\max\big\{q,\tfrac{n+1}{n+2}(q+1)\big\}<\gamma_{n+1}(q)<q+1,\quad n=1,2,\ldots;\)

  • c)

    \(\lim_{n\to\infty}\gamma_n(q)=q+1.\)

    For \(q=1\), from relations a)–c) we get (see [6]):

  • a’)

    \(\gamma_n(1)<\gamma_{n+1}(1),\quad n=1,2,\ldots;\)

  • b’)

    \(\tfrac{2(n+1)}{n+2}<\gamma_{n+1}(1)<2,\quad n=1,2,\ldots;\)

  • c’)

    \(\lim_{n\to\infty}\gamma_n(1)=2.\)
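Relations a’)–c’) are straightforward to verify numerically. The sketch below (an illustration, not part of the paper) computes \(\gamma_{k}(1)\) by bisection for a few values of \(k\) and checks the monotonicity and the bounds.

```python
def gamma(n1, q=1.0):
    # positive root of (27): t^n1 - q(t^(n1-1) + ... + t + 1) = 0, by bisection on (1, q+1)
    P = lambda t: t ** n1 - q * sum(t ** i for i in range(n1))
    lo, hi = 1.0, q + 1.0  # P(1) < 0, P(q+1) = 1 > 0
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if P(mid) < 0 else (lo, mid)
    return (lo + hi) / 2

vals = [gamma(k) for k in range(2, 9)]   # gamma_2(1), ..., gamma_8(1)
print([round(v, 6) for v in vals])       # increasing, approaching 2
```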

In the following we shall denote by \(m_p\) the number of function evaluations that must be performed when passing from step \(p\) to step \(p+1\) in the iterative methods (2), resp. (3), for \(p=1,2,\ldots\)

Concerning the efficiency index of methods (2) and (3), taking into account Lemma 2.1 and the definition given in [6], we get

Definition 2.2.

The real number \(E\) is called the efficiency index of the iterative method (2), resp. (3), if there exists

\(L=\lim_{p\to\infty}\Big(\dfrac{\ln|f(x_{p+1})|}{\ln|f(x_p)|}\Big)^{\frac{1}{m_p}}\)

and \(L=E\).

Remark 2.2.

If for methods (2) and (3) there exists a natural number \(s_0\) such that \(m_s=r\) for all \(s>s_0\), and \(\omega\) is the convergence order of these methods, then the efficiency index \(E\) is given by the following expression:

(28) \(E=\omega^{\frac{1}{r}}.\)

3. Iterative methods of interpolatory type

In the following we shall briefly present the Lagrange-Hermite-type inverse interpolatory polynomial. It is well known that this leads us to general classes of iterative methods from which, by suitable particularizations we obtain usual methods as Newton’s method, chord method, Chebyshev’s method, etc.

For the sake of simplicity we prefer to treat separately the Hermite polynomial and the Lagrange polynomial, though the last is a particular case of the first.

As we shall see, a suitable choice of the nodes enables us to improve the convergence orders of Lagrange-Hermite-type methods. We shall call such methods Steffensen-type methods.

3.1. Lagrange-type inverse interpolation

Denote by \(F=f(I)\) the range of \(f\) for \(x\in I\). Suppose \(f\) is \(n+1\) times differentiable and \(f'(x)\neq 0\) for all \(x\in I\). It follows that \(f\) is invertible and there exists \(f^{-1}:F\to I\). Consider \(n+1\) interpolation nodes in \(I\):

(29) \(x_1,x_2,\ldots,x_{n+1}\in I,\quad x_i\neq x_j\ \text{for}\ i,j=\overline{1,n+1},\ i\neq j.\)

In the above hypotheses it follows that the solution \(\bar{x}\) of equation (1) is given by

\(\bar{x}=f^{-1}(0).\)

Using the Lagrange interpolatory polynomial for the function \(f^{-1}\) at the nodes \(f(x_1),\ldots,f(x_{n+1})\) we shall determine an approximation for \(f^{-1}(0)\), i.e. for \(\bar{x}\).

Denote \(y_i=f(x_i)\), \(i=\overline{1,n+1}\), and let \(L(y_1,y_2,\ldots,y_{n+1};f^{-1}|y)\) be the mentioned polynomial, which is known to have the form

\(L(y_1,y_2,\ldots,y_{n+1};f^{-1}|y)=\sum_{i=1}^{n+1}x_i\,\dfrac{\omega_1(y)}{(y-y_i)\,\omega_1'(y_i)},\)

where \(\omega_1(y)=\prod_{i=1}^{n+1}(y-y_i).\)

The following equality holds

(30) \(f^{-1}(y)=L(y_1,y_2,\ldots,y_{n+1};f^{-1}|y)+R(f^{-1},y),\)

where

\(R(f^{-1},y)=\dfrac{[f^{-1}(\theta_1)]^{(n+1)}}{(n+1)!}\,\omega_1(y)\)

and \(\min\{y,f(x_1),\ldots,f(x_{n+1})\}<\theta_1<\max\{y,f(x_1),\ldots,f(x_{n+1})\}.\)

It is also known that under the mentioned hypotheses concerning the differentiability of \(f\) on \(I\), the function \(f^{-1}\) admits derivatives of any order \(k\), \(1\leq k\leq n+1\), for all \(y\in F\), and the following equality holds [12], [16]:

(31) \([f^{-1}(y)]^{(k)}=\sum\dfrac{(2k-i_1-2)!\,(-1)^{k+i_1-1}}{i_2!\,i_3!\cdots i_k!\,[f'(x)]^{2k-1}}\Big(\dfrac{f'(x)}{1!}\Big)^{i_1}\Big(\dfrac{f''(x)}{2!}\Big)^{i_2}\cdots\Big(\dfrac{f^{(k)}(x)}{k!}\Big)^{i_k},\quad k=\overline{1,n+1},\)

where \(y=f(x)\) and the above sum extends over all nonnegative integer solutions of the system

\[
\begin{cases}
i_2+2i_3+\cdots+(k-1)i_k=k-1,\\
i_1+i_2+\cdots+i_k=k-1.
\end{cases}
\]

From (30), neglecting \(R(f^{-1},0)\), we obtain the following approximation for \(\bar{x}\):

\(\bar{x}\approx L(y_1,y_2,\ldots,y_{n+1};f^{-1}|0).\)

Denoting

\(x_{n+2}=L(y_1,y_2,\ldots,y_{n+1};f^{-1}|0),\)

we obtain

\(|x_{n+2}-\bar{x}|=\dfrac{|[f^{-1}(\theta_1)]^{(n+1)}|}{(n+1)!}\,|\omega_1(0)|,\)

where \(\min\{0,f(x_1),\ldots,f(x_{n+1})\}<\theta_1<\max\{0,f(x_1),\ldots,f(x_{n+1})\}.\)

It is clear that if \(x_s,x_{s+1},\ldots,x_{s+n}\) are \(n+1\) distinct approximations of the solution \(\bar{x}\) of equation (1), then a new approximation \(x_{s+n+1}\) can be obtained as above, i.e.

(32) \(x_{s+n+1}=L(y_s,y_{s+1},\ldots,y_{s+n};f^{-1}|0),\quad s=1,2,\ldots,\)

with the error estimate given by

(33) \(|x_{s+n+1}-\bar{x}|=\dfrac{|[f^{-1}(\theta_s)]^{(n+1)}|}{(n+1)!}\prod_{i=0}^{n}|f(x_{s+i})|,\quad s=1,2,\ldots,\)

where \(\theta_s\) belongs to the smallest open interval containing \(0,f(x_s),\ldots,f(x_{s+n})\).

If in (33) we replace \(|x_{s+n+1}-\bar{x}|=\dfrac{|f(x_{s+n+1})|}{|f'(\alpha_s)|}\), we obtain for the sequence \((f(x_p))_{p\geq 0}\) the relations:

(34) \(|f(x_{s+n+1})|=|f'(\alpha_s)|\,\dfrac{|[f^{-1}(\theta_s)]^{(n+1)}|}{(n+1)!}\prod_{i=0}^{n}|f(x_{s+i})|,\)

where \(\alpha_s\) belongs to the open interval determined by \(\bar{x}\) and \(x_{s+n+1}\).

Suppose that \(c_s=|f'(\alpha_s)|\,\dfrac{|[f^{-1}(\theta_s)]^{(n+1)}|}{(n+1)!}\), \(s\in\mathbb{N}\), satisfies the hypotheses of Lemma 2.2 and that the sequence \((f(x_p))_{p\geq 0}\) converges to zero, where \((x_p)_{p\geq 0}\) is generated by (32). Then the convergence order of this sequence is equal to the positive solution of the equation

\(t^{n+1}-t^{n}-t^{n-1}-\cdots-t-1=0.\)
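A minimal sketch of iteration (32), under assumptions of our own (the test equation \(x^3-2=0\) and the starting nodes are illustrative): the Lagrange polynomial is built for the inverse function, i.e. through the points \((f(x_i),x_i)\), evaluated at \(y=0\), and at each step the oldest node is discarded.

```python
def inverse_lagrange_step(xs, f):
    """One step of (32): build the Lagrange polynomial of the inverse function,
    i.e. through the points (f(x_i), x_i), and evaluate it at y = 0."""
    ys = [f(x) for x in xs]
    total = 0.0
    for i, (yi, xi) in enumerate(zip(ys, xs)):
        li = 1.0
        for j, yj in enumerate(ys):
            if j != i:
                li *= (0.0 - yj) / (yi - yj)
        total += xi * li
    return total

f = lambda x: x ** 3 - 2      # assumed test equation; root 2^(1/3) = 1.2599...
nodes = [1.0, 1.2, 1.4]       # n + 1 = 3 nodes: order given by t^3 - t^2 - t - 1 = 0
for _ in range(6):
    nodes = nodes[1:] + [inverse_lagrange_step(nodes, f)]  # drop the oldest node
print(nodes[-1])
```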

3.2. Hermite-type inverse interpolation

Consider in the following, besides the interpolation nodes (29), \(n+1\) natural numbers \(a_1,a_2,\ldots,a_{n+1}\), where \(a_i\geq 1\), \(i=\overline{1,n+1}\), and

\(a_1+a_2+\cdots+a_{n+1}=m+1.\)

We shall suppose here too, for simplicity, that \(f\) is \(m+1\) times differentiable on \(I\). From this and from \(f'(x)\neq 0\) for all \(x\in I\), it follows, by (31), that \(f^{-1}\) is also \(m+1\) times differentiable on \(F\). Denoting \(y_i=f(x_i)\), \(i=\overline{1,n+1}\), the Hermite polynomial for the nodes \(y_i\), \(i=\overline{1,n+1}\), has the following form:

(35) \(H(y_1,a_1;y_2,a_2;\ldots;y_{n+1},a_{n+1};f^{-1}|y)=\sum_{i=1}^{n+1}\sum_{j=0}^{a_i-1}\sum_{k=0}^{a_i-j-1}\big(f^{-1}(y_i)\big)^{(j)}\dfrac{1}{k!\,j!}\Big(\dfrac{(y-y_i)^{a_i}}{\omega_1(y)}\Big)^{(k)}_{y=y_i}\dfrac{\omega_1(y)}{(y-y_i)^{a_i-j-k}},\)

where

\(\omega_1(y)=\prod_{i=1}^{n+1}(y-y_i)^{a_i}.\)

If \(x_s,x_{s+1},\ldots,x_{s+n}\) are \(n+1\) distinct approximations of the solution \(\bar{x}\) of equation (1), then the next approximation \(x_{s+n+1}\) can be obtained as before in the following way:

(36) \(x_{s+n+1}=H(y_s,a_1;\ldots;y_{s+n},a_{n+1};f^{-1}|0),\quad s=1,2,\ldots,\)

where, as in (35),

\(\omega_s(y)=\prod_{i=s}^{s+n}(y-y_i)^{a_{i-s+1}}.\)

It can be easily seen that the following equality holds:

(37) \(|f(x_{s+n+1})|=|f'(\beta_s)|\,\dfrac{|[f^{-1}(\theta_s'')]^{(m+1)}|}{(m+1)!}\prod_{i=0}^{n}|f(x_{s+i})|^{a_{i+1}},\quad s=1,2,\ldots,\)

where \(\theta_s''\) belongs to the smallest open interval containing \(0,y_s,y_{s+1},\ldots,y_{s+n}\) and \(\beta_s\) belongs to the open interval determined by \(\bar{x}\) and \(x_{s+n+1}\).

If we suppose that

\(c_s=|f'(\beta_s)|\,\dfrac{|[f^{-1}(\theta_s'')]^{(m+1)}|}{(m+1)!},\quad s\in\mathbb{N},\)

verify the hypotheses of Lemma 2.2 and, moreover, \(\lim_{s\to\infty}f(x_s)=0\), then it is clear that the convergence order of the method (36) is given by the positive solution of the equation

(38) \(t^{n+1}-a_{n+1}t^{n}-a_{n}t^{n-1}-\cdots-a_{2}t-a_{1}=0.\)

In the following we shall consider a particular case of (36).

For \(a_1=a_2=\cdots=a_{n+1}=q\), from (36) we obtain

(39) \(x_{s+n+1}=H(y_s,q;y_{s+1},q;\ldots;y_{s+n},q;f^{-1}|0),\)

a method having the convergence order given by the positive solution of the equation

(40) \(t^{n+1}-qt^{n}-qt^{n-1}-\cdots-qt-q=0.\)

3.3. Aitken-Steffensen type iterative methods

Let \(\varphi_i:I\to I\), \(i=1,\ldots,n+1\), be \(n+1\) functions having the following properties:

  • α)

    \(\varphi_i(\bar{x})=\bar{x}\), \(i=\overline{1,n+1}\), where \(\bar{x}\) is the solution of (1);

  • β)

    there exist \(n+1\) continuous functions \(g_i:I\to\mathbb{R}\), \(g_i(x)\neq 0\), \(x\in I\), and the real numbers \(p_i\geq 1\), \(i=\overline{1,n+1}\), such that the following equalities hold:

    (41) \(|f(\varphi_i(x))|=g_i(x)\,|f(x)|^{p_i},\quad i=\overline{1,n+1}.\)

Denote by \(u_0\in I\) an initial approximation of the root \(\bar{x}\) of (1). We construct the \(n+1\) interpolation nodes \(x_i^1\), \(i=\overline{1,n+1}\), in the following way:

(42) \(x_1^1=\varphi_1(u_0),\qquad x_{i+1}^1=\varphi_{i+1}(x_i^1),\quad i=\overline{1,n}.\)

Next, we compute \(y_i^1=f(x_i^1)\), \(i=\overline{1,n+1}\), and we consider the natural numbers \(\alpha_i\), \(i=\overline{1,n+1}\), such that

\(\alpha_1+\alpha_2+\cdots+\alpha_{n+1}=m+1.\)

Taking as interpolation nodes the numbers \(y_i^1\), \(i=\overline{1,n+1}\), and the Hermite interpolatory polynomial determined by these nodes with the corresponding multiplicities \(\alpha_i\), \(i=\overline{1,n+1}\), we obtain for \(\bar{x}\) the following approximation:

(43) \(u_1=H(y_1^1,\alpha_1;y_2^1,\alpha_2;\ldots;y_{n+1}^1,\alpha_{n+1};f^{-1}|0).\)

The error is given by

(44) \(|\bar{x}-u_1|=\dfrac{|[f^{-1}(\xi_1)]^{(m+1)}|}{(m+1)!}\,|\omega_1(0)|,\)

where \(\xi_1\) is a point belonging to the smallest interval determined by the points \(0\) and \(y_i^1\), \(i=\overline{1,n+1}\), while \(\omega_1\) has the following form:

(45) \(|\omega_1(0)|=|f(x_1^1)|^{\alpha_1}\,|f(x_2^1)|^{\alpha_2}\cdots|f(x_{n+1}^1)|^{\alpha_{n+1}}.\)

Taking into account hypothesis β) for the functions φi, we get

\(|f(x_1^1)|=|f(\varphi_1(u_0))|=g_1(u_0)\,|f(u_0)|^{p_1},\)
\(|f(x_2^1)|=g_2(x_1^1)\,|f(x_1^1)|^{p_2}=g_2(x_1^1)\,g_1^{p_2}(u_0)\,|f(u_0)|^{p_1p_2},\)

and in general

(46) \(|f(x_{i+1}^1)|=g_{i+1}(x_i^1)\,\big(g_i(x_{i-1}^1)\big)^{p_{i+1}}\cdots\big(g_1(u_0)\big)^{p_2p_3\cdots p_{i+1}}\,|f(u_0)|^{p_1p_2\cdots p_{i+1}},\quad i=\overline{1,n}.\)

Denote

\(\alpha=\sum_{i=1}^{n+1}\alpha_i\prod_{j=1}^{i}p_j\)

and

(47) \(\rho(u_0)=\prod_{i=1}^{n+1}\big[g_i(x_i^1)\big]^{\theta_i},\)

where

\(\theta_i=\alpha_i+\sum_{j=i+1}^{n+1}\alpha_j\prod_{k=i+1}^{j}p_k.\)

With these notations, from (44)–(46) we obtain

(48) \(|\bar{x}-u_1|=\dfrac{|[f^{-1}(\xi_1)]^{(m+1)}|}{(m+1)!}\,\rho(u_0)\,|f(u_0)|^{\alpha}.\)

Let uk1 be an arbitrary approximation of the solution x¯, obtained by the continuation of the process given by (43). Then the next approximation is constructed in the following way.

Consider the interpolation nodes \(x_i^k\), \(i=\overline{1,n+1}\), given by the relations

\(x_1^k=\varphi_1(u_{k-1}),\qquad x_{i+1}^k=\varphi_{i+1}(x_i^k),\quad i=\overline{1,n},\ k\geq 2.\)

Then \(u_k\) is given by

(49) \(u_k=H(y_1^k,\alpha_1;y_2^k,\alpha_2;\ldots;y_{n+1}^k,\alpha_{n+1};f^{-1}|0),\)

where \(y_i^k=f(x_i^k)\), \(i=\overline{1,n+1}\), with the error estimation

(50) \(|\bar{x}-u_k|=\rho_{k-1}\,\dfrac{|[f^{-1}(\xi_k)]^{(m+1)}|}{(m+1)!}\,|f(u_{k-1})|^{\alpha},\quad k=2,3,\ldots,\)

where \(\xi_k\) is a point belonging to the smallest interval determined by \(0\) and \(y_i^k\), \(i=\overline{1,n+1}\), and \(\rho_{k-1}\) has a form analogous to that given in (47) for \(\rho(u_0)\).

From (50) we get

(51) \(|f(u_k)|=\beta\,\rho_{k-1}\,\dfrac{|[f^{-1}(\xi_k)]^{(m+1)}|}{(m+1)!}\,|f(u_{k-1})|^{\alpha},\quad k=2,3,\ldots,\)

where \(\beta=\max_{x\in I}|f'(x)|.\)

It is obvious now that if \(\lim u_k=\bar{x}\), then the convergence order of the process (49) is \(\alpha\), where

(52) \(\alpha=\sum_{i=1}^{n+1}\alpha_i\prod_{j=1}^{i}p_j.\)

We shall consider in the following the particular case when

\(\varphi_1=\varphi_2=\cdots=\varphi_{n+1}=\varphi\quad\text{and}\quad p_1=p_2=\cdots=p_{n+1}=1.\)

We assume that f and φ satisfy

(53) \(|f(\varphi(x))|=g(x)\,|f(x)|,\)

where \(g:I\to\mathbb{R}\), \(g(x)>0\) for all \(x\in I\).

Let \(x_s\in I\) be an approximation for the solution \(\bar{x}\). Denote \(u_s=x_s\), \(u_{s+1}=\varphi(u_s),\ldots,u_{s+n}=\varphi(u_{s+n-1})\) and \(\bar{y}_s=f(u_s),\ldots,\bar{y}_{s+n}=f(u_{s+n})\). Taking into account the above assumption, by (32) we get the following Steffensen-type method:

(54) \(x_{s+1}=L(\bar{y}_s,\bar{y}_{s+1},\ldots,\bar{y}_{s+n};f^{-1}|0),\quad x_1\in I,\ s=1,2,\ldots\)

Similarly, by (36) it follows:

(55) \(x_{s+1}=H(\bar{y}_s,a_1;\bar{y}_{s+1},a_2;\ldots;\bar{y}_{s+n},a_{n+1};f^{-1}|0),\quad s=1,2,\ldots,\ x_1\in I.\)

By (53) we obtain the following representations for \(\bar{y}_{s+i}\), \(i=\overline{1,n}\):

\(\bar{y}_{s+i}=f(u_{s+i})=p_{s,i-1}\,f(x_s),\quad i=\overline{1,n},\)

where

\(p_{s,i-1}=\prod_{j=s}^{s+i-1}g(u_j).\)

Taking into account the above considerations, by (34) we obtain:

(56) \(|f(x_{s+1})|=|f'(\alpha_s)|\,\dfrac{|[f^{-1}(\mu_s)]^{(n+1)}|}{(n+1)!}\prod_{i=1}^{n}p_{s,i-1}\,|f(x_s)|^{n+1},\quad s=1,2,\ldots,\)

and analogously, by (37) we get

(57) \(|f(x_{s+1})|=|f'(\beta_s)|\,\dfrac{|[f^{-1}(\mu_s)]^{(m+1)}|}{(m+1)!}\prod_{i=1}^{n+1}p_{s,i-1}^{a_i}\,|f(x_s)|^{m+1},\quad s=1,2,\ldots\)

From Lemma 2.1 it follows that methods (54) and (55) have the convergence orders \(n+1\), resp. \(m+1\).

3.4. Some particular cases

In what follows we shall discuss some particular cases.

The case n=0. From (35) one obtains the Taylor inverse interpolating polynomial:

(58) \(T(y)=x_1+\dfrac{[f^{-1}(y_1)]'}{1!}(y-y_1)+\cdots+\dfrac{[f^{-1}(y_1)]^{(\alpha_1-1)}}{(\alpha_1-1)!}(y-y_1)^{\alpha_1-1},\)

while, from (31), we obtain the following expressions for the successive derivatives \([f^{-1}(y)]^{(k)}\), \(k=1,2,3,4\):

(59) \([f^{-1}(y)]'=\dfrac{1}{f'(x)},\)
(60) \([f^{-1}(y)]''=-\dfrac{f''(x)}{[f'(x)]^3},\)
(61) \([f^{-1}(y)]'''=\dfrac{3[f''(x)]^2-f'''(x)f'(x)}{[f'(x)]^5},\)
(62) \([f^{-1}(y)]^{(4)}=-\dfrac{[f'(x)]^2f^{(4)}(x)-10f'(x)f''(x)f'''(x)+15[f''(x)]^3}{[f'(x)]^7}.\)

From (59) and (58) for α1=2 we obtain:

\(T(y)=x_1+\dfrac{1}{f'(x_1)}\,(y-f(x_1)),\)

which, for \(y=0\), leads to the approximation \(x_2\) of \(\bar{x}\) given by the expression

(63) \(x_2=x_1-\dfrac{f(x_1)}{f'(x_1)},\)

i.e., to Newton's method.
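Iterating (63) gives the familiar scheme; a minimal sketch (the test function and tolerance are assumptions for illustration):

```python
def newton(f, df, x, tol=1e-12, max_iter=50):
    """Newton's method (63): repeat x <- x - f(x)/f'(x)."""
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x

print(newton(lambda x: x * x - 2, lambda x: 2 * x, 1.5))  # ~ 1.41421356...
```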

From (59), (60) and (58), for \(\alpha_1=3\) we obtain Chebyshev's method, i.e.:

(64) \(x_2=x_1-\dfrac{f(x_1)}{f'(x_1)}-\dfrac{1}{2}\,\dfrac{f''(x_1)f^2(x_1)}{[f'(x_1)]^3}.\)

Finally, from (59), (60), (61) and (58), for \(\alpha_1=4\) we obtain:

(65) \(x_2=x_1-\dfrac{f(x_1)}{f'(x_1)}-\dfrac{1}{2}\,\dfrac{f''(x_1)f^2(x_1)}{[f'(x_1)]^3}+\dfrac{\big(f'''(x_1)f'(x_1)-3[f''(x_1)]^2\big)f^3(x_1)}{6[f'(x_1)]^5}.\)

From the above methods one obtains by iteration the corresponding sequences of approximations, which have the convergence orders 2, 3 and 4, respectively.

As one may notice from (62) and (36), for \(\alpha_1\geq 5\) the expressions for the derivatives \([f^{-1}(y)]^{(k)}\), \(k\geq 4\), have a more complex form. That is why the methods following from (58) in these cases are also complex.

The case n=1. In this case, from (35) it follows:

(66) \(P(y)=\sum_{i=1}^{2}\sum_{j=0}^{\alpha_i-1}\sum_{k=0}^{\alpha_i-j-1}[f^{-1}(y_i)]^{(j)}\dfrac{1}{k!\,j!}\Big[\dfrac{(y-y_i)^{\alpha_i}}{\omega(y)}\Big]^{(k)}_{y=y_i}\dfrac{\omega(y)}{(y-y_i)^{\alpha_i-j-k}},\)

where:

(67) \(\omega(y)=(y-y_1)^{\alpha_1}(y-y_2)^{\alpha_2}.\)

From (66) one obtains two iterative methods; namely, denoting as above by \(H(y_1,\alpha_1;y_2,\alpha_2;f^{-1}|y)\) the Hermite inverse interpolating polynomial (66), we find:

(68) \(\begin{cases}x_3=H(y_1,\alpha_1;y_2,\alpha_2;f^{-1}|0),&x_1,x_2\in I,\ y_1=f(x_1),\ y_2=f(x_2),\\ x_{n+1}=H(y_{n-1},\alpha_1;y_n,\alpha_2;f^{-1}|0),&n=3,4,\ldots,\end{cases}\)

or

(69) \(\begin{cases}x_3=H(y_1,\alpha_2;y_2,\alpha_1;f^{-1}|0),&x_1,x_2\in I,\ y_1=f(x_1),\ y_2=f(x_2),\\ x_{n+1}=H(y_{n-1},\alpha_2;y_n,\alpha_1;f^{-1}|0),&n=3,4,\ldots\end{cases}\)

The characteristic equations which provide the convergence orders for the two methods are:

(70) \(t^2-\alpha_2t-\alpha_1=0\)

for method (68), and:

(71) \(t^2-\alpha_1t-\alpha_2=0\)

for the method (69).

If we denote by \(\omega_1\) and respectively \(\omega_2\) the positive roots of equations (70) and (71), then it is clear that \(\alpha_2\geq\alpha_1\) implies \(\omega_2\leq\omega_1\); so, the method with optimal convergence order is the method (68).

Now, we shall briefly discuss some particular cases.

From (66), for α1=α2=1, we obtain

(72) \(P_1(y)=(y_1-y_2)^{-1}\big[(y-y_2)f^{-1}(y_1)-(y-y_1)f^{-1}(y_2)\big],\)

whence, taking into account the fact that \(f^{-1}(y_1)=x_1\) and \(f^{-1}(y_2)=x_2\), we find for \(y=0\)

(73) \(x_3=x_1-\dfrac{x_2-x_1}{f(x_2)-f(x_1)}\,f(x_1)=x_1-\dfrac{f(x_1)}{[x_1,x_2;f]},\)

where \([x_1,x_2;f]\) stands for the first order divided difference of the function \(f\) on the nodes \(x_1\) and \(x_2\), and in general,

(74) \(x_{n+1}=x_{n-1}-\dfrac{f(x_{n-1})}{[x_{n-1},x_n;f]},\quad n=3,4,\ldots,\)

which is the chord method. In this case, since \(\alpha_1=\alpha_2\), the above method has the same convergence order as the other one, i.e.:

(75) \(x_{n+1}=x_n-\dfrac{f(x_n)}{[x_{n-1},x_n;f]},\quad n=2,3,\ldots\)

The convergence order of the method (74) is \(\omega_1=\frac{1}{2}(1+\sqrt{5})\).
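The order \(\frac{1}{2}(1+\sqrt{5})\approx 1.618\) can be observed empirically via the ratio of Lemma 2.1. The sketch below (illustrative; the test function is an assumption) iterates the chord method in the form (75) with high-precision decimals:

```python
from decimal import Decimal, getcontext

getcontext().prec = 200  # the error exponents grow roughly like the golden ratio

f = lambda x: x * x - 2
root = Decimal(2).sqrt()
x0, x1 = Decimal(1), Decimal(2)
errs = []
for _ in range(10):
    # (75): x_{n+1} = x_n - f(x_n) / [x_{n-1}, x_n; f]
    x0, x1 = x1, x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
    errs.append(abs(x1 - root))

orders = [float(errs[i + 1].ln() / errs[i].ln()) for i in range(len(errs) - 1)]
print(orders[-1])  # ~ (1 + 5**0.5) / 2
```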

Now we shall discuss the case \(\alpha_1=1\), \(\alpha_2=2\). In this particular case, we obtain from (66) the following iterative methods:

(76) \(x_{n+2}=x_n-\dfrac{x_{n+1}-x_n}{f(x_{n+1})-f(x_n)}\,f(x_n)+\dfrac{f(x_{n+1})-f(x_n)-(x_{n+1}-x_n)f'(x_{n+1})}{[f(x_{n+1})-f(x_n)]^2\,f'(x_{n+1})}\,f(x_n)f(x_{n+1}),\)

\(n=1,2,\ldots,\ x_1,x_2\in I,\)

and

(77) \(x_{n+2}=x_{n+1}-\dfrac{x_n-x_{n+1}}{f(x_n)-f(x_{n+1})}\,f(x_{n+1})+\dfrac{f(x_n)-f(x_{n+1})-(x_n-x_{n+1})f'(x_n)}{[f(x_n)-f(x_{n+1})]^2\,f'(x_n)}\,f(x_n)f(x_{n+1}).\)

Solving the corresponding characteristic equations, we find the convergence orders \(\omega_1=1+\sqrt{2}\) for the method (76) and \(\omega_2=2\) for the method (77).

As we showed above, the Hermite inverse interpolating polynomial leads to a large class of iterative methods. The convergence order of each method depends on the number of interpolation nodes, on their orders of multiplicity and, essentially, on the interpolation node which is replaced at each iteration step by the one computed at the previous step.

As Steffensen noticed in the case of method (74), the convergence order of this method can be increased if at each iteration step the element \(x_n\) depends in a certain manner on \(x_{n-1}\). More exactly, if we consider a function \(\varphi:I\to I\) having the property \(\varphi(\bar{x})=\bar{x}\), where \(\bar{x}\) is the root of equation (1), and if we put \(x_n=\varphi(x_{n-1})\) into (74), then we obtain the sequence \((x_n)_{n\geq 1}\) generated by Steffensen's method:

\(x_n=x_{n-1}-\dfrac{f(x_{n-1})}{[x_{n-1},\varphi(x_{n-1});f]},\quad n=2,3,\ldots,\)

which has, as is well known, the convergence order 2.
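A minimal sketch, assuming the common (but here merely illustrative) auxiliary function \(\varphi(x)=x+f(x)\), which satisfies \(\varphi(\bar{x})=\bar{x}\):

```python
def steffensen(f, x, tol=1e-12, max_iter=50):
    """Steffensen-type step with phi(x) = x + f(x) (so phi(xbar) = xbar):
    x_next = x - f(x) / [x, phi(x); f], a divided difference replacing f'."""
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0:
            return x
        dd = (f(x + fx) - fx) / fx   # [x, x + f(x); f]
        if dd == 0:                  # fully converged at machine precision
            return x
        x_next = x - fx / dd
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

print(steffensen(lambda x: x * x - 2, 1.5))  # ~ sqrt(2)
```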

4. Optimal convergence order

4.1. Optimal convergence order of the iterative methods of Hermite type

As we have seen in the particular cases presented in §3.4, in the case n=1 the Hermite inverse interpolation polynomial, for \(\alpha_1\neq\alpha_2\), leads to two different iterative methods (see (68) and (69)). Of these two, method (68) has a convergence order greater than the other one. In the following we shall use Lemma 2.3 in order to generalize the iterative methods (68) and (69). It is clear that the convergence order of method (36) depends on the multiplicities of the interpolation nodes which are replaced at each iteration step, so that we are led to different configurations of the coefficients in equation (38). Formula (36) generates \((n+1)!\) iterative methods, with respect to the algorithm of changing the interpolation nodes at each iteration step. Among those \((n+1)!\) methods, we shall determine in the following the method with the highest convergence order, i.e. the optimal method. For this purpose we shall proceed as follows.

Consider the permutation \(i_1,i_2,\ldots,i_{n+1}\) of the numbers \(1,2,\ldots,n+1\) for which the natural numbers \(\alpha_1,\alpha_2,\ldots,\alpha_{n+1}\), satisfying the equality \(\alpha_1+\alpha_2+\cdots+\alpha_{n+1}=m+1\), are increasingly ordered, namely:

(78) \(\alpha_{i_1}\leq\alpha_{i_2}\leq\cdots\leq\alpha_{i_n}\leq\alpha_{i_{n+1}}.\)

We renumber accordingly the set of interpolation nodes, i.e. we consider:

\(E=\{x_{i_1},x_{i_2},\ldots,x_{i_{n+1}}\}.\)

For the sake of clarity we shall set:

(79) \(a_s=\alpha_{i_s},\quad s=1,2,\ldots,n+1,\)

and

(80) \(u_s=x_{i_s},\quad s=1,2,\ldots,n+1,\)

and denote by \(H(y_1,a_1;y_2,a_2;\ldots;y_{n+1},a_{n+1};f^{-1}|y)\) the Hermite interpolating polynomial corresponding to the nodes \(y_i=f(u_i)\), \(i=1,2,\ldots,n+1\), having the multiplicities \(a_1,a_2,\ldots,a_{n+1}\), respectively.

Let \(u_1,u_2,\ldots,u_{n+1}\) be the \(n+1\) initial approximations of the root \(\bar{x}\) of equation (1). We construct the sequence \((u_p)_{p\geq 1}\) by means of the following iterative procedure:

(81) \(\begin{cases}u_{n+2}=H(y_1,a_1;y_2,a_2;\ldots;y_{n+1},a_{n+1};f^{-1}|0),\\ u_{n+s+1}=H(y_s,a_1;y_{s+1},a_2;\ldots;y_{s+n},a_{n+1};f^{-1}|0),&s=2,3,\ldots\end{cases}\)

Consider all the \((n+1)!\) permutations of the set \(\{1,2,\ldots,n+1\}\). To each permutation \(i_1,i_2,\ldots,i_{n+1}\) there corresponds an iterative method of the form:

(82) \(\begin{cases}x_{n+2}=H(y_{i_1},\alpha_{i_1};y_{i_2},\alpha_{i_2};\ldots;y_{i_{n+1}},\alpha_{i_{n+1}};f^{-1}|0),\\ x_{n+s+2}=H(y_{i_1+s},\alpha_{i_1};y_{i_2+s},\alpha_{i_2};\ldots;y_{i_{n+1}+s},\alpha_{i_{n+1}};f^{-1}|0),&s=1,2,\ldots\end{cases}\)

All together we have \((n+1)!\) iterative methods.

Taking  into account Lemma 2.3 and the results proved so far, we can state the following theorem:

Theorem 4.1.

Out of the \((n+1)!\) iterative methods of the form (82), the one with the greatest convergence order (namely, the one which provides the best upper limit for the absolute value of the error) is that determined by the permutation \(i_1,i_2,\ldots,i_{n+1}\) which orders the numbers \(\alpha_{i_1},\alpha_{i_2},\ldots,\alpha_{i_{n+1}}\) increasingly, namely \(\alpha_{i_1}\leq\alpha_{i_2}\leq\cdots\leq\alpha_{i_{n+1}}\).

4.2. Aitken-Steffensen-type optimal methods

In the following we shall solve an optimization problem analogous to the one treated in Section 4.1, but now for the case of the Aitken-Steffensen method.

We shall consider the methods of type (49), and for determining the optimal algorithm we shall rely on Lemma 2.4.

Let \((k_1,k_2,\ldots,k_{n+1})\) and \((j_1,j_2,\ldots,j_{n+1})\) be two arbitrary permutations of the numbers \(1,2,\ldots,n+1\). Also denote by

\(H(y)=H(y_{k_1}^1,\alpha_{j_1};y_{k_2}^1,\alpha_{j_2};\ldots;y_{k_{n+1}}^1,\alpha_{j_{n+1}};f^{-1}|y)\)

the Hermite inverse interpolating polynomial having the interpolation nodes \(y_{k_i}^1\), with the orders of multiplicity \(\alpha_{j_i}\), \(i=1,2,\ldots,n+1\).

With the above notations, let us consider the following class of iterative methods

(83) \(u_s=H(y_{k_1}^s,\alpha_{j_1};y_{k_2}^s,\alpha_{j_2};\ldots;y_{k_{n+1}}^s,\alpha_{j_{n+1}};f^{-1}|0),\quad s=1,2,\ldots,\)

where

\(y_{k_i}^s=f(x_{k_i}^s),\quad i=1,2,\ldots,n+1;\ s=1,2,\ldots,\)

and

(84) \(x_{k_1}^s=\varphi_{k_1}(u_{s-1}),\qquad x_{k_i}^s=\varphi_{k_i}(x_{k_{i-1}}^s),\quad i=2,3,\ldots,n+1;\ s=1,2,\ldots,\)

u0 being the given initial approximation.

To each couple of permutations \((k_1,k_2,\ldots,k_{n+1})\) and \((j_1,j_2,\ldots,j_{n+1})\) of the numbers \(1,2,\ldots,n+1\) there corresponds an iterative method of the form (83); all together we obtain again a family of iterative methods of this form.

We shall attempt to determine, out of these iterative methods, the one for which the number \(\alpha\) given by (52) is maximum.

Theorem 4.2.

Out of all the iterative methods of the form (83)–(84), the one for which the convergence order \(\alpha\) given by (52) attains its maximum value is the method determined by the order of the numbers \(p_i\), \(\alpha_i\), \(i=1,2,\ldots,n+1\), given by the inequalities (16).

The proof of this theorem follows immediately from Lemma 2.4 and (52).

5. Optimal efficiency

We shall analyse in the following the efficiency index of each of the methods described and, under the hypotheses adopted below, we shall determine the optimal methods, i.e. those having the highest efficiency index.

As we have seen, the formulae for computing the derivatives of \(f^{-1}\) have a complicated form, and they depend on the successive derivatives of \(f\). However, when the orders of the derivatives of \(f^{-1}\) are low, the values of these derivatives are obtained by only a few elementary operations. Taking into account the generality of the problem, we shall consider each computation of the value of any derivative of \(f^{-1}\) by (31) as a single function evaluation. For similar reasons we shall also consider each evaluation of an inverse interpolatory polynomial as a single function evaluation.

As will follow from our reasoning, the methods having the optimal efficiency index are generally the simple ones, using one or two interpolation nodes and the derivatives of \(f^{-1}\) up to the second order.

Remark that in our case we can use for the efficiency index relation (28).

5.1. Optimal Chebyshev-type methods

Taking n=0 in (36) we obtain again Chebyshev’s method, i.e.

(85) \(x_{s+1}=x_s-\dfrac{[f^{-1}(y_s)]'}{1!}f(x_s)+\dfrac{[f^{-1}(y_s)]''}{2!}f^2(x_s)-\cdots+(-1)^m\dfrac{[f^{-1}(y_s)]^{(m)}}{m!}f^m(x_s),\)

\(s=1,2,\ldots\), where \(y_s=f(x_s)\), the convergence order being \(m+1\).

Observe that for passing from the \(s\)-th iteration step to the \((s+1)\)-th in method (85), the following evaluations must be performed:

\(f(x_s),\ f'(x_s),\ldots,f^{(m)}(x_s),\)

i.e. \(m+1\) values.

Then, by (31), we perform the following \(m\) function evaluations:

\([f^{-1}(y_s)]',\ [f^{-1}(y_s)]'',\ldots,[f^{-1}(y_s)]^{(m)},\)

where \(y_s=f(x_s)\). Finally, for the right-hand expression of relation (85) we perform another function evaluation, so that altogether \(2(m+1)\) function evaluations must be performed.

By (28) the efficiency index of method (85) has the form

\(E(m)=(m+1)^{\frac{1}{2(m+1)}},\quad E:\mathbb{N}^*\to\mathbb{R}.\)

Considering the function \(h:(0,\infty)\to\mathbb{R}\), \(h(t)=t^{\frac{1}{2t}}\), we observe that it attains its maximum at \(t=e\), so that the maximum value of \(E\) is attained for \(m=2\). We have proved the following result:

Theorem 5.1.

Among the Chebyshev-type iterative methods having the form (85), the method with the highest efficiency index is the third order method, i.e.

(86) \(x_{s+1}=x_s-\dfrac{f(x_s)}{f'(x_s)}-\dfrac{1}{2}\,\dfrac{f''(x_s)f^2(x_s)}{[f'(x_s)]^3},\quad s=0,1,\ldots,\ x_0\in I.\)

In the following table some approximate values of E are listed:

m      1       2       3       4       5
E(m)   1.1892  1.2009  1.1892  1.1746  1.1610

Table 1.

We note that \(E(2)\approx 1.2009\).
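The values in Table 1 follow directly from (28) with \(\omega=m+1\) and \(r=2(m+1)\); a quick check (illustrative, not part of the paper):

```python
E = lambda m: (m + 1) ** (1 / (2 * (m + 1)))
for m in range(1, 6):
    print(m, round(E(m), 4))   # reproduces Table 1; the maximum is at m = 2
```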

5.2. The efficiency of Lagrange-type methods

We shall study the methods of the form (32), for which the convergence order verifies a’)–c’) from §2. Taking into account Remark 2.2, it can be easily seen that we can use relation (28) for the efficiency index of these methods. For each step \(s+n+1\), \(s\geq 2\), in (32), in order to pass to the next step only \(f(x_{s+n+1})\) must be evaluated, the other values in (32) being already computed. We have also another function evaluation in computing the right-hand side of relation (32), so two function evaluations are needed. Taking into account that the convergence order \(\gamma_{n+1}(1)\) of each method satisfies a’)–c’), and denoting by \(E_{n+1}\) the corresponding efficiency index, we have

\(E_{n+1}=[\gamma_{n+1}(1)]^{\frac{1}{2}},\quad n=1,2,\ldots,\)
\(E_n<E_{n+1},\quad n=2,3,\ldots,\)

and

\(\lim_{n\to\infty}E_n=\sqrt{2}.\)

We have proved:

Theorem 5.2.

For the class of iterative methods of the form (32) the efficiency index is increasing with respect to the number of interpolation nodes, and we have the equality

\(\lim_{n\to\infty}E_n=\sqrt{2}.\)

5.3. Optimal Hermite-type particular methods

We shall study the class of iterative methods of the form (39) for q>1.

Taking into account Remark 2.2 it is clear that we can use again relation (28) for the efficiency index.

If \(x_{n+j}\) is an approximation for the solution \(\bar{x}\) obtained by (39), then for passing to the following iteration step we need

\(f(x_{n+j}),\ f'(x_{n+j}),\ldots,f^{(q-1)}(x_{n+j}),\)

i.e. \(q\) function evaluations. Then, by (31), we must compute the derivatives of the inverse function \([f^{-1}(y_{n+j})]^{(i)}\), \(i=\overline{1,q-1}\), where \(y_{n+j}=f(x_{n+j})\). Another function evaluation is needed for computing the right-hand side of relation (39). We have in total \(2q\) function evaluations, the other values in (39) being already computed.

By relations a)–c), denoting by \(E(\gamma_{n+1}(q),q)\) the efficiency index of the methods of the form (39), we get:

(87) \(E(\gamma_{n+1}(q),q)>E(\gamma_n(q),q),\quad n\geq 1,\ q>1;\)
(88) \(\Big(\max\big\{q,\tfrac{n+1}{n+2}(q+1)\big\}\Big)^{\frac{1}{2q}}<E(\gamma_{n+1}(q),q)<(q+1)^{\frac{1}{2q}},\quad n\geq 1,\ q>1.\)

For a fixed \(q\), by (87) it follows that the efficiency index is an increasing function with respect to \(n\), and

\(\lim_{n\to\infty}E(\gamma_{n+1}(q),q)=(q+1)^{\frac{1}{2q}}.\)

In the following we shall study \(E(\gamma_n(q),q)\) as a function of \(q>1\) and \(n\geq 2\), \(q,n\in\mathbb{N}\).

By (88) we have

\(q^{\frac{1}{2q}}<E(\gamma_{n+1}(q),q)<(q+1)^{\frac{1}{2q}},\quad\text{for }q\geq n+1,\)

and

(89) \(\Big[\tfrac{n+1}{n+2}(q+1)\Big]^{\frac{1}{2q}}<E(\gamma_{n+1}(q),q)<(q+1)^{\frac{1}{2q}},\quad\text{for }q<n+1.\)

For \(q\geq n+1\) consider the functions \(h:(0,+\infty)\to\mathbb{R}\), \(h(t)=t^{\frac{1}{2t}}\), and \(l:(0,+\infty)\to\mathbb{R}\), \(l(t)=(t+1)^{\frac{1}{2t}}\).

Some elementary considerations show that \(h\) and \(l\) satisfy \(\lim_{t\searrow 0}h(t)=0\), \(\lim_{t\to\infty}h(t)=1\), \(h\) is increasing on \((0,e)\) and decreasing on \((e,+\infty)\), while \(\lim_{t\searrow 0}l(t)=e^{\frac{1}{2}}\), \(\lim_{t\to\infty}l(t)=1\), and \(l\) is decreasing on \((0,\infty)\). The maximum value of \(h\) is \(h(e)=e^{\frac{1}{2e}}\).

Let \(\bar{t}\) be the solution of the equation

(90) \((t+1)^{\frac{1}{2t}}-e^{\frac{1}{2e}}=0.\)

It can be easily seen that \(\bar{t}\) exists and is the unique solution of equation (90). For \(t>\bar{t}\), \(l(t)<e^{\frac{1}{2e}}\), so it is clear that the maximum value of \(E(\gamma_{n+1}(q),q)\) can be obtained only for \(q\leq\bar{t}\), \(q\in\mathbb{N}\). It is easy to prove that \(\bar{t}\in(4,5)\) and \(\bar{t}\approx 4.76\). Taking into account the properties of \(h\) and \(l\), it is clear that in order to determine the greatest value of \(E(\gamma_{n+1}(q),q)\) it is sufficient to consider only those \(q\in\mathbb{N}\) verifying \(1<q\leq 4\), and \(n\leq q-1\).

Table 2 contains the approximate values of the efficiency indexes corresponding to these values of q and n.

q\n    1       2       3
2      1.2856
3      1.2487  1.2573
4      1.2175  1.2218  1.2226

Table 2.
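Table 2 can be reproduced by computing \(\gamma_{n+1}(q)\) from (40) and applying \(E=\gamma^{1/(2q)}\); a sketch (illustrative, not part of the paper):

```python
def gamma(n1, q):
    # positive root of (40): t^n1 - q(t^(n1-1) + ... + t + 1) = 0, by bisection
    P = lambda t: t ** n1 - q * sum(t ** i for i in range(n1))
    lo, hi = float(q), q + 1.0   # the root lies in (q, q+1), cf. relation b)
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if P(mid) < 0 else (lo, mid)
    return (lo + hi) / 2

for q in (2, 3, 4):
    row = [round(gamma(n + 1, q) ** (1 / (2 * q)), 4) for n in range(1, q)]
    print(q, row)   # reproduces the rows of Table 2
```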

The highest value of the efficiency index is hence obtained for \(q=2\) and \(n=1\). We shall now write explicitly the method (39) for these values. For this purpose it is convenient to use the divided differences on multiple nodes. The following table contains the divided differences of the inverse function \(f^{-1}\) on the nodes \(y_s=f(x_s)\), \(y_{s+1}=f(x_{s+1})\), each having the multiplicity order 2.

f(x)       x         [u,v;f⁻¹]              [u,v,w;f⁻¹]                [u,v,w,z;f⁻¹]
y_s        x_s
y_s        x_s       [y_s,y_s;f⁻¹]
y_{s+1}    x_{s+1}   [y_s,y_{s+1};f⁻¹]      [y_s,y_s,y_{s+1};f⁻¹]
y_{s+1}    x_{s+1}   [y_{s+1},y_{s+1};f⁻¹]  [y_s,y_{s+1},y_{s+1};f⁻¹]  [y_s,y_s,y_{s+1},y_{s+1};f⁻¹]

Table 3.

Here \([y_s,y_s;f^{-1}]=\dfrac{1}{f'(x_s)}\), \([y_{s+1},y_{s+1};f^{-1}]=\dfrac{1}{f'(x_{s+1})}\), \([y_s,y_{s+1};f^{-1}]=\dfrac{1}{[x_s,x_{s+1};f]}\), and the other divided differences are computed using the well-known recurrence formula.

In this case the method has the following form:

