Abstract
In this paper we address two optimality problems arising in the study of iterative methods for approximating the solutions of equations by inverse interpolation.
The first concerns the construction of algorithms having optimal convergence orders, while the second concerns the optimal computational complexity of the inverse interpolation iterative methods.
Authors
Ion Păvăloiu
Tiberiu Popoviciu Institute of Numerical Analysis
Keywords
nonlinear equations in \(\mathbb{R}\); inverse interpolation; convergence order; efficiency index; optimal iterative methods
Cite this paper as:
I. Păvăloiu, Optimal algorithms concerning the solving of equations by interpolation, Research on Theory of Allure, Approximation, Convexity and Optimization, Ed. Srima, Cluj-Napoca (1999), pp. 222-248, ISBN 973-98551-4-3.
About this paper
Journal
Research on Theory of Allure, Approximation, Convexity and Optimization
Publisher Name
Editura Srima
DOI
Not available yet.
Print ISBN
973-98551-4-3
References
Paper (preprint) in HTML form
SÉMINAIRE DE LA THÉORIE DE LA MEILLEURE APPROXIMATION,
CONVEXITÉ ET PROGRAMMATION MATHÉMATIQUE
CLUJ - NAPOCA, 1999, pp. 222–248
Optimal Algorithms Concerning the Solving of Equations by Interpolation
1. Introduction
It is well known that the most commonly used methods for approximating a solution of a nonlinear equation in \(\mathbb{R}\) (Newton’s method, Chebyshev’s method, the chord method and various generalizations of these) are obtained in a unitary manner by Lagrange-Hermite-type inverse interpolation.
The inverse interpolatory polynomials, by a proper choice of the nodes, also lead to Aitken-Steffensen-type methods.
In this paper we address two optimality problems arising in the study of iterative methods for approximating the solutions of equations by inverse interpolation. The first concerns the construction of algorithms having optimal convergence orders, while the second concerns the optimal computational complexity of the inverse interpolation iterative methods.
We adopt the efficiency index (see [6]) as a measure of the complexity of the iterative methods.
This paper represents a synthesis of the results obtained by us in the papers [3], [4], [7], [10], [11].
We shall begin by presenting some definitions and results (some of them known) concerning the convergence order and the efficiency index of an iterative method. We then briefly present the inverse interpolatory methods and the iterative methods generated by them. We consider different classes of interpolatory methods, determining for each class the methods having the optimal convergence order. Finally, we determine the methods having the optimal efficiency indexes.
2. Convergence orders and efficiency indexes
Denote , and consider the equation
(1) |
where is given. We shall assume for simplicity in the following that the above equation has a unique solution Let be a function having a unique fixed point and let that point be
For approximating the solution we shall consider the elements of the sequence generated by the iterations
(2) |
More general, if is a function of variables whose restriction to the diagonal of coincides with i.e.
then we may consider the iterations
(3) |
The convergence orders of the sequences generated by (2) and (3) depend on some properties of the function resp.
The amount of time needed by a computer to obtain a convenient approximation depends both on the convergence order of the method and on the number of elementary operations that must be performed at each iteration step in (2) or (3). The convergence order of the methods of the form (2) and (3) may be determined exactly under certain circumstances, but the number of elementary operations needed at each iteration step may be hard or even impossible to evaluate. A simplification of this problem may be obtained (see [6]) by taking into account the number of function evaluations needed at each iteration step.
It is obvious that this criterion may be contested at first sight, since some functions may be simpler and others more complicated to evaluate.
This inconvenience does not affect our viewpoint on optimal efficiency, because it refers to classes of iterative methods applied for solving an equation in which the functions are well determined by the form of equation (1) and by the iteration functions in (2), resp. (3).
Let be an arbitrary sequence which together with and satisfies
i. and for
ii. the sequence converges and
iii. is differentiable at
iv. for any it follows for some , where denotes the first order divided difference of on the nodes and
Definition 2.1.
The sequence has the convergence order with respect to if there exists the limit
(4) |
and
For a unitary treatment of the convergence orders of the studied methods we shall prove the following lemmas.
Lemma 2.1.
If the sequence and the functions and satisfy properties i–iv then the necessary and sufficient condition for this sequence to have the convergence order is that there exists
(5) |
and
Proof.
∎
Lemma 2.2.
Assume that is a sequence of positive real numbers satisfying the following properties:
i. the sequence is convergent and
ii. there exist nonnegative real numbers and a sequence with and which together with the elements of the sequence satisfy
(6)
iii. there exists
Then is the positive root of the equation
(7)
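The precise coefficients of equation (7) are not legible in this rendering, so the sketch below rests on an assumption: that the errors of a k-step method satisfy a relation of the type \(e_{n+1}\le c_n e_n^{a_k}e_{n-1}^{a_{k-1}}\cdots e_{n-k}^{a_0}\), in which case the convergence order is the unique positive root of the characteristic equation \(t^{k+1}=a_k t^k+\cdots+a_1 t+a_0\). The helper below computes that root numerically; the coefficient names are illustrative, not taken from the paper.

```python
# A minimal sketch, assuming the characteristic equation of Lemma 2.2 has the form
#   t^(k+1) = a_k t^k + ... + a_1 t + a_0,   a_j >= 0,  a_0 + ... + a_k > 1,
# which corresponds to an error relation e_{n+1} ~ c_n * e_n^{a_k} * ... * e_{n-k}^{a_0}.

def convergence_order(a, iters=200):
    """Positive root of t^(k+1) - a[k] t^k - ... - a[0] = 0, found by bisection."""
    k = len(a) - 1
    p = lambda t: t ** (k + 1) - sum(a[j] * t ** j for j in range(k + 1))
    lo, hi = 1.0, 1.0 + sum(a)          # p(lo) < 0 and p(hi) > 0 under the assumptions above
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if p(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

# Chord (secant) method: e_{n+1} ~ e_n * e_{n-1}, i.e. t^2 = t + 1
print(convergence_order([1, 1]))        # ~1.6180, the golden ratio
```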
We now turn our attention to equations of the form (9).
Let ,
We shall assume that the numbers are ordered:
(10) |
and satisfy
(11) |
Consider the equations
(12) |
(13) |
(14) |
where is an arbitrary permutation of
Lemma 2.3.
Proof.
Consider the equations of the form (14) and denote by the largest natural number for which
We have . Consider the function . It can be seen by (11) that and . It follows that equation (14) has a unique positive root. The first part of the lemma is proved. In order to prove inequality (15) it suffices to show that and . Indeed,
since from (15) follow the inequalities
and The fact that is shown in an analogous manner. ∎
Lemma 2.4.
Let and where , , be two sets of real numbers satisfying
(16) |
Then, among all the numbers of the form
(17) |
where and are arbitrary permutations of the largest such number is given by
(18) |
Proof.
From the first set of inequalities (16) it follows that the inequality:
(19) | ||||
holds for any two permutations and of
Let us denote
(20) |
In order to prove the inequality
(21) |
for every permutation we shall proceed by induction. For the inequality (21) is obvious, since and hence Suppose now that the inequality is true for pairs of numbers namely
(22) |
where and Using the inequalities and as well as the induction hypothesis (22) and assuming that , we have
∎
We turn our attention back to the equation
(23) |
and we assume that , and . Denote by the positive root of the above equation. The following result holds.
Proof.
Let
(25) |
It is sufficient to prove that where
For this we shall use the inequality between the arithmetic and the geometric means, i.e.
Using this inequality we obtain
i.e. ∎
Remark 2.1.
It can be easily seen that the number given by (25) can be expressed using
The second part of relations (24) follows easily from the inequality where
Some more specific results concerning the bounds for the root of equation (23) can be obtained in the case
(26) |
More precisely, denoting by the positive root of equation
(27) |
then the following relations hold (see [15]):
a)
b)
c)
For from relations a)–c) we get (see [6]):
a’)
b’)
c’)
Definition 2.2.
3. Iterative methods of interpolatory type
In the following we shall briefly present the Lagrange-Hermite-type inverse interpolatory polynomial. It is well known that this leads us to general classes of iterative methods from which, by suitable particularizations, we obtain the usual methods such as Newton’s method, the chord method, Chebyshev’s method, etc.
For the sake of simplicity we prefer to treat separately the Hermite polynomial and the Lagrange polynomial, though the latter is a particular case of the former.
As we shall see, a suitable choice of the nodes enables us to improve the convergence orders of Lagrange-Hermite-type methods. We shall call such methods Steffensen-type methods.
3.1. Lagrange-type inverse interpolation
Denote by the range of for Suppose is times differentiable and for all It follows that is invertible and there exists Consider interpolation nodes in
(29) |
Under the above hypotheses it follows that the solution of equation (1) is given by
Using the Lagrange interpolatory polynomial for the function at the nodes we shall determine an approximation for i.e. for
Denote and let be the mentioned polynomial, which is known to have the form
where
The following equality holds
(30) |
where
and
It is also known that, under the mentioned hypotheses concerning the differentiability of on , the function admits derivatives of any order for all and the following equality holds [12], [16]:
(31) |
where and the above sum extends over all nonnegative integer solutions of the system
From (30), neglecting we obtain the following approximation for
Denoting
we obtain
where
It is clear that if are distinct approximations of the solution of equation (1) then a new approximation can be obtained as above, i.e.
(32) |
with the error estimate given by
(33) |
where belongs to the smallest open interval containing
If we replace in (33) we obtain for the sequence the relations:
(34) |
where belongs to the open interval determined by and
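Since formulas (32)–(34) are not fully legible in this rendering, the following sketch only illustrates the idea described above: the Lagrange polynomial of the inverse function \(f^{-1}\) is built on the data \((f(x_i),x_i)\) and evaluated at \(y=0\). Dropping the oldest node at each step is one standard replacement rule, assumed here for the example.

```python
# A minimal sketch of one inverse Lagrange interpolation step: the Lagrange polynomial
# of f^{-1} on the data (f(x_i), x_i) is evaluated at y = 0. Assumes f is monotone
# near the root, so that the values f(x_i) are pairwise distinct.

def inverse_lagrange_step(f, xs):
    ys = [f(x) for x in xs]
    x_new = 0.0
    for i, (yi, xi) in enumerate(zip(ys, xs)):
        li = 1.0
        for j, yj in enumerate(ys):
            if j != i:
                li *= (0.0 - yj) / (yi - yj)        # Lagrange basis polynomial at y = 0
        x_new += xi * li
    return x_new

f = lambda x: x**3 - 2                              # root 2**(1/3) ~ 1.259921
xs = [1.0, 1.2, 1.4]
for _ in range(4):
    xs = xs[1:] + [inverse_lagrange_step(f, xs)]    # drop the oldest node each step
print(xs[-1])
```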
3.2. Hermite-type inverse interpolation
Consider in the following, besides the interpolation nodes (29), natural numbers where and
We shall suppose here too, for simplicity, that is times differentiable on From this and from for all it follows, by (31), that is also times differentiable on Denoting then the Hermite polynomial for the nodes has the following form:
(35) | ||||
where
If are distinct approximations of the solution of the equation (1), then the next approximation can be obtained as before in the following way:
(36) |
where, as in (35),
It can be easily seen that the following equality holds:
(37) |
where belongs to the smallest open interval containing and belongs to the open interval determined by and
If we suppose that
satisfy the hypotheses of Lemma 2.2 and, moreover, then it is clear that the convergence order of the method (36) is given by the positive solution of the equation
(38) |
In the following we shall consider a particular case of (36).
For from (36) we obtain
(39) |
a method whose convergence order is given by the positive solution of the equation
(40) |
3.3. Aitken-Steffensen type iterative methods
Let be functions having the following properties
- where is the solution of (1);
- there exist continuous functions and the real numbers such that the following equalities hold:
(41)
Denote an initial approximation of the root of (1). We construct the interpolation nodes in the following way:
(42) |
Next, we compute and we consider the natural numbers such that
Taking as interpolation nodes the numbers and using the Hermite interpolatory polynomial determined by these nodes with the corresponding multiplicities, we obtain for the following approximation:
(43) |
The error is given by
(44) |
where is a point belonging to the smallest interval determined by the points and while has the following form:
(45) |
Taking into account hypothesis for the functions we get
and in general
(46) |
Denote
and
(47) |
where
With these notations, from (44)–(46) we obtain
(48) |
Let be an arbitrary approximation of the solution obtained by the continuation of the process given by (43). Then the next approximation is constructed in the following way.
Consider the interpolation nodes given by the relations
Then is given by
(49) |
where with the error estimation
(50) |
where is a point belonging to the smallest interval determined by and and has a form analogous to that given in (47) for
It is obvious now that if then the convergence order of the process (49) is where
(52) |
We shall consider in the following the particular case when
We assume that and satisfy
(53) |
where , for all
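Most of the formulas of this subsection are not legible in this rendering. As an illustration of the simplest member of the family, the sketch below performs, at each step, a chord step on the two auxiliary nodes \(\varphi_1(x_n)\) and \(\varphi_2(x_n)\) (cf. (42)). The particular choices \(\varphi_1(x)=x-f(x)\) and \(\varphi_2(x)=x+f(x)\) are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch of an Aitken-Steffensen-type iteration: the auxiliary functions
# phi1, phi2 (whose fixed point is the root) supply the interpolation nodes, and the
# new approximation is obtained by inverse linear (chord) interpolation on those nodes.

def aitken_steffensen(f, x0, phi1=None, phi2=None, tol=1e-12, max_iter=50):
    phi1 = phi1 or (lambda x: x - f(x))              # illustrative choices only
    phi2 = phi2 or (lambda x: x + f(x))
    x = x0
    for _ in range(max_iter):
        u, v = phi1(x), phi2(x)                      # interpolation nodes at this step
        x_next = u - f(u) * (v - u) / (f(v) - f(u))  # chord step on the nodes u, v
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

print(aitken_steffensen(lambda x: x**2 - 2, 1.5))    # ~1.414213..., the root of x^2 = 2
```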
3.4. Some particular cases
In what follows we shall discuss some particular cases.
The case From (35) one obtains the Taylor inverse interpolating polynomial:
(58) |
while, from (31), we obtain the following expressions for the successive derivatives
(59) |
(60) |
(61) |
(62) |
From (59) and (58) for we obtain:
which, for leads to the approximation of given by the expression
(63) |
i.e., to Newton’s method.
From (59), (60) and (58) for we obtain Chebyshev’s method, i.e.:
(64) |
Finally, from (59), (60), (61) and (58) for we obtain:
(65) |
From the above methods one obtains, by iteration, the corresponding sequences of approximations, which have the convergence orders 2, 3 and 4, respectively.
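Although formulas (63)–(65) are not legible in this rendering, the first two members of this family are classical. Below is a brief sketch using the standard derivatives of the inverse function, \((f^{-1})'=1/f'\) and \((f^{-1})''=-f''/f'^3\) (cf. (59)–(60)); the function names are illustrative.

```python
# A minimal sketch of the first two truncations of the inverse Taylor step (58):
# keeping one term gives Newton's method (63), keeping two gives Chebyshev's method (64).

def newton_step(f, df, x):
    return x - f(x) / df(x)

def chebyshev_step(f, df, d2f, x):
    fx, dfx = f(x), df(x)
    return x - fx / dfx - d2f(x) * fx**2 / (2.0 * dfx**3)

f, df, d2f = (lambda x: x**3 - 2), (lambda x: 3 * x**2), (lambda x: 6 * x)

x_newton = x_cheb = 2.0
for _ in range(6):
    x_newton = newton_step(f, df, x_newton)          # order 2
    x_cheb = chebyshev_step(f, df, d2f, x_cheb)      # order 3
print(x_newton, x_cheb)                              # both ~2**(1/3)
```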
As one may notice from (62) and (36), for the expressions for the derivatives have a more complex form. That is why the methods following from (58) in these cases are also more complex.
The case In this case, from (35) it follows:
(66) |
where:
(67) |
From (66) one obtains two iterative methods; namely denoting as above by the Hermite inverse interpolating polynomial (66), we find:
(68) |
or
(69) |
The characteristic equations which provide the convergence orders for the two methods are:
(70) |
for method (68), and:
(71) |
for the method (69).
If we denote by and , respectively, the positive roots of equations (70) and (71), then it is clear that implies ; so the method with the optimal convergence order is method (68).
Now, we shall briefly discuss some particular cases.
From (66), for we obtain
(72) |
whence, taking into account the fact and we find for
(73) |
where stands for the first order divided difference of the function on the nodes and and in general,
(74) |
which is the chord method. In this case, since the above method has the same convergence order as the other one, which follows from (74), i.e.:
(75) |
The convergence order of the method (74) is \(\tfrac{1+\sqrt{5}}{2}\approx 1.618\).
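A minimal sketch of the chord method (74), in which each new approximation is computed from the two previous ones through the first order divided difference; the formula used below is the classical one, assumed to coincide with (74).

```python
# Chord (secant) method: x_{n+1} = x_n - f(x_n) / [x_{n-1}, x_n; f].

def chord(f, x0, x1, tol=1e-12, max_iter=100):
    for _ in range(max_iter):
        dd = (f(x1) - f(x0)) / (x1 - x0)             # first order divided difference
        x0, x1 = x1, x1 - f(x1) / dd
        if abs(x1 - x0) < tol:
            break
    return x1

print(chord(lambda x: x**3 - 2, 1.0, 1.5))           # ~1.259921, order (1 + 5**0.5) / 2
```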
Now we shall discuss the case In this particular case, we obtain from (66) the following iterative methods:
(76) | ||||
and
(77) |
Solving the corresponding characteristic equations, we find the convergence orders for the method (76) and for the method (77).
As we showed above, the Hermite inverse interpolating polynomial leads to a large class of iterative methods. The convergence order of each method depends on the number of interpolation nodes, on their orders of multiplicity and, essentially, on the interpolation node which is replaced at each iteration step by the one computed at the previous step.
As Steffensen noticed in the case of method (74), the convergence order of this method can be increased if at each iteration step the element depends in a certain manner on . More exactly, if we consider a function having the property , where is the root of equation (1), and if we put in (74), then we obtain the sequence generated by Steffensen’s method:
which, as is well known, has convergence order 2.
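Since the displayed formula for Steffensen’s method is missing from this rendering, the sketch below uses the classical form: a chord step (74) with the second node taken as \(\varphi(x_n)\). The choice \(\varphi(x)=x-f(x)\) is one common option and only an assumption here.

```python
# A minimal sketch of Steffensen's method: a chord step on the nodes x_n and phi(x_n).

def steffensen(f, x0, phi=None, tol=1e-12, max_iter=50):
    phi = phi or (lambda x: x - f(x))                # illustrative choice of phi
    x = x0
    for _ in range(max_iter):
        y = phi(x)
        if y == x:                                   # f(x) == 0: x is already the root
            return x
        dd = (f(y) - f(x)) / (y - x)                 # first order divided difference [x, y; f]
        x_next = x - f(x) / dd
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

print(steffensen(lambda x: x**3 - 2, 1.0))           # ~1.259921, convergence order 2
```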
4. Optimal convergence order
4.1. Optimal convergence order of the iterative methods of Hermite type
As we have seen in the particular cases presented in §3.4, in the case the Hermite inverse interpolation polynomial for leads to two different iterative methods (see (68) and (69)). Of these two, method (68) has a convergence order greater than the other one. In the following we shall use Lemma 2.3 in order to generalize the iterative methods (68) and (69). It is clear that the convergence order of method (36) depends on the multiplicities of the interpolation nodes which are replaced at each iteration step, so that we are led to different configurations of the coefficients in equation (38). Formula (36) generates iterative methods, depending on the algorithm for changing the interpolation nodes at each iteration step. Among these methods, we shall determine in the following the one with the highest convergence order, i.e. the optimal method. For this purpose we proceed as follows.
Consider the permutation of the numbers for which the natural numbers satisfying the equality can be increasingly ordered, namely:
(78) |
We renumber, accordingly, the elements of the set i.e. we consider:
For the sake of clarity we shall set:
(79) |
and
(80) |
and denote by the Hermite interpolating polynomial, corresponding to the nodes , having the multiplicities respectively.
Let be the initial approximation of the root of the equation (1). We construct the sequence by means of the following iterative procedure:
(81) |
Consider all permutations of the set . To each permutation there corresponds an iterative method of the form:
(82) |
Altogether we have iterative methods.
Taking into account Lemma 2.3 and the results proved so far, we can state the following theorem:
Theorem 4.1.
Out of the iterative methods of the form (82), the one with the greatest convergence order (namely, the one providing the best upper bound for the absolute value of the error) is that determined by the permutation which orders the numbers increasingly, namely
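The exact characteristic equation attached to each permutation is not legible in this rendering. Assuming, for illustration only, that an ordering \((a_{j_1},\dots,a_{j_{q+1}})\) of the multiplicities yields the equation \(t^{q+1}=a_{j_1}t^{q}+\cdots+a_{j_{q+1}}\), the script below simply compares the positive roots over all permutations, which is the comparison underlying Theorem 4.1.

```python
# A brute-force illustration (under the assumption stated above) of comparing the
# convergence orders attached to all orderings of the node multiplicities.
from itertools import permutations

def positive_root(coeffs):
    """Positive root of t^(q+1) = coeffs[0] t^q + ... + coeffs[q], by bisection."""
    q = len(coeffs) - 1
    p = lambda t: t ** (q + 1) - sum(c * t ** (q - i) for i, c in enumerate(coeffs))
    lo, hi = 1.0, 1.0 + sum(coeffs)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if p(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

multiplicities = (1, 2, 3)                           # hypothetical multiplicities
for perm in sorted(permutations(multiplicities), key=positive_root):
    print(perm, round(positive_root(perm), 4))       # the last line is the optimal ordering
```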
4.2. Aitken-Steffensen-type optimal methods
In the following we shall solve an optimization problem analogous to the one treated in Section 4.1, but now for the case of the Aitken-Steffensen method.
We shall consider the methods of type (49), and for determining the optimal algorithm we shall rely on Lemma 2.4.
Let and be two arbitrary permutations of numbers Also denote
the Hermite inverse interpolating polynomial having the interpolation nodes with the orders of multiplicity
With the above notations, let us consider the following class of iterative methods
(83) |
where
and
(84) | ||||
being the given initial approximation.
To each pair of permutations and of the numbers there corresponds an iterative method of the form (83). Altogether we again have iterative methods of this form.
We shall attempt to determine, out of the iterative methods , the one for which the number given by (52) is maximum.
Theorem 4.2.
5. Optimal efficiency
In the following we shall analyse the efficiency index of each of the methods described and, under the hypotheses adopted below, we shall determine the optimal methods, i.e. those having the highest efficiency index.
As we have seen, the formulae for computing the derivatives of have a complicated form and depend on the successive derivatives of . However, in the case where the orders of the derivatives of are low, the values of these derivatives are obtained by only a few elementary operations. Taking into account the generality of the problem, we shall consider each computation of the value of any derivative of by (31) as a single function evaluation. For similar reasons we shall also consider each evaluation of the inverse interpolatory polynomials as a single function evaluation.
As will follow from our reasoning, the methods having the optimal efficiency index are generally the simple ones, using one or two interpolation nodes and derivatives of up to the second order.
Remark that in our case we can use relation (28) for the efficiency index.
5.1. Optimal Chebyshev-type methods
Observe that, in order to pass from one iteration step to the next in method (85), the following evaluations must be performed:
i.e. values.
Then, by (31), we perform the following function evaluations:
where . Finally, for the right-hand side of relation (85) we perform another function evaluation, so that function evaluations must be performed.
By (28) the efficiency index of method (85) has the form
Considering the function , we observe that it attains its maximum at , so that the maximum value of is attained for . We have proved the following result:
Theorem 5.1.
Among the Chebyshev-type iterative methods having the form (85), the method with the highest efficiency index is the third order method, i.e.
(86) |
In the following table some approximate values of are listed:
We note that
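Neither relation (28) nor the table of values survives in this rendering. The computation below therefore uses the classical efficiency index \(E=\omega^{1/\theta}\) (convergence order \(\omega\), \(\theta\) function evaluations per step) and the simplifying assumption that the method of order \(p\) in the family (85) costs \(p\) evaluations per step; under these assumptions the maximum is reached at \(p=3\), in agreement with Theorem 5.1.

```python
# A small check, under the assumptions stated above, that the index p**(1/p) is
# maximal at p = 3 among integer orders (the maximum of t**(1/t) over t > 0 is at t = e).

for p in range(2, 7):
    print(p, round(p ** (1.0 / p), 4))
# 2: 1.4142, 3: 1.4422, 4: 1.4142, 5: 1.3797, 6: 1.3480  ->  the third order method wins
```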
5.2. The efficiency of Lagrange-type methods
We shall study the methods of the form (32), for which the convergence order satisfies a’)–c’) from §2. Taking into account Remark 2.1, it can be easily seen that we can use relation (28) for the efficiency index of these methods. At each step of (32), in order to pass to the next step, only must be evaluated, the other values in (32) being already computed. We also have another function evaluation in computing the right-hand side of relation (32). So two function evaluations are needed. Taking into account that the convergence order of each method satisfies a’)–c’), and denoting by the corresponding efficiency index, we have
and
We have proved:
Theorem 5.2.
For the class of iterative methods of the form (32) the efficiency index is increasing with respect to the number of interpolation nodes, and we have the equality
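A sketch supporting the monotonicity claim of Theorem 5.2. It assumes that the method (32) on \(n+1\) nodes has the characteristic equation \(t^{n+1}=t^{n}+\cdots+t+1\) (reconstructed from the standard error relation of type (34), which is not legible here) and, as stated above, costs two function evaluations per step, so that the index is \(\omega_n^{1/2}\).

```python
# Orders and efficiency indexes of the Lagrange-type methods (32), under the
# assumptions stated above: w_n is the positive root of t^(n+1) = t^n + ... + t + 1,
# and the index is w_n ** (1/2) because two function evaluations are needed per step.

def lagrange_order(n):
    p = lambda t: t ** (n + 1) - sum(t ** j for j in range(n + 1))
    lo, hi = 1.0, 2.0                                # p(1) < 0 and p(2) = 1 > 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if p(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

for n in range(1, 7):
    w = lagrange_order(n)
    print(f"{n + 1} nodes: order {w:.4f}, index {w ** 0.5:.4f}")
# The orders increase toward 2, so the indexes increase toward 2**0.5 ~ 1.4142.
```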
5.3. Optimal Hermite-type particular methods
We shall study the class of iterative methods of the form (39) for
Taking into account Remark 2.2, it is clear that we can again use relation (28) for the efficiency index.
If is an approximation of the solution obtained by (39), then in order to pass to the next iteration step we need
i.e. function evaluations. Then, by (31), we must compute the derivatives of the inverse function , where . Another function evaluation is needed for computing the right-hand side of relation (39). In total we have function evaluations, the other values in (39) being already computed.
By a)–c) from Remark 2.2, and denoting by the efficiency index of the methods of the form (39), we get:
(87) |
(88) |
For a fixed by (87) it follows that the efficiency index is an increasing function with respect to and
In the following we shall study as a function of and
Some elementary considerations show that and satisfy : is increasing on and decreasing on , and is decreasing on . The maximum value of is
Let be the solution of the equation
(90) |
It can be easily seen that exists and is the unique solution of equation (90). For , so it is clear that the maximum value of can be obtained for . It is easy to prove that and . Taking into account the properties of and , it is clear that, in order to determine the greatest value of , it is sufficient to consider only those satisfying and
Table 2 contains the approximate values of the efficiency indexes corresponding to these values of and
The highest value for the efficiency index is hence obtained for and . We shall specify explicitly the method (39) for these values. For this purpose it is convenient to use divided differences on multiple nodes. The following table contains the divided differences of the inverse function on the nodes, each having multiplicity order 2.
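Since the table and the explicit formula are not reproducible from this rendering, the following sketch illustrates the construction described above under the assumption that the optimal case uses two nodes, each of multiplicity 2: the divided difference table of \(f^{-1}\) on the doubled nodes is built (with \((f^{-1})'(y_i)=1/f'(x_i)\) on a repeated node) and the resulting Newton-Hermite form is evaluated at \(y=0\).

```python
# A minimal sketch: one step of Hermite inverse interpolation on two double nodes,
# computed through divided differences of f^{-1}. Assumes f is monotone near the root,
# so that f(x0) != f(x1) and f'(xi) != 0.

def hermite_inverse_step(f, df, x0, x1):
    ys = [f(x0), f(x0), f(x1), f(x1)]                # doubled nodes y0, y0, y1, y1
    col = [x0, x0, x1, x1]                           # values of f^{-1} at those nodes
    slopes = [1.0 / df(x0), None, 1.0 / df(x1)]      # used when both entries coincide
    coeffs = [col[0]]
    for k in range(1, 4):                            # build the divided difference table
        col = [slopes[i] if ys[i + k] == ys[i]
               else (col[i + 1] - col[i]) / (ys[i + k] - ys[i])
               for i in range(4 - k)]
        coeffs.append(col[0])
    x_new, prod = coeffs[0], 1.0                     # Newton-Hermite form evaluated at y = 0
    for c, y in zip(coeffs[1:], ys[:-1]):
        prod *= (0.0 - y)
        x_new += c * prod
    return x_new

f, df = (lambda x: x**3 - 2), (lambda x: 3 * x**2)
x0, x1 = 1.0, 1.5
for _ in range(4):
    x0, x1 = x1, hermite_inverse_step(f, df, x0, x1) # replace the older node each step
print(x1)                                            # ~2**(1/3)
```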