Return to Article Details Accelerating the convergence of Newton-type iterations

Accelerating the convergence of Newton-type iterations\(^\ast \)

T. Zhanlav\(^\S \), O. Chuluunbaatar\(^{\S ,\bullet }\) , V. Ulziibayar\(^{\S ,\ast }\)

January 2, 2017.

\(^\S \)Institute of Mathematics, National University of Mongolia, Mongolia, e-mail: tzhanlav@yahoo.com

\(^\bullet \)Joint Institute for Nuclear Research, Dubna, 141980 Moscow region, Russia, e-mail: chuka@jinr.ru

\(^\ast \)School of Applied Sciences, Mongolian University of Science and Technology, Mongolia, e-mail: ulzii@jinr.ru

In this paper, we present a new accelerating procedure in order to speed up the convergence of Newton-type methods. In particular, we derive iterations with a high and optimal order of convergence. This technique can be applied to any iteration with a low order of convergence. As expected, the convergence of the proposed methods is remarkably fast. The effectiveness of this technique is illustrated by numerical experiments.

MSC. 65H05

Keywords. Newton-type iterations, accelerating procedure, convergence order, efficiency index

1 Introduction

As it is known, the monotone approximations for the solutions of nonlinear equations in \(\mathbb R\) is interesting not only from theoretical, but also from practical view points. In particular, two-sided approximations can be efficiently used as a posteriori estimations for the errors in approximating the desired solution. It means that one can control the error at each iteration step. In the last decade, many authors have developed new monotone iterative methods [ 9 , 19 , 18 ] . The main advantage of the monotone iterations is that it does not require good initial approximations contrary to what occurs in the other iteration methods, such as secant-like methods, Newton’s methods and others [ 4 ] . On the other hand, accelerating the convergence of iterative methods is also of interest both from theoretical and computational view points [ 1 , 2 , 3 , 10 , 12 , 14 , 5 , 11 , 13 , 16 ] . For example, in [ 4 ] was constructed a family of the predictor-corrector iterative method from the simplified secant method and a family of secant-like methods; the authors analyzed the initial conditions on the starting point in order to improve the semilocal convergence of the method. In general, it is desirable to choose the starting point from the convergence domain [ 15 , 16 , 19 ] .

In recent years, many iterative methods for solving nonlinear equations have been developed [ 1 , 2 , 3 , 10 , 12 , 14 , 5 , 11 , 13 , 16 ] to improve the local order of convergence of some methods such as Newton, Ostrowski, Potra-Ptak’s methods and so on. The most efficient methods studied in the literature are the optimal eighth-order methods with four function evaluations per iteration, see [ 1 , 2 , 3 , 10 , 12 ] and references therein. The methods developed in [ 1 , 12 , 7 ] are based on optimal Ostrowski’s or King’s method and use arbitrary real parameters and weight functions. The methods proposed in [ 2 , 3 , 10 ] are obtained by composing an iterative method proposed by Chun and Potra-Ptak’s method with Newton’s method.

In this paper we propose a new accelerating procedure for Newton-type methods. By virtue of this procedure, we obtain a higher order, in particular optimal order methods. The usage of the optimal choice of parameter allows us to improve the convergence speed. This may be also helpful in order to extend the domain of convergence.

The paper is organized as follows. Section 2 describes monotone and two-sided approximations. In Section 3, we show the accelerating procedure and establish a convergence order of the new proposed methods. Section 4 is devoted to finding an optimal parameter in the proposed iterations. Finally, Section 5 presents various numerical examples which confirm the theoretical results, and a numerical comparison with other existing optimal order methods.

2 Statement of the problems

Let \(a, b\in \mathbb {R},\, a{\lt}b,\, f:[a, b]\rightarrow \mathbb {R}\) and consider the following nonlinear equation

\begin{eqnarray} f(x)=0. \label{eq1} \end{eqnarray}

Assume that \(f(x)\in \mathcal{C}^4[a,b],\, f'(x)\neq 0,\, x\in (a,b)\) and Eq. (2.1) has a unique root \(x^*\in (a,b)\). In [ 19 , 18 ] the following iterations were proposed:

\begin{eqnarray} \label{eq2} & & x_{2n+1}=x_{2n}-\tau _n\tfrac {f(x_{2n})}{f’(x_{2n})}, \@bsphack \if@filesw {\let\thepage \relax \def\protect {\noexpand \noexpand \noexpand }\edef\@tempa {\write \@auxout {\string \newlabel {eq2a}{{\thesubequation }{\thepage }}}}}\write\@auxout {\string \newlabel {eq2a}{{\thesubequation }{\thepage }}}\if@nobreak \fi \fi \@esphack \\ & & x_{2n+2}=x_{2n+1}-\tfrac {f(x_{2n+1})}{f’(x_{2n+1})}, \quad n=0,1,\ldots \@bsphack \if@filesw {\let\thepage \relax \def\protect {\noexpand \noexpand \noexpand }\edef\@tempa {\write \@auxout {\string \newlabel {eq2b}{{\thesubequation }{\thepage }}}}}\write\@auxout {\string \newlabel {eq2b}{{\thesubequation }{\thepage }}}\if@nobreak \fi \fi \@esphack \end{eqnarray}

In [ 19 ] it is shown that the iterations (??) and (??) monotone convergent under conditions

\begin{eqnarray} 0{\lt}\tau _n\leq 1,\quad a_{n}=\tfrac {M_2|f(x_n)|}{(f’(x_n))^2}{\lt}\tfrac 12,\quad M_2=\sup _{x\in U_r(x^*)}|f”(x)|, \label{eq3} \end{eqnarray}

and under the assumption H : \(f'(x)\neq 0\), \(f''(x)\) preserve sign in the small neighborhood \(U_r(x^*)=\{ x:|f(x)|{\lt}r\} .\) However the iterations (??) and (??) are not equipped with a suitable choice of parameter \(\tau _n\). In [ 18 ] it was shown that the iterations (??) and (??) have a two-sided approximation behavior under conditions

\begin{eqnarray} & & \tau _n\in I_{2n}=\Big[t\tfrac {1-\sqrt{1-2a_{2n}}}{a_{2n}}, \tfrac {-1+\sqrt{1+4a_{2n}}}{a_{2n}}\Big)\subseteq (1, 2), \quad a_{2n}{\lt}\tfrac 49.\label{eq4} \end{eqnarray}

It is also proved that the convergence rate of these iterations is at least 2, and the order of convergence is increased up to 4, when \(\tau _n\to 1\). From this it is clear that the accelerating of the convergence of iterations (??) and (??) is important, especially at the early stage of iterations.

3 Monotone and two-sided convergence of iterations
and their acceleration

If \(\tau _n\equiv 1\), then the iterations (??) and (??) become as Newton’s one

\begin{eqnarray} x_{n+1}=x_n-\tfrac {f(x_n)}{f’(x_n)},\quad n=0,1\ldots \label{eqn5} \end{eqnarray}

According to [ 19 ] the iteration (3.6) is a monotone convergent under condition (2.4) and assumption H.

Let \(\theta _n=\displaystyle \tfrac {f(x_{n+1})}{f(x_n)}\). Then the Taylor expansion of \(f(x_{n+1})\) gives

\begin{eqnarray} 0{\lt}\theta _n\leq \tfrac {a_n}{2}{\lt}\tfrac 14.\label{eqn6} \end{eqnarray}

Now we proceed to accelerate the convergence of monotone iteration (3.6). To this end, we use two known approximations \(x_n\), \(x_{n+m}\) satisfying either \(x_{n}{\lt}x_{n+m}{\lt}x^*\) or \(x^*{\lt}x_{n+m}{\lt}x_n\) and consider

\begin{eqnarray} y_n=x_n+t(x_{n+m}-x_n),\quad t{\gt}1.\label{eq13} \end{eqnarray}

From (3.8) it is clear that \(y_n\) belongs to interval connecting \(x_n\) and \(x_{n+m}\) under condition \(0\leq t\leq 1\). Hence, the extrapolating approach corresponds to the case \(t{\gt}1\). Our aim is to find the optimal value of \(t=t_{\textnormal{opt}}\) in (3.8) such that the new approximation \(y_n\) given by (3.8) will be situated more close to \(x^*\) as compared with \(x_n\) and \(x_{n+m}\). We use Taylor expansion of the smooth function \(f(x)\in \mathcal{C}^{k+1}[a,b]\):

\begin{eqnarray} & & f(y_n)=f(x_n)+f’(x_n)t(x_{n+m}-x_n)+\ldots \nonumber \\ & & \quad \hspace{1.2cm}+\tfrac {f^{(k)}(x_n)}{k!}t^k(x_{n+m}-x_n)^k+\mathcal{O}\big((x_{n+m}-x_n)^{k+1}\big).\label{eq14} \end{eqnarray}

Neglecting small term \(\mathcal{O}\big((x_{n+m}-x_n)^{k+1}\big)\) in (3.9), we have

\begin{equation} f(y_n)\approx f(x_n)+f'(x_n)t(x_{n+m}-x_n)+\ldots +\tfrac {f^{(k)}(x_n)}{k!}t^k(x_{n+m}-x_n)^k\equiv P_k(t).\label{eq15} \end{equation}
3.10

From (3.10) it is clear that

\begin{eqnarray} f(x_n)=P_k(0).\label{eq16} \end{eqnarray}

We also require that

\begin{eqnarray} f(x_{n+m})=P_k(1).\label{eq17} \end{eqnarray}

From (3.12) we find \(\displaystyle \tfrac {f^{(k)}(x_n)}{k!}(x_{n+m}-x_n)^k\) and substituting it into (3.10), we get \(P_k(t)\). From this we find \(t{\gt}1\) such that

\begin{eqnarray} f(y_n)\approx P_k(t)=0.\label{eq18} \end{eqnarray}

Geometrically, (3.10) means that the graph (plot) of \(f(x)\) in the vicinity of root \(x^*\) is replaced by curve \(P_k(t)\) passing through the given points \((x_n,f(x_n))\) and \((x_{n+m},f(x_{n+m}))\). Thus, Eq. (3.10) for \(k=1,2\) and \(k=3\) gives us

\begin{eqnarray} \label{eq19} P_0(t)& =& f(x_n)\approx f(y_n),\quad y_n=x_n,\@bsphack \if@filesw {\let\thepage \relax \def\protect {\noexpand \noexpand \noexpand }\edef\@tempa {\write \@auxout {\string \newlabel {eqn19a}{{\thesubequation }{\thepage }}}}}\write\@auxout {\string \newlabel {eqn19a}{{\thesubequation }{\thepage }}}\if@nobreak \fi \fi \@esphack \\ P_1(t)& =& f(x_n)+\left( f(x_{n+m})-f(x_n) \right)t,\@bsphack \if@filesw {\let\thepage \relax \def\protect {\noexpand \noexpand \noexpand }\edef\@tempa {\write \@auxout {\string \newlabel {eq19a}{{\thesubequation }{\thepage }}}}}\write\@auxout {\string \newlabel {eq19a}{{\thesubequation }{\thepage }}}\if@nobreak \fi \fi \@esphack \\ P_2(t)& =& f(x_n)+f’(x_n)\left(x_{n+m}-x_n\right)t\nonumber \\ & & +\left(f(x_{n+m})-f(x_n)-f’(x_n)(x_{n+m}-x_n)\right)t^2,\@bsphack \if@filesw {\let\thepage \relax \def\protect {\noexpand \noexpand \noexpand }\edef\@tempa {\write \@auxout {\string \newlabel {eq19b}{{\thesubequation }{\thepage }}}}}\write\@auxout {\string \newlabel {eq19b}{{\thesubequation }{\thepage }}}\if@nobreak \fi \fi \@esphack \\ P_3(t)& =& f(x_n)+f’(x_n)\left(x_{n+m}-x_n\right)t+\tfrac {f”(x_n)}{2}(x_{n+m}-x_n)^2t^2\nonumber \\ & & +\Big(f(x_{n+m})-f(x_n)-f’(x_n)(x_{n+m}-x_n) \\ & & -\tfrac {f”(x_n)}{2}(x_{n+m}-x_n)^2\Big)t^3, \@bsphack \if@filesw {\let\thepage \relax \def\protect {\noexpand \noexpand \noexpand }\edef\@tempa {\write \@auxout {\string \newlabel {eq19c}{{\thesubequation }{\thepage }}}}}\write\@auxout {\string \newlabel {eq19c}{{\thesubequation }{\thepage }}}\if@nobreak \fi \fi \@esphack \nonumber \end{eqnarray}

respectively. Thus, the parameter \(t\) in (3.8) is calculated as a root greater than 1 of Eq. (3.13). In particular, for \(k=1\), we have

\begin{eqnarray} t_{\textnormal{opt}}=\tfrac {f(x_n)}{f(x_n)-f(x_{n+m})}{\gt}1.\label{eq20a} \end{eqnarray}

Since \(P_k(0)P_k(1)=f(x_n)f(x_{n+m}){\gt}0\) for \(k\geq 1\), Eq. (3.13) may have at least one root satisfying the condition \(t^*{\gt}1\). From (3.7) it follows that

\begin{eqnarray} |f(x_{n+1})|{\lt}\tfrac 14|f(x_n)|.\label{eqn7} \end{eqnarray}

Therefore, it is desirable to choose \(n\) and \(m\) such that

\begin{eqnarray} |f(x_{n+m})|{\lt}\big(\tfrac 14\big)^m|f(x_n)|\ll 0.1.\label{eqn8} \end{eqnarray}

This inequality is written in term of \(P_k (t)\) as

\begin{eqnarray} |P_k(1)|\leq \big(\tfrac 14\big)^m|P_k(0)|{\lt}0.1. \end{eqnarray}

On the other hand, from (3.10) we see that \(P'_k(1)\) is not equal to 0 under the assumption H, i.e. \(t=1\) is not a critical point of \(P_k(t)\). Thus, \(P_k(t)\) is decreasing around \(t=1\). Therefore, there exists \(t_{\textnormal{opt}}{\gt}1\) such that \(P_k(t_{\textnormal{opt}})=0\).

Lemma 3.1

Assume that \(f\in \mathcal{C}^4[a,b]\), the assumption H is satisfied and

\begin{eqnarray} |x^*-x_n|=\varepsilon _n{\lt}1.\label{eql9} \end{eqnarray}

Then the following holds

\begin{eqnarray} t_{\textnormal{opt}}-1=\mathcal{O}(\varepsilon _n).\label{eql10} \end{eqnarray}
Proof â–¼
First of all, let us note that the inequality (3.22) is equivalent to
\begin{eqnarray} |f(x_n)|=\mathcal{O}(\varepsilon _n),\label{eql11} \end{eqnarray}

which follows from the expansion

\begin{eqnarray} 0=f(x^*)=f(x_n)+f’(\xi )(x^*-x_n).\nonumber \end{eqnarray}

Of course, \(|x^*-x_{n+m}|{\lt}\varepsilon _n\) and \(|x_{n+m}-x_n|{\lt}\varepsilon _n\) under (3.22).

We also use an analogous expansion for \(P_k(t)\)

\begin{eqnarray} 0=P_k(t_{\textnormal{opt}})=P_k(1)+P’_k(\eta )(t_{\textnormal{opt}}-1),\quad \eta \in (1,t_{\textnormal{opt}}).\label{eql12} \end{eqnarray}

Since \(P_k(t)\) is decreasing around \(t=1\), then \(P'_k(\eta )\neq 0\).

Hence, from (3.24), (3.25) and (3.20) we conclude that

\begin{eqnarray} t_{\textnormal{opt}}-1=-\tfrac {f(x_{n+m})}{P’_k(\eta )}\approx \mathcal{O}(\varepsilon _n).\label{eql13} \end{eqnarray}

The Lemma is proved.

Proof â–¼

Note that in [ 16 ] the iterations was proposed:

\begin{eqnarray} \label{eq37} & & x_{2n+1}=x_{2n}-\tau _n\tfrac {f(x_{2n})}{f’(x_{2n})}, \@bsphack \if@filesw {\let\thepage \relax \def\protect {\noexpand \noexpand \noexpand }\edef\@tempa {\write \@auxout {\string \newlabel {eq37a}{{\thesubequation }{\thepage }}}}}\write\@auxout {\string \newlabel {eq37a}{{\thesubequation }{\thepage }}}\if@nobreak \fi \fi \@esphack \\ & & x_{2n+2}=x_{2n+1}-\tfrac {x_{2n+1}-x_{2n}}{f(x_{2n+1})-f(x_{2n})}f(x_{2n+1}), \quad n=0,1,\ldots ,\@bsphack \if@filesw {\let\thepage \relax \def\protect {\noexpand \noexpand \noexpand }\edef\@tempa {\write \@auxout {\string \newlabel {eq37b}{{\thesubequation }{\thepage }}}}}\write\@auxout {\string \newlabel {eq37b}{{\thesubequation }{\thepage }}}\if@nobreak \fi \fi \@esphack \end{eqnarray}

which has a third order of convergence when \(\tau _n=1\) or \(\tau _n\) tends to 1. It is easy to show that the iteration (??) coincides fully with our acceleration procedure (3.8) and (3.18) with \(m=1\) and \(k=1\). Therefore, one can expect a high acceleration when \(k=2,3\) for Newton’s method.

How to accelerate the convergence rate of iteration (3.6)? The answer of this question gives the following theorem.

Theorem 3.2

Assume \(f(x)\in \mathcal{C}^{k+2}\) and the condition (3.22) is satisfied. Then for \(y_n\) with \(t_{\textnormal{opt}}\) we have

\begin{eqnarray} \tfrac {|x^*-y_n|}{|x^*-x_n|^{k+2}}\approx \mathcal{O}(1),\label{eqn28} \end{eqnarray}

where \(\mathcal{O}\) is the Landau symbol.

Proof â–¼
Let
\begin{eqnarray} & & x^*=x_n+t^*(x_{n+m}-x_n),\quad t^*\geq 1,\nonumber \\ & & y_n=x_n+t_{\textnormal{opt}}(x_{n+m}-x_n).\label{eqn30} \end{eqnarray}

We use Taylor expansions of \(f(x)\in \mathcal{C}^{k+2}\)

\begin{align} 0=& f(x^*)= \label{eqn32} \\ =& \sum \limits ^k_{p=0}\tfrac {f^{(p)}(x_n)}{p!}(t^*)^p(x_{n+m}-x_n)^p \nonumber \! +\! \tfrac {f^{(k+1)}(\eta _n)}{(k+1)!}(t^*)^{(k+1)}(x_{n+m}\! -\! x_n)^{(k+1)}, \nonumber \end{align}
\begin{align} & f(x_{n+m})-\sum \limits ^{k-1}_{p=0}\tfrac {f^{(p)}(x_n)}{p!}(x_{n+m}-x_n)^p=\label{eqn34}\\ & =\tfrac {f^{(k)}(x_n)}{k!}(x_{n+m}-x_n)^k+ \tfrac {f^{(k+1)}(\xi _n)}{(k+1)!}(x_{n+m}-x_n)^{k+1},\nonumber \end{align}

and

\begin{align} 0=P_k(t_{\textnormal{opt}})=& \sum \limits ^{k-1}_{p=0}\tfrac {f^{(p)}(x_n)}{p!}(t_{\textnormal{opt}})^p(x_{n+m}-x_n)^p \label{eqn33} \\ & +\Big(f(x_{n+m})-\sum \limits ^{k-1}_{p=0}\tfrac {f^{(p)}(x_n)}{p!}(x_{n+m}-x_n)^p\Big)(t_{\textnormal{opt}})^k, \nonumber \end{align}

where \(\eta _n\in (x_n,x^*)\) and \(\xi _n\in (x_n,x_{n+m})\). Using (3.31) in (3.32) and subtracting (3.32) from (3.31) we get

\begin{eqnarray} & & \left[f’(x_n)+\tfrac {f”(x_n)}{2}(x_{n+m}-x_n)(t^*+t_{\textnormal{opt}})+\tfrac {f”’(x_n)}{6}(x_{n+m}-x_n)^2\right.\nonumber \\ & & (t^{*^2}+t^*t_{\textnormal{opt}}+t^2_{\textnormal{opt}})+\cdots +\tfrac {f^{(k)}(x_n)}{k!}\left(t^{*^{k-1}}+t^{*^{k-2}}t_{\textnormal{opt}}+\cdots + t_{\textnormal{opt}}^{k-1}\right)\nonumber \\ & & \left.(x_{n+m}-x_n)^{k-1}\right](t^*-t_{\textnormal{opt}})=\nonumber \\ & & =-\tfrac {(x_{n+m}-x_n)^k}{(k+1)!}\left(f^{(k+1)}(\eta _n)t^{*^{k+1}}-f^{(k+1)}(\xi _n)t^k_{\textnormal{opt}}\right).\label{eqn35} \end{eqnarray}

Since \(f'(x_n)\neq 0\), then from last expression we deduce that

\begin{eqnarray} t^*-t_{\textnormal{opt}}=\mathcal{O}(\varepsilon ^k_n).\label{eqn36} \end{eqnarray}

It is possible to derive a more precise estimation than (3.34). Indeed, using (3.34) and \(f\in \mathcal{C}^{k+2}\) we evaluate

\begin{align} A_n& =f^{(k+1)}(\eta _n)t^{*^{k+1}}-f^{(k+1)}(\xi _n)t^k_{\textnormal{opt}}\nonumber \\ & =f^{(k+1)}(\xi _n)(t^{*^{k+1}}-t^k_{\textnormal{opt}})+f^{(k+2)}(\omega _n)(\eta _n-\xi _n). \end{align}

By definition we have

\begin{eqnarray} |\eta _n-\xi _n|\leq |x^*-x_n|=\varepsilon _n. \end{eqnarray}

Using (3.23) and (3.34) we have

\begin{eqnarray} & & t^{*^{k+1}}-t^k_{\textnormal{opt}}=\big(t_{\textnormal{opt}}+\mathcal{O}(\varepsilon ^k_n)\big)^{k+1}-t^k_{\textnormal{opt}}\nonumber \\ & & \qquad = t^k_{\textnormal{opt}}\Bigl(t_{\textnormal{opt}}(1+\mathcal{O}(\varepsilon ^k_n))^{k+1}-1\big)=t^k_{\textnormal{opt}}\big(t_{\textnormal{opt}}+\mathcal{O}(\varepsilon ^k_n)-1\Bigr) =\mathcal{O}(\varepsilon _n). \end{eqnarray}

Then \(A_n=\mathcal{O}(\varepsilon _n)\) and thereby from (3.33) we get

\begin{eqnarray} t^*-t_{\textnormal{opt}}=\mathcal{O}(\varepsilon ^{k+1}_n).\label{eqn37} \end{eqnarray}

Hence, from (3.30) and (3.31) we find that

\begin{eqnarray} x^*-y_n=\mathcal{O}(\varepsilon ^{k+2}_n).\nonumber \end{eqnarray}

which proves (3.29).

Proof â–¼
The sequence \(\{ y_n\} \) given by formula (3.8) can be considered as a new a iteration. For it we have the followlng:

Theorem 3.3

Assume \(f(x)\in \mathcal{C}^{k+1}\) and the convergence order of iterations (3.6) equal to 2 i.e., the following holds

\begin{eqnarray} |x^*-x_n|\leq Mq^{2^n}|x^*-x_0|,\quad 0{\lt}q{\lt}1,\quad M={\rm const.}\label{eq21} \end{eqnarray}

If the equation (3.13) has at least one root \(t_{\textnormal{opt}}\), greater than 1, then the convergence order of new iteration (3.8) is the same as (3.6) and we have

\begin{equation} |x^*-y_n|\leq M_1q_1^{2^n}|x^*-y_0|,\quad 0 3.40

Proof â–¼

By virtue of (3.39) the condition (3.22) is satisfied for large \(n\). Then by Theorem 3.2, the relation (3.29) holds. Using (3.39) in (3.29), we get

\begin{align} |x^*-y_n|& \leq C\big(q^{d^n}\big)^{k+2}|x^*-x_0|^{k+2}=C\big(q^{k+2}\big)^{d^n}|x^*-x_0|^{k+2}\nonumber \\ & =Cq_1^{d^n}|x^*-x_0|^{k+2}\leq M_1 q_1^{d^n}|x^*-y_0|,\quad q_1=q^{k+2}{\lt}q{\lt}1. \end{align}

The proof is completed.

Proof â–¼
Theorem 3.3 shows that the convergence order of iteration (3.8) is the same as iteration (3.6).

However, the speed of convergence of these iterations depends on the factor \(q_1\) and \(q\) in (3.39) and (3.40), respectively. Since \(q_1=q^{k+2}{\lt}q\) for \(k=1,2,3\), one can expect a more rapid convergence of iteration (3.8). Of course, the higher is acceleration of iteration attained at \(k=3\).

From (3.39) and (3.40) it is clear that the iteration (3.8) converges to \(x^*\) more rapidly than iteration (3.6) by virtue of \(q_1=q^{k+1}{\lt}q\). This accelerating procedure is useful, especially at the beginning of iterations, but under condition (3.22). From Theorem 3.3, it is clear that the sequence \(\{ y_n\} \) given by (3.8) together with (3.6) can be considered as a new iteration process with a small factor compared to (3.6). The acceleration procedure is achieved without additional calculations, so that the iteration (3.8) possesses a high computational efficiency. However, despite the sequence \(x_n\) is monotone, the new iteration (3.8) may not be monotone. For instance, when \(k=1\) it is easy to show that

\begin{eqnarray} f(y_n)=\tfrac {f”(\xi _n)}{2}(x_{n+m}-x_n)^2. \end{eqnarray}

From this it is clear that

\begin{eqnarray} f(y_n){\gt}0\quad \textnormal{ if} \quad f”(x){\gt}0, \end{eqnarray}

and

\begin{eqnarray} f(y_n){\lt}0\quad \textnormal{ if} \quad f”(x){\lt}0. \end{eqnarray}

Let us know two successive approximations \(x_{n}\) and \(x_{n+1}\), for which

\begin{eqnarray} f(x_{n})f(x_{n+1}){\lt}0\label{eq32} \end{eqnarray}

holds. We consider

\begin{eqnarray} y_n=x_{n}+t(x_{n+1}-x_{n}),\quad 0\leq t\leq 1.\label{eq33} \end{eqnarray}

The acceleration technique will be the same as a previous case with \(m=1\). In this case, according to (3.45) we have \(P_k(0)P_k(1)=f(x_{n})f(x_{n+1}){\lt}0\) for \(k=1,2,3.\) Hence, Eq. (3.13) has a root \(t_{\textnormal{opt}}\in (0,1)\). Obviously, the new approximation

\begin{eqnarray} y_n=x_{n}+t_{\textnormal{opt}}(x_{n+1}-x_{n}),\quad 0\leq t\leq 1,\label{eq34} \end{eqnarray}

will be situated more close to \(x^*\) as compared to \(x_{n}\), \(x_{n+1}\) and Theorem 3.2 holds true for this case, too. It indicates that the two-sided approximations are useful not only for estimations of roots, but also for finding it approximately with a higher accuracy. Of course, the acceleration procedure (3.8) can be continued further with \(x_{n+m}:=y_n\), \(x_n:=x_{n+m}\) and with \(t{\gt}1\) if \(y_n\) and \(x_{n+m}\) are located on side of \(x^*\) and with \(t\in (0,1)\) if \(y_n\) and \(x_{n+m}\) are located on two-sides of root. Note that the accelerating procedure (3.8) is applicable not only for iterations (3.6), but also for any iteration, in particular, for the following iterations (A), (B), (C) and (D).

Now we consider the accelerated iteration

\begin{equation} \tag {A} y_n=x_n-\tfrac {f(x_n)}{f’(x_n)},\quad x_{n+1}=x_n+t_{\textnormal{opt}}(y_n-x_n),\quad n=0,1,\ldots \end{equation}
3.48

The iteration (A) is a damped Newton’s method [ 20 , 17 ] with optimal parameter \(\tau _n=t_{\textnormal{opt}}\). The first step \(y_n\) is used for finding the optimal parameter.

Theorem 3.4

Assume that the assumptions of Theorem 3.2 are fulfilled. Then the convergence order of iteration (A) with optimal \(t_{\textnormal{opt}}\) is \(d=k+2,\, k=1,2,3\), depending on the smoothness of \(f\).

Proof â–¼
If we compare (A) with (3.6) and (3.8), then \(x_{n+m}:=y_n\) and \(y_n:=x_{n+1}\). Therefore, the expression (3.29) in the Theorem 3.2 has a form
\begin{eqnarray} \tfrac {|x^*-x_{n+1}|}{|x^*-x_n|^{k+2}}=\mathcal{O}(1)\Longleftrightarrow |x^*-x_{n+1}|\leq M|x^*-x_n|^{k+2}, \end{eqnarray}

which completes the proof of the Theorem 3.4.

Proof â–¼

Now let us consider another three-step iteration

\begin{align} y_n=& x_n-\tfrac {f(x_n)}{f’(x_n)},\quad z_n=y_n-\tfrac {f(y_n)}{f’(x_n)},\nonumber \\ \tag {B} x_{n+1}=& y_n+t(z_n-y_n),\quad n=0,1,\ldots \end{align}

Note that if \(t\equiv 1\) in (B), then it leads to

\begin{equation} \tag {B$^\prime $} y_n=x_n-\tfrac {f(x_n)}{f’(x_n)},\quad x_{n+1}=y_n-\tfrac {f(y_n)}{f’(x_n)},\quad n=0,1,\ldots \end{equation}
3.51

The iteration (\(B'\)) is a particular case of scheme (40) given in [ 16 ] with \(\sigma =0\) end \(\tau =1\) and has a third order of convergence. Therefore, the iteration (B) can be considered as improvement of iteration (B\('\)).

Theorem 3.5

The assumptions of the Theorem 3.2 are fulfilled. Then the convergence order of iteration (B) with optimal \(t_{\textnormal{opt}}\) equal to \(d=2k+3\).

Proof â–¼
If we compare (B) with (3.8), then \(x_n:=y_n\), \(x_{n+m}:=z_n\), \(y_n:=x_{n+1}\). Then form (3.33) and (3.24) we get
\begin{eqnarray} t^*-t_{\textnormal{opt}}=MA_n(z_n-y_n)^k\approx \mathcal{O}(\varepsilon ^{2k+1}_n),\label{eqn38} \end{eqnarray}

where

\begin{eqnarray} & & x^*=y_n+t^*(z_n-y_n),\quad x_{n+1}=y_n+t_{\textnormal{opt}}(z_n-y_n). \end{eqnarray}

From this and from (3.52) we obtained

\begin{equation} x^*-x_{n+1}=(t^*-t_{\textnormal{opt}})(z_n-y_n)\approx \mathcal{O}(\varepsilon ^{2k+1}_n)\mathcal{O}(\varepsilon ^{2}_n)=\mathcal{O}(\varepsilon ^{2k+3}_n), \end{equation}
3.54

i.e., we have

\begin{eqnarray} |x^*-x_{n+1}|\leq M_1|x^*-x_n|^{2k+3},\nonumber \end{eqnarray}

which means that the convergence order of iteration (B) is equal to \(d=2k+3\), \(k=1,2,3\).

Proof â–¼

From the Theorem 3.5, we see that the convergence order of iteration (B\('\)) can be increased two or four units at the expense of only two additional evaluations of the function. So the order of convergence and the computational efficiency of the method are greatly improved.

In [ 5 ] Algorithm 2 was constructed:

\begin{eqnarray} z_n=x_n-\tfrac {(x_n)}{f’(x_n)},\quad x_{n+1}=z_n-H(x_n,y_n)\tfrac {f(z_n)}{f’(x_n)},\nonumber \end{eqnarray}

and it is proved that the order of convergence equals 5, 6, 7 depending on a suitable choice of two-variable function \(H(x_n,y_n)\). For comparison purpose we can rewrite iteration (B) as

\begin{eqnarray} y_n=x_n-\tfrac {f(x_n)}{f’(x_n)},\quad x_{n+1}=y_n-t\tfrac {f(y_n)}{f’(y_n)}.\nonumber \end{eqnarray}

We see that these two methods are different from one another only by chosen factors \(t\) and \(H(x_n,y_n)\).

Now we consider the following iteration:

\begin{equation} \tag {C} y_n=x_n-\tfrac {f(x_n)}{f’(x_n)},\: z_n=y_n-\tfrac {f(y_n)}{f’(y_n)},\: x_{n+1}=y_n+t(z_n-y_n),\: n=0,1,\ldots \end{equation}
3.55

The iteration (C) can be considered as improvement of iteration

\begin{equation} \tag {C$'$} y_n=x_n-\tfrac {f(x_n)}{f’(x_n)},\quad x_{n+1}=y_n-\tfrac {f(y_n)}{f’(y_n)},\quad n=0,1,\ldots , \end{equation}
3.56

since if \(t\equiv 1\) in (C), then it leads to (C\('\)).

In [ 16 ] , it was proven that the convergence order of (C\('\)) is four.

Theorem 3.6

The assumptions of Theorem 3.2 are fulfilled. Then the convergence order of iteration (C) with optimal \(t_{\textnormal{opt}}\) equal to \(d=2(k+2)\), \(k=1,2,3\).

Proof â–¼
If we compare (C) with (3.8), then \(x_n:=y_n\), \(x_{n+m}:=z_n\), \(y_n:=x_{n+1}\). Therefore, the expression (3.29) reads as
\begin{eqnarray} \tfrac {|x^*-x_{n+1}|}{|x^*-y_n|^{k+2}}=\mathcal{O}(1)\Longleftrightarrow |x^*-x_{n+1}|\leq M|x^*-y_n|^{k+2}.\label{eqnl38} \end{eqnarray}

From (C), we find that

\begin{eqnarray} x^*-y_n=x^*-x_n+\tfrac {f(x_n)-f(x^*)}{f’(x_n)}. \end{eqnarray}

Substituting here the expansion of \(f(x^*)\)

\begin{eqnarray} f(x^*)=f(x_n)+f’(x_n)(x^*-x_n)+\tfrac {f”(\xi _n)}{2}(x^*-x_n)^2, \end{eqnarray}

we have

\begin{eqnarray} |x^*-y_n|\leq \tfrac {|f”(\xi _n)|}{|f’(x_n)|}|x^*-x_n|^2. \end{eqnarray}

Using the last estimate in (3.52), we obtain

\begin{eqnarray} |x^*-x_{n+1}|\leq M_1|x^*-x_n|^{2(k+2)},\nonumber \end{eqnarray}

which means that the convergence order of iteration (C) equals \(d=2(k+2)\), \(k=1,2,3\).

Proof â–¼
Note that the iterations (A), (B) and (C) can be rewritten as a damped Newton’s method [ 20 ]
\begin{align} x_{n+1}=& x_n-\tau _n\tfrac {f(x_n)}{f’(x_n)},\label{eqn46} \\ \tau _n=& t_{\textnormal{opt}},\label{eqn47} \\ \tau _n=& 1+t_{\textnormal{opt}}\tfrac {f(y_n)}{f(x_n)},\label{eqn48} \\ \tau _n=& 1+t_{\textnormal{opt}}\tfrac {f(y_n)}{f(x_n)}\tfrac {f’(x_n)}{f’(y_n)},\label{eqn49} \end{align}

respectively. The unified representation (3.61) of different iterations shows that the choice of the damped parameter \(\tau _n\) in (3.61) is essentially affected for the convergence order. Of course, the parameter \(\tau _n\) in (3.61) is defined by different ways, but in all cases \(\tau _n\rightarrow 1\) as \(n\rightarrow \infty \).

The speed of convergence of sequence \(\{ \tau _n\} \) to unit is different for each iteration methods. In [ 17 ] the conjecture was proposed:

\begin{eqnarray} |1-\tau _n|\leq Mq^{\rho ^n},\quad 0{\lt}q{\lt}1.\label{eqn50} \end{eqnarray}

Now we consider the following three-point iterative method:

\begin{align} y_n=& x_n-\tfrac {f(x_n)}{f’(x_n)},\quad z_n=x_n+\bar{t}(y_n-x_n),\nonumber \\ x_{n+1}=& y_n+t(z_n-y_n),\quad n=0,1,\ldots ,\tag {D} \end{align}

where \(\bar{t}\) and \(t\) in (D) are some parameters to be determined. We can formulate the following theorem.

Theorem 3.7

Assume that \(f(x)\in \mathcal{C}^4(a,b)\) and an initial approximation \(x_0\) is sufficiently close to the zero \(x^*\in (a,b)\) and the parameter \(\bar{t}\) is chosen by as a root of equation

\begin{eqnarray} \theta _n\bar{t}^2-\bar{t}+1=0,\quad \theta _n=\tfrac {f(y_n)}{f(x_n)},\label{eqt9a} \end{eqnarray}

and \(t\) is a root of equation

\begin{eqnarray} \Psi (t,\alpha )=\alpha \Psi _1(t)+(1-\alpha )\Psi _2(t)=0,\label{eqt9} \end{eqnarray}

where

\begin{align} \Psi _1(t)& =at^2-\big(a+\tfrac {f(x_n)}{f(y_n)}\big(f(z_n)-f(y_n)\big)\big)t-f(x_n),\label{eqt10}\\ a& =-2f(z_n)-f(x_n)(1-\bar{t})^2,\nonumber \end{align}

and

\begin{align} \Psi _2(t)& =\big((1-\bar{t})(2-\bar{t})f(x_n)-(2-3\bar{t})f(z_n)\big)t\nonumber \\ & \quad +(1-\bar{t})\big(2f(z_n)-(2-\bar{t})f(x_n)\big).\label{eqt101} \end{align}

Then the three-point methods (D) is of eight order of convergence.

Proof â–¼
Using \(z_n-y_n=(1-\bar{t})\displaystyle \tfrac {f(x_n)}{f’(x_n)}\) in Taylor expansion
\begin{eqnarray} f(x_{n+1})=f(y_n)+f’(y_n)t(z_n-y_n)+\tfrac {f”(y_n)}{2}t^2(z_n-y_n)^2+\mathcal{O}\big((z_n-y_n)^3\big),\nonumber \end{eqnarray}

we get

\begin{align} f(x_{n+1})& =f(y_n)+t(1-\bar{t})\tfrac {f’(y_n)}{f’(x_n)}f(x_n)\nonumber \\ & \quad +\tfrac {f”(y_n)}{2}t^2(1-\bar{t})^2\tfrac {f^2(x_n)}{\big(f’(x_n)\big)^2}+\mathcal{O}\big(f^6(x_n)\big).\label{eqt11} \end{align}

Analogously, the Taylor expansion of \(f(x_{n+1})\) at point \(x=z_n\) gives

\begin{align} f(x_{n+1})& =f(z_n)-(1-t)(1-\bar{t})\tfrac {f’(z_n)}{f’(x_n)}f(x_n)\nonumber \\ & \qquad +\tfrac {f”(z_n)}{2}(1-t)^2(1-\bar{t})^2\tfrac {f^2(x_n)}{\big(f’(x_n)\big)^2}+\mathcal{O}\big((1-t)^3f^6(x_n)\big).\label{eqtt11} \end{align}

Using \(f'(z_n)=f'(y_n)+f''(y_n)(z_n-y_n)+\mathcal{O}\big((z_n-y_n)^2\big)\) in the last expansion, we have

\begin{align} f(x_{n+1})& =f(z_n)-(1-t)(1-\bar{t})\tfrac {f’(z_n)}{f’(x_n)}f(x_n)\nonumber \\ & \quad +\tfrac {f”(z_n)}{2}(1-t)^2(1-\bar{t})^2\tfrac {f^2(x_n)}{\big(f’(x_n)\big)^2}+\mathcal{O}\big((1-t)^3f^6(x_n)\big). \end{align}

Using \(f'(z_n)=f'(y_n)+f''(y_n)(z_n-y_n)+\mathcal{O}\Bigl((z_n-y_n)^2\Bigr)\) in the last expansion, we have

\begin{align} f(x_{n+1})\! =& f(z_n)-(1-t)(1-\bar{t})\tfrac {f’(y_n)}{f’(x_n)}f(x_n)\nonumber \\ & -\tfrac {f”(y_n)f^2(x_n)}{2\big(f’(x_n)\big)^2}(1-t^2)(1-\bar{t})^2\! +\! \mathcal{O}\big(f^8(x_n)\big).\label{eqt12} \end{align}

From (3.71) and (3.74) one can eliminate term with \(\displaystyle \tfrac {f’(y_n)}{f’(x_n)}f(x_n)\). As a result, we have

\begin{align} f(x_{n+1})\! =& tf(z_n)\! +\! (1\! -\! t)f(y_n) \! -\! \tfrac {f”(y_n)f^2(x_n)}{2\big(f’(x_n)\big)^2}(1\! -\! \bar{t})^2t(1\! -\! t)\! +\! \mathcal{O}\big(f^8(x_n)\big).\label{eqt13} \end{align}

Note that in driving (3.75), we keep in mind that

\begin{eqnarray} 1-t=\mathcal{O}\big(f^2(x_n)\big).\label{eqt14} \end{eqnarray}

Further, using Taylor expansion of \(f(x)\in \mathcal{C}^4(\mathfrak {D})\) at points \(y_n\) we obtain

\begin{align} f”(y_n)=& \tfrac {2\big(f’(x_n)\big)^2}{f^2(x_n)\bar{t}(1-\bar{t})}\big[(1-\bar{t})f(x_n)+\bar{t}f(y_n)-f(z_n)\big)\nonumber \\ & -\tfrac {f”’(y_n)}{3}\frac{f(x_n)}{f'(x_n)}(2-\bar{t})+\mathcal{O}\big(f^2(x_n)\big].\label{eqt15} \end{align}

The same technique gives us

\begin{align} f”(y_n)=& \tfrac {2\big(f(z_n)-(1-\bar{t})f(x_n)\big)}{\bar{t}^2f^2(x_n)}\big(f’(x_n)\big)^2 \! -\! \tfrac {f”’(y_n)}{3}\tfrac {f(x_n)}{f’(x_n)}(3\! -\! \bar{t})\! +\! \mathcal{O}\big(f^2(x_n)\big).\label{eqt16} \end{align}

For (3.77) and (3.78) one can eliminate the term with \(f'''(y_n)\). As a result, we obtain

\begin{align} \tfrac {f”(y_n)f^2(x_n)}{2\big(f’(x_n)\big)^2}=& \tfrac {1}{\bar{t}^2(1-\bar{t})}\big((3-\bar{t})\bar{t}\big(-f(z_n)+\bar{t}f(y_n)+(1-\bar{t})f(x_n)\big)\big.\nonumber \\ & \Biggl.-(2-\bar{t})(1-\bar{t})\Bigl(f(z_n)-(1-\bar{t})f(x_n)\Bigr)\Bigr)+\mathcal{O}\Bigl(f^4(x_n)\Bigr).\label{eqt17} \end{align}

Substituting (3.79) into (3.74), we obtain

\begin{eqnarray} f(x_{n+1})=\Psi _1(t)+\mathcal{O}\big(f^8(x_n)\big),\label{eqt18} \end{eqnarray}

where

\begin{align} \Psi _1(t)& =at^2-\big(a+\tfrac {f(x_n)}{f(y_n)}\big(f(z_n)-f(y_n)\big)\big)t-f(x_n),\label{eqtt18}\\ a& =-2f(z_n)-f(x_n)(1-\bar{t})^2.\nonumber \end{align}

On the other hand, by virtue of (D) we have

\begin{eqnarray} x_{n+1}-z_n=-(1-\bar{t})(1-t)\tfrac {f(x_n)}{f’(x_n)}.\label{lin1} \end{eqnarray}

If we take (3.67) and (3.76) into account in (3.82), from it we deduce

\begin{eqnarray} x_{n+1}-z_n=\mathcal{O}\bigl(f^4(x_n)\bigr).\nonumber \end{eqnarray}

Then, from (3.72) we get

\begin{eqnarray} f(x_{n+1})=f(z_n)+(1-\bar{t})(1-t)\tfrac {f’(z_n)}{f’(x_n)}f(x_n)+\mathcal{O}\bigl(f^8(x_n)\bigr).\label{lin2} \end{eqnarray}

Now we approximate \(f'(z_n)\) by the method of undetermined coefficient such that

\begin{equation} f'(z_n)\approx a_nf(x_n)+b_nf(y_n)+c_nf(z_n)+d_nf'(x_n)+\mathcal{O}\bigl(f^4(x_n)\bigr),\label{lin3} \end{equation}
3.84

This can be done by means of Taylor expansion of \(f(x)\in \mathcal{C}^4(a,b)\) at point \(z_n\) and we obtain the following linear system of equations

\begin{eqnarray} \left\{ \begin{array}{l} a_n+b_n+c_n=0,\\[1mm] a_n(x_n-z_n)+b_n(y_n-z_n)+d_n=1,\\[1mm] a_n(x_n-z_n)^2+b_n(y_n-z_n)^2+2d_n(x_n-z_n)=0,\\[1mm] a_n(x_n-z_n)^3+b_n(y_n-z_n)^3+2d_n(x_n-z_n)^2=0, \end{array} \right. \nonumber \end{eqnarray}

which has a unique solution

\begin{eqnarray} & & a_n=\tfrac {\beta _n(2\beta _n-3\omega _n)}{\omega _n(\beta _n-\omega _n)^2},\quad b_n=\tfrac {\omega _n^2}{\beta _n(\beta _n-\omega _n)^2},\\ \nonumber & & c_n=-\tfrac {2\beta _n+\omega _n}{\beta _n\omega _n},\quad d_n=-\tfrac {\beta _n}{\beta _n-\omega _n},\label{lin4} \end{eqnarray}

where

\begin{eqnarray} \omega _n=x_n-z_n=\bar{t}\tfrac {f(x_n)}{f’(x_n)},\quad \beta _n=y_n-z_n=(1-\bar{t})\tfrac {f(x_n)}{f’(x_n)}.\nonumber \end{eqnarray}

Substituting (3.84) with coefficients defined by (??) into (3.83), we get

\begin{eqnarray} f(x_{n+1})=\Psi _2(t)+\mathcal{O}\big(f^8(x_n)\big),\label{lin5} \end{eqnarray}

where

\begin{align} \Psi _2(t)& =\big((1-\bar{t})(2-\bar{t})f(x_n)-(2-3\bar{t})f(z_n)\big)t\\ \nonumber & \quad +(1-\bar{t})\big(2f(z_n)-(2-\bar{t})f(x_n)\big).\label{lin6} \end{align}

The linear combination of (3.80) and (3.86) gives

\begin{eqnarray} f(x_{n+1})=\alpha \Psi _1(t)+(1-\alpha )\Psi _2(t)+\mathcal{O}\big(f^8(x_n)\big).\nonumber \end{eqnarray}

Clearly, if we choose \(t\) as a root of quadratic equations

\begin{eqnarray} \Psi (t,\alpha )=\alpha \Psi _1(t)+(1-\alpha )\Psi _2(t)=0,\label{lin7} \end{eqnarray}

then we have

\begin{eqnarray} f(x_{n+1})=\mathcal{O}\big(f^8(x_n)\big)\nonumber \end{eqnarray}

which completes the proof.

Proof â–¼

Remark 3.8

Since \(t\) is a root of Eq. (3.88), it depends on the parameter \(\alpha \), i.e. \(t=t(\alpha )\). Therefore, (D) are one parameter family of iterations.â–¡

It is easy to show that

\begin{eqnarray} \Psi _2(\hat{t})=0,\quad \hat{t}\rightarrow 1,\quad \Psi _1(\breve{t})=0,\quad \breve{t}\rightarrow 1.\nonumber \end{eqnarray}

Then taking this into account and passing to the limit \(t\rightarrow 1\) in Eq. (3.88), we get

\begin{eqnarray} \Psi (t,\alpha )\xrightarrow []{t\rightarrow 1}\alpha \Psi _1(1)+(1-\alpha )\Psi _2(1)\approx 0.\nonumber \end{eqnarray}

It means that Eq. (3.88) or (3.68) has a root tending to unit for any \(\alpha \in [0,1]\).

We recall that, according to Kung-Traub hypothesis, the order of convergence of any multipoint method without memory cannot exceed the bound \(2^{n-1}\) (called optimal order), where \(n\) is the number of function evaluations per iteration. As is known, the efficiency index of iteration defined by formula \(E=d^{\frac1m}\), where \(d\) is the convergence order and \(m\) is the number of function and its derivative evaluations per iteration. Therefore, the optimal efficiency index would be \(2^{\frac{n-1}{n}}\).

According to the Theorem 3.4, the iteration (A) has the convergence order fourth for \(k=2\), requiring only three function evaluations (\(f(x_n)\), \(f(y_n)\) and \(f'(x_n)\)), whereas Theorem 3.7 shows that the iteration (D) has the convergence order eight, requiring four function evaluations \((f(x_n),f(y_n),f(z_n),f'(x_n))\).

Methods

\(k\)

\(d\)

\(E\)

 

1

3

\(3^{\frac13}\approx 1.442249\)

(A)

2

4

\(4^{\frac13}\approx 1.587401\)

 

3

5

\(5^{\frac14}\approx 1.495348\)

 

1

5

\(5^{\frac14}\approx 1.495348\)

(B)

2

7

\(7^{\frac15}\approx 1.475773\)

 

3

9

\(9^{\frac15}\approx 1.551845\)

 

1

6

\(6^{\frac15}\approx 1.430969\)

(C)

2

8

\(8^{\frac15}\approx 1.515716\)

 

3

10

\(10^{\frac16}\approx 1.467799\)

(D)

-

8

\(8^{\frac14}\approx 1.681792\)

Table 3. The efficiency index of the methods (A), (B), (C) and (D).

Hence, this order of convergence is optimal in the above mentioned sense of the Kung-Traub conjecture. This efficiency index is \(4^{\frac13}\approx 1.587\) and \(8^{\frac14}\approx 1.681\), respectively.

Thus, we obtain the iterations (A) and (D) with the optimal order of convergence 4 and 8, accelerating Newton’s method. Our procedure of accelerations gives a genuine improvement of Newton’s method. One of the advantages of iterations (A) and (D) is that these methods work well for the system of nonlinear equations, whereas the optimal order methods in [ 1 , 2 , 3 , 10 , 12 ] do not extend to the system of equations.

For convenience we present the efficiency index of the proposed above methods (A), (B), (C) and (D) in Table 3.. From Table 3. one can see that the efficiency index of the iterations (A), (B), (C) and (D) is better or much better than that of Newton’s method \(\sqrt2\approx 1.414.\)

4 Finding optimal parameter

Let \(m=1\) in (3.8). Then the root (3.18) can be written as

\begin{eqnarray} t^{(1)}_{\textnormal{opt}}=\tfrac {1}{1-\theta _n},\quad \theta _n=\tfrac {f(y_n)}{f(x_n)}.\@bsphack \if@filesw {\let\thepage \relax \def\protect {\noexpand \noexpand \noexpand }\edef\@tempa {\write \@auxout {\string \newlabel {eqn8'}{{\thesubequation }{\thepage }}}}}\write\@auxout {\string \newlabel {eqn8'}{{\thesubequation }{\thepage }}}\if@nobreak \fi \fi \@esphack \end{eqnarray}

For \(k=2\), from (??) we obtain

\begin{eqnarray} P_2(t)\equiv f(y_n)t^2-f(x_n)t+f(x_n)=0,\@bsphack \if@filesw {\let\thepage \relax \def\protect {\noexpand \noexpand \noexpand }\edef\@tempa {\write \@auxout {\string \newlabel {eqn9a}{{\thesubequation }{\thepage }}}}}\write\@auxout {\string \newlabel {eqn9a}{{\thesubequation }{\thepage }}}\if@nobreak \fi \fi \@esphack \end{eqnarray}

or

\begin{eqnarray} P_2(t)\equiv \theta _nt^2-t+1=0.\@bsphack \if@filesw {\let\thepage \relax \def\protect {\noexpand \noexpand \noexpand }\edef\@tempa {\write \@auxout {\string \newlabel {eqn9b}{{\thesubequation }{\thepage }}}}}\write\@auxout {\string \newlabel {eqn9b}{{\thesubequation }{\thepage }}}\if@nobreak \fi \fi \@esphack \end{eqnarray}

By the well known assertion and (3.19) we have

\begin{eqnarray} t_1+t_2=\tfrac {f(x_n)}{f(y_n)}{\gt}1,\quad t_1t_2=\tfrac {f(x_n)}{f(y_n)}.\nonumber \end{eqnarray}

Hence

\begin{eqnarray} t_1+t_2=t_1t_2.\nonumber \end{eqnarray}

From this we obtian

\begin{eqnarray} t^2_1-\tfrac {f(x_n)}{f(y_n)}t_1+\tfrac {f(x_n)}{f(y_n)}=0.\label{eqn9'} \end{eqnarray}

The root of (4.93) greater than 1 is

\begin{eqnarray} t^{(2)}_{\textnormal{opt}}=\tfrac {1-\sqrt{1-4\theta _n}}{2\theta _n}=\tfrac {2}{1+\sqrt{1-4\theta _n}}.\label{eqn10} \end{eqnarray}

In a similar way, from (3.6) and (??) we obtain

\begin{eqnarray} P_3(t)=(\theta _n-\omega _n)t^3+\omega _nt^2-t+1=0,\label{eqn11} \end{eqnarray}

where

\begin{eqnarray} \omega _n=\tfrac {f”(x_n)f(x_n)}{2\bigl(f’(x_n)\bigr)^2}.\label{eqn12} \end{eqnarray}

Since in all iterations (A), (B), (C) we have

\begin{eqnarray} f(y_n)=\tfrac {f”(x_n)}{2}\tfrac {f_n^2}{(f’_n)^2}+\mathcal{O}\big(f^3(x_n)\big), \end{eqnarray}

then

\begin{eqnarray} \omega _n=\tfrac {f(y_n)}{f(x_n)}+\mathcal{O}\big(f^2(x_n)\big).\label{eql14} \end{eqnarray}

Using (3.26) and (4.98) in (4.95) we obtain approximates equation

\begin{eqnarray} \left(\tfrac {f(z_n)}{f(y_n)}-\tfrac {f(y_n)}{f(x_n)}\right)t^3+\tfrac {f(y_n)}{f(x_n)}t^2-t+1=0.\label{eql15} \end{eqnarray}

Eq. (4.99) approximates (4.95) with accuracy \(\mathcal{O}(f_n^4)\) in case of (A) and with accuracy \(\mathcal{O}(f_n^5)\) in case of (B) and (C) since \(t_{\textnormal{opt}}-1=\mathcal{O}(\varepsilon _n^2)\) for (A) and \(t_{\textnormal{opt}}-1=\mathcal{O}(\varepsilon _n^3)\) for (B) and (C). Therefore, Eq. (4.99) may be useful especially for (A).

Above, we obtain formula (4.90) for finding optimal value \(t_{\textnormal{opt}}\) for iteration (A). However, it may be changed for iteration (B) and (C). Since \(x_{n+m}:=z_n\), \(x_n:=y_n\) and \(y_n:=x_{n+1}\) for iteration (B), then according to (??) we have

\begin{eqnarray} P_2(t)\equiv f(y_n)+f’(y_n)(z_n-y_n)t+\big(f(z_n)-f(y_n)-f’(y_n)(z_n-y_n)\big)t^2=0,\nonumber \end{eqnarray}

or

\begin{eqnarray} P_2(t)\equiv \left(\tfrac {f(z_n)}{f(y_n)}-1+\tfrac {f’(y_n)}{f’(x_n)}\right)t^2-\tfrac {f’(y_n)}{f’(x_n)}t+1=0.\label{eql40} \end{eqnarray}

We rewrite (4.100) as

\begin{eqnarray} P_2(t)=\tfrac {f(z_n)}{ f(y_n)}t^2-t+1+\Big(1-\tfrac {f’(y_n)}{f’(x_n)}\Big)t(1-t)=0. \end{eqnarray}

From the last equation, it is clear that if we take into account the following estimate

\begin{eqnarray} 1-\tfrac {f’(y_n)}{f’(x_n)}=2\tfrac {f(y_n)}{f(x_n)}+\mathcal{O}\big(f^2(x_n)\big)=\mathcal{O}\big(f(x_n)\big)\label{eql41} \end{eqnarray}

and

\begin{eqnarray} 1-t=\mathcal{O}\big(f(x_{n+m})\big)=\mathcal{O}\big(f(z_n)\big)=\mathcal{O}\bigr(f^3(x_n)\big),\label{eql42} \end{eqnarray}

which follows from (3.26), then the equation (??) with \(\theta _n=\frac{f(z_n)}{f(y_n)}\) holds within the accuracy \(\mathcal{O}\big(f^4(x_n)\big)\).

If we wish to include the precise correction to (4.100), one can replace \(1-\displaystyle \tfrac {f’(y_n)}{f’(x_n)}\) by \(2\displaystyle \tfrac {f(y_n)}{f(x_n)}\), then we arrive at

\begin{eqnarray} \left(\theta _n-2\tfrac {f(y_n)}{f(x_n)}\right)t^2+\left(2\tfrac {f(y_n)}{f(x_n)}-1\right)t+1=0.\label{eql43} \end{eqnarray}

By virtue of (4.102) and (4.103), Eq. (4.104) approximates Eq. (4.100) with accuracy \(\mathcal{O}\big(f^5(x_n)\big)\).

With respect to the iteration (C), Eq. (??) remains true with \(\theta _n=\frac{f(z_n)}{f(y_n)}\).

Note that in most cases the value of the iteration parameter of the damped Newton’s method varies from zero to unit, whereas in our case the value of the optimal parameter may be greater than unit.

5 Numerical experiments

We consider the following four examples [ 18 , 2 , 12 , 8 ] .

Example 5.1

Let \(f(x)=\exp (x)-4x^2=0\). This equation has three roots. It is easy to show that

\begin{eqnarray} & & (\textbf{a})\quad f’(x){\gt}0,\quad f”(x){\gt}0\quad \textnormal{at}\quad x\in \bigl[4,\ \tfrac 92\big],\quad \textnormal{and}\quad x^*\in \big(4,\ \tfrac 92\big), \nonumber \\ & & (\textbf{b})\quad f’(x){\gt}0,\quad f”(x){\lt}0\quad \textnormal{at}\quad x\in \bigl[-\tfrac 12,\ 0\bigr],\quad \textnormal{and}\quad x^*\in \bigl(-\tfrac 12,\ 0\bigr). \nonumber \end{eqnarray}

We considered only first and third roots.

Method

\(k\)

\(|x^*-x_0|\)

\(|x^*-x_1|\)

\(|x^*-x_2|\)

\(|x^*-x_3|\)

\(d_{x_2}\)

\(d_{x_3}\)

\(\rho _{\tau _2}\)

\(\rho _{\tau _3}\)

 

1

1.93(-01)

3.87(-03)

4.00(-08)

4.45(-023)

2.93

 3.00

2.93

 3.00

(A)

2

1.93(-01)

3.48(-04)

3.80(-15)

5.40(-059)

3.99

 4.00

3.99

 4.00

 

3

1.93(-01)

1.68(-05)

8.74(-26)

3.31(-127)

5.00

 5.00

5.00

 5.00

 

1

1.93(-01)

1.43(-04)

5.70(-20)

5.78(-097)

4.92

 5.00

4.92

 5.00

(B)

2

1.93(-01)

1.46(-06)

4.15(-42)

6.35(-291)

6.94

 7.00

6.94

 7.00

 

3

1.93(-01)

9.66(-09)

4.56(-74)

5.31(-662)

8.94

 9.00

8.94

 9.00

 

1

1.93(-01)

1.24(-05)

1.47(-30)

4.13(-180)

5.95

 6.00

5.95

 6.00

(C)

2

1.93(-01)

1.26(-07)

8.02(-57)

2.14(-450)

7.95

 8.00

7.95

 8.00

 

3

1.93(-01)

8.38(-10)

4.41(-93)

7.23(-926)

9.96

10.00

9.96

10.00

Table 5. Example 1a. \(x^*=4.306584\ldots \)

Method

\(k\)

\(|x^*-x_0|\)

\(|x^*-x_1|\)

\(|x^*-x_2|\)

\(|x^*-x_3|\)

\(d_{x_2}\)

\(d_{x_3}\)

\(\rho _{\tau _2}\)

\(\rho _{\tau _3}\)

 

1

9.22(-02)

5.38(-04)

1.36(-010)

2.18(-0030)

2.95

 3.00

2.95

 3.00

(A)

2

9.22(-02)

1.56(-06)

1.56(-025)

1.55(-0101)

3.98

 4.00

3.98

 4.00

 

3

9.22(-02)

3.56(-08)

3.77(-040)

5.04(-0200)

4.99

 5.00

4.99

 5.00

 

1

9.22(-02)

6.10(-06)

1.29(-026)

5.39(-0130)

4.95

 5.00

4.95

 5.00

(B)

2

9.22(-02)

1.26(-09)

2.17(-064)

9.62(-0448)

6.96

 7.00

6.96

 7.00

 

3

9.22(-02)

2.14(-12)

9.57(-108)

6.74(-0966)

8.96

 9.00

8.96

 9.00

 

1

9.22(-02)

2.70(-07)

2.76(-040)

3.13(-0238)

5.96

 6.00

5.96

 6.00

(C)

2

9.22(-02)

5.57(-11)

1.87(-084)

2.96(-0672)

7.97

 8.00

7.97

 8.00

 

3

9.22(-02)

9.48(-14)

2.74(-133)

1.12(-1328)

9.97

10.00

9.97

10.00

Table 5. Example 1b. \(x^*=-0.4077767\ldots \)

Example 5.2

\(f(x)=x^2-2\cos (x)=0\). This equation has two roots. It is also easy to show that

\begin{eqnarray} & & f’(x){\gt}0,\ f”(x){\gt}0\ \textnormal{at}\ x\in \bigl[\tfrac {\pi }{6},\ \tfrac {\pi }{2}\bigr],\ \textnormal{and} x^*\in \big(\tfrac {\pi }{6},\ \tfrac {\pi }{2}\big), \nonumber \end{eqnarray}

We considered only first root, because \(f(x)\) is an even function with respect to \(x\).

Method

\(k\)

\(|x^*-x_0|\)

\(|x^*-x_1|\)

\(|x^*-x_2|\)

\(|x^*-x_3|\)

\(d_{x_2}\)

\(d_{x_3}\)

\(\rho _{\tau _2}\)

\(\rho _{\tau _3}\)

 

1

5.49(-01)

1.11(-02)

2.18(-07)

1.71(-021)

2.77

 3.00

2.77

 3.00

(A)

2

5.49(-01)

1.73(-03)

2.73(-13)

1.71(-052)

3.92

 4.00

3.92

 4.00

 

3

5.49(-01)

5.18(-05)

1.76(-24)

7.93(-122)

4.84

 5.00

4.84

 5.00

 

1

5.49(-01)

4.63(-04)

1.16(-18)

1.12(-091)

4.75

 5.00

4.75

 5.00

(B)

2

5.49(-01)

6.44(-06)

1.90(-39)

3.62(-274)

6.80

 7.00

6.80

 7.00

 

3

5.49(-01)

6.17(-08)

3.33(-69)

1.29(-620)

8.81

 9.00

8.81

 9.00

 

1

5.49(-01)

4.84(-05)

1.41(-28)

8.72(-170)

5.80

 6.00

5.80

 6.00

(C)

2

5.49(-01)

6.65(-07)

3.21(-53)

9.36(-424)

7.83

 8.00

7.83

 8.00

 

3

5.49(-01)

6.42(-09)

6.22(-87)

4.48(-867)

9.84

10.00

9.84

10.00

Table 5. Example 2. \(x^*=1.021689\ldots \)

Example 5.3

Let \(f(x)=(x-2)(x^{10}+x+1)\exp (-x-1)=0\). We chose the initial approximation \(x_0=2.1\) for \(x^*=2\).

Method

\(|x^*-x_1|\)

\(|x^*-x_2|\)

\(|x^*-x_3|\)

\(d_{x_3}\)

\(h(t)=1+\frac{4t}{2-5t},\beta =3 \) in [ 1 , (14) ]

1.83(-5)

3.15(-34)

2.45(-264)

7.99986

\(h(t)=\frac{1}{1-2t-t^2+t^3},\beta =3 \) in [ 1 , (14) ]

6.02(-6)

7.91(-38)

6.99(-293)

8.00007

\(\psi (t)=\frac{5-2t+t^2}{5-12t}\) in [ 12 , (12) ]

6.12(-5)

1.11(-29)

1.34(-224)

7.99947

\(\psi (t)=\frac{1}{1-2t-t^2}\) in [ 12 , (12) ]

6.01(-5)

9.29(-30)

3.02(-228 )

8.00050

(D), \(\alpha =0\)

2.18(-5)

1.12(-34)

5.40(-269)

7.99999

(D), \(\alpha =0.5\)

2.14(-5)

2.25(-34)

3.39(-266)

8.00003

(D), \(\alpha =1\)

2.89(-5)

2.45(-33)

6.63(-258)

7.99999

Table 5. Example 3. \(x^*=2\)

All numerical calculations were performed using Maple 16 system. Also, to study the convergence of iterations (3.6), (A), (B) (C) and (D), we compute the computational order of convergence \(d_{x_n}\) using the formulae [ 14 ]

\begin{eqnarray} d_{x_n}=\tfrac {\ln \bigl(|x_{n+1}-x^*|/|x_n-x^*|\bigr)}{\ln \bigl(|x_{n}-x^*|/|x_{n-1}-x^*|\bigr)},\label{eq44a} \end{eqnarray}

where \(x_{n+1}\), \(x_n\), \(x_{n-1}\) are three consecutive approximations. In numerical examples we also check out the computational order of convergence (COC) of \(\tau _n\) by formula [ 14 ]

\begin{eqnarray} \rho _{\tau _n}=\tfrac {\ln |(\tau _{n+1}-1)/(\tau _n-1)|}{\ln |(\tau _{n}-1)/(\tau _{n-1}-1)|},\label{eqn51} \end{eqnarray}

which is included in the presented tables (see Tables 5.5.) and agrees with the conjecture.

Comparisons of the convergence of the iterations (A), (B) and (C) are given in Tables 5.5.. The third, fourth, fifth and sixth columns show the absolute errors \(|x^*-x_n|\) in the first four iterations. The last four columns display the computational order of convergence \(d_{x_2}\), \(d_{x_3}\), \(\rho _{\tau _2}\) and \(\rho _{\tau _3}\), respectively. The factor \(l\) in the brackets denotes \(10^l\). As expected, the convergence of the proposed methods was remarkably fast. A comparison of the convergence of (D) iteration with other optimal order iterations with eighth order of convergence [ 1 , 12 ] is given in Table 5.. From the Tables we see that the COC perfectly coincides with the theoretical order.

Conclusions

We propose a new acceleration procedure for Newton-type methods. The effect of the acceleration is more perceptible when \(k\) increases. The proposed accelerating procedure allows us to derive high and optimal order iteration methods. Numerical results clear demonstrate the theoretical analysis (speed of convergence, and effect of acceleration). Moreover, our acceleration procedure can also be applied to any iteration and systems of nonlinear equations, to which a forthcoming paper will be devoted.

Acknowledgement

The authors would like to thank the anonymous referee for the valuable comments and suggestions which substantially improved the quality of this article. The work was supported partially by the Foundation of Science and Technology of Mongolia under grant SST\(\_ \)007/2015. O. Ch. acknowledges support within the Hulubei-Meshcheryakov programme JINR-Romania.

Bibliography

1

W. Bi, H. Ren, Q. Wu,Three-step iterative methods with eighth-order convergence for solving nonlinear equations, J. Comput. Appl. Math., 225 (2009), pp. 105–112. \includegraphics[scale=0.1]{ext-link.png}

2

A. Cordero, J.L. Hueso, E. Martinez, J. R. Torregrosa, New modifications of Potra-Pt\(\acute{a}\)k’s method with optimal fourth and eighth orders of convergence, J. Comput. Appl. Math., 234 (2010), pp. 2969–2976. \includegraphics[scale=0.1]{ext-link.png}

3

A. Cordero, J.R. Torregrosa, M.P. Vassileva, Three-step iterative methods with optimal eighth-order convergence, J. Comput. Appl. Math., 235 (2011), pp. 3189–3194. \includegraphics[scale=0.1]{ext-link.png}

4

J.A. Ezquerro, M.A. Hernandez, N. Romero, A.I. Velasco,Improving the domain of starting points for secant-like methods, Appl. Math. Comput., 219 (2012), pp. 3677–3692. \includegraphics[scale=0.1]{ext-link.png}

5

L. Fang, G. He, Some modifications of Newton’s method with higher-order convergence for solving nonlinear equations, J. Comput. Appl. Math., 228 (2009), pp. 296–303. \includegraphics[scale=0.1]{ext-link.png}

6

M.A. Hernandez, M.A. Salanova, Modification of the Kantorovich assumptions for semilocal convergence for the Chebyshev method, Comput. Appl. Math., 126 (2000), pp. 131–143. \includegraphics[scale=0.1]{ext-link.png}

7

L. Liu and X. Wang, Eighth-order methods with high efficiency index for solving nonlinear equations, Appl. Math. Comput., 215 (2010), pp. 3449–3454. \includegraphics[scale=0.1]{ext-link.png}

8

I. Pavaloiu, E. Catinas, Bilateral approximations for some Aitken-Steffensen-Hermite type methods of order three, Appl. Math. Comput., 217 (2011), pp. 5838–5846. \includegraphics[scale=0.1]{ext-link.png}

9

B.M. Podlevskii, On certain two-sided analogues of Newton’s method for solving nonlinear eigenvalue problems, Comput. Math. Math. Phys., 47 (2007), pp. 1745–1755. \includegraphics[scale=0.1]{ext-link.png}

10

H.I. Siyyam, M.T. Shatnawi, I.A. Al-Subaihi, A new one parameter family of iterative methods with eighth-order of convergence for solving nonlinear equations, Inter. J. Pure. Appl. Math., 84 (2013), pp. 451–461. \includegraphics[scale=0.1]{ext-link.png}

11

J.R. Sharma, H. Arora, On efficient weighted-Newton methods for sollving systems of nonlinear equations, Appl. Math. Comput., 222 (2013), pp. 497–506. \includegraphics[scale=0.1]{ext-link.png}

12

R. Thukral, M.S. Petkovic, A family of three-point methods of optimal order for solving nonlinear equations, J. Comput. Appl. Math., 233 (2010), pp. 2278–2284. \includegraphics[scale=0.1]{ext-link.png}

13

X. Wang, T. Zhang, A new family of Newton-type iterative methods with and without memory for solving nonlinear equations, Calcolo 51 (2014), pp. 1–15. \includegraphics[scale=0.1]{ext-link.png}

14

S. Weerakoon, T.G.I. Fernando, A variant of Newton’s method with accelerated third-order convergence, Appl. Math. Lett., 13 (8) (2000), pp. 87–93. \includegraphics[scale=0.1]{ext-link.png}

15

T. Zhanlav, Note on the cubic decreasing region of the Chebyshev method, J. Comput. Appl. Math., 235 (2010), pp. 341–344. \includegraphics[scale=0.1]{ext-link.png}

16

T. Zhanlav, O. Chuluunbaatar, Some iteration methods with high order convergence for nonlinear equations, Bulleten of PFUR, Series Mathematics.Information sciences. Physics, 4 (2009), pp. 47–55.

17

T. Zhanlav, O. Chuluunbaatar, Convergence of the continuous analogy of Newton’s method for solving nonlinear equations, Numerical methods and programming, Moscow State University, 10 (2009) pp. 402–407.

18

T. Zhanlav, O. Chuluunbaatar, V. Ulziibayar, Two-sided approximation for some Newton’s type methods, Appl. Math. Comput., 236 (2014), pp. 239–246. \includegraphics[scale=0.1]{ext-link.png}

19

T. Zhanlav, D. Khongorzul, On convergence behavior of combined iterative method for solving nonlinear equations, Comput. Math. Math. Phys., 52 (2012), pp. 790–800.

20

T. Zhanlav, I.V. Puzynin, The convergence of iteration based on a continuous analogue of Newton’s method, Comput. Math and Math Phys., 32 (1992), pp. 729–737.