Abstract

We study the approximation of an eigenpair (an eigenvalue and a corresponding eigenvector) of a linear operator $T:X\to X$, where $X$ is a Banach space.

The eigenpair is regarded as a solution of a nonlinear system obtained by appending a norming condition to the usual defining equation, to which the Chebyshev or the Newton method is then applied.

We take into account that the third Fréchet derivative of the augmented mapping is the null operator, and we give a semilocal convergence result which ensures r-order three (cubic r-order) of the iterates.

We consider next the finite dimensional case and analyse the computation of an eigenpair of a matrix. We also compare the efficiency of the Newton and Chebyshev methods in this case, and find that the latter is more efficient.

Authors

Emil Cătinaş
(Tiberiu Popoviciu Institute of Numerical Analysis, Romanian Academy)

Ion Păvăloiu
(Tiberiu Popoviciu Institute of Numerical Analysis, Romanian Academy)

Keywords

linear operator; eigenvalue; eigenvector; eigenpair; Newton method; Chebyshev method; semilocal convergence; r-convergence order.

Cite this paper as:

E. Cătinaş, I. Păvăloiu, On the Chebyshev method for approximating the eigenvalues of linear operators, Rev. Anal. Numér. Théor. Approx., 25 (1996) nos. 1-2, pp. 43-56.

PDF

Scanned paper: on the journal website.

Latex-pdf version of the paper. (soon)

About this paper

Print ISSN

2457-6794

Online ISSN

2501-059X

References

[1] P.M. Anselone, L.B. Rall, The solution of characteristic value-vector problems by Newton's method, Numer. Math. 11 (1968), pp. 38–45.

[2] P.G. Ciarlet, Introduction à l'analyse numérique matricielle et à l'optimisation, Masson, Paris, 1990.

[3] F. Chatelin, Valeurs propres de matrices, Masson, Paris, 1988.

[4] L. Collatz, Funktionalanalysis und numerische Mathematik, Springer-Verlag, Berlin, 1964.

[5] V.S. Kartyshov, F.L. Iuhno, O nekotorykh modifikatsiyakh metoda Nyutona dlya resheniya nelineinoi spektral'noi zadachi, Zh. Vychisl. Matem. i Matem. Fiz. 33 (1993) no. 9, pp. 1403–1409.

[6] I. Păvăloiu, Sur les procédés itératifs à un ordre élevé de convergence, Mathematica (Cluj) 12 (35) (1970) no. 2, pp. 309–324.

[7] J.F. Traub, Iterative Methods for the Solution of Equations, Prentice-Hall Inc., Englewood Cliffs, N. J., 1964.

Paper (preprint) in HTML form

On the Chebyshev method for approximating the eigenvalues of linear operators

Emil Cătinaş and Ion Păvăloiu
(Cluj-Napoca)

1. Introduction

The problem of approximating the eigenvalues of linear operators by the Newton method has been approached in a series of papers ([1], [3], [4], [5]). There is a special interest in using the Newton method here because the operator equation to be solved has a special form, as we shall see. In the following we study the convergence of the Chebyshev method attached to this problem, and we apply the results obtained to the approximation of an eigenvalue and of a corresponding eigenvector of a matrix of real or complex numbers. It is known that the convergence order of the Newton method is usually 2, while that of the Chebyshev method is usually 3. Taking also into account the number of operations performed at each step, we find that the Chebyshev method is more efficient than the Newton method.

Let $E$ be a Banach space over $\mathbb{K}$, where $\mathbb{K}=\mathbb{R}$ or $\mathbb{K}=\mathbb{C}$, and let $T:E\to E$ be a linear operator. It is well known that the scalar $\lambda$ is an eigenvalue of $T$ if the equation

(1) $Tx-\lambda x=\theta$

has at least one solution $\bar{x}\ne\theta$, where $\theta$ is the null element of the space $E$. The elements $x\ne\theta$ that satisfy equation (1) are called eigenvectors of the operator $T$, corresponding to the eigenvalue $\lambda$.

For the simultaneous determination of the eigenvalues and eigenvectors of $T$ we can proceed in the following way.

We attach to the equation (1) an equation of the form

(2) $Gx=1,$

where $G:E\to\mathbb{K}$ is a linear functional.

Consider the Banach space $F=E\times\mathbb{K}$, with the norm given by

(3) $\|u\|=\max\{\|x\|,|\lambda|\}$, for $u=(x,\lambda)\in F$, with $x\in E$ and $\lambda\in\mathbb{K}$.

In this space we consider the operator $f:F\to F$ given by

(4) $f(x,\lambda)=\left(Tx-\lambda x,\;Gx-1\right).$

If we denote by $\theta_1=(\theta,0)$ the null element of the space $F$, then the eigenvalues and the corresponding eigenvectors of the operator $T$ are solutions of the equation

(5) $f(u)=\theta_1.$

Obviously, f is not a linear operator.

It can be easily seen that the first order Fréchet derivative of f has the following form [4]:

(6) $f'(u_0)h=\left(Th_1-\lambda_0h_1-\lambda_1x_0,\;Gh_1\right),$

where $u_0=(x_0,\lambda_0)$ and $h=(h_1,\lambda_1)$. For the second order derivative of $f$ we obtain the expression

(7) $f''(u_0)hk=\left(-\lambda_2h_1-\lambda_1h_2,\;0\right),$

where $k=(h_2,\lambda_2)$.

The Fréchet derivatives of order higher than 2 of f are null.

Considering the above forms of the Fréchet derivatives of $f$, in the following we shall study the convergence of the Chebyshev method for operators whose third order Fréchet derivative is the null operator.

2. The convergence of Chebyshev method

The iterative Chebyshev method for solving equation (5) consists in the successive construction of the elements of the sequence $(u_n)_{n\ge0}$ given by

(8) $u_{n+1}=u_n-\Gamma_nf(u_n)-\tfrac{1}{2}\Gamma_nf''(u_n)\left(\Gamma_nf(u_n)\right)^2,\quad n=0,1,\ldots,\ u_0\in F,$

where $\Gamma_n=\left[f'(u_n)\right]^{-1}$.
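To fix ideas, here is a minimal scalar instance of iteration (8) (a sketch; the map $f(x)=x^2-2$ is only a hypothetical illustration, chosen quadratic so that $f'''=0$, as required by the theorem below):

```python
# One scalar instance of the Chebyshev iteration (8) for f(x) = x^2 - 2,
# a quadratic map (so f''' = 0); the iterates converge cubically to sqrt(2).
def chebyshev_step(x):
    f, df, d2f = x * x - 2.0, 2.0 * x, 2.0
    u = f / df                              # Gamma_n f(u_n)
    return x - u - 0.5 * d2f * u * u / df   # iteration (8)

x = 1.5
for n in range(4):
    x = chebyshev_step(x)
print(x)   # approx 1.414213562...
```

Already after two steps the error drops below $10^{-12}$, illustrating the cubic r-order established below.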

Let $u_0\in F$ and let $\delta>0$, $b>0$ be two real numbers. Denote $S=\{u\in F:\|u-u_0\|\le\delta\}$. If

$m_2=\sup_{u\in S}\|f''(u)\|,$

then

$\sup_{u\in S}\|f'(u)\|\le\|f'(u_0)\|+m_2\delta,$
$\sup_{u\in S}\|f(u)\|\le\|f(u_0)\|+\delta\|f'(u_0)\|+m_2\delta^2=:m_0.$

Consider the numbers

(9) $\mu=\tfrac{1}{2}m_2^2b^4\left(1+\tfrac{1}{4}m_2m_0b^2\right),\qquad \nu=b\left(1+\tfrac{1}{2}m_2m_0b^2\right).$

With the above notations, the following theorem holds:

Theorem 1.

If the operator $f$ is three times differentiable with $f'''(u)=\theta_3$ for all $u\in S$ ($\theta_3$ being the null trilinear operator) and if, moreover, there exist $u_0\in F$, $\delta>0$, $b>0$ such that the following relations hold:

i. the operator $f'(u)$ has a bounded inverse for all $u\in S$, and

$\left\|\left[f'(u)\right]^{-1}\right\|\le b;$

ii. the numbers $\mu$ and $\nu$ given by (9) satisfy the relations

$\rho_0=\mu\|f(u_0)\|<1$

and

$\frac{\nu\rho_0}{\mu(1-\rho_0)}\le\delta,$

then the following properties hold:

j. the sequence $(u_n)_{n\ge0}$ given by (8) is convergent;

jj. if $\bar{u}=\lim_{n\to\infty}u_n$, then $\bar{u}\in S$ and $f(\bar{u})=\theta_1$;

jjj. $\|u_{n+1}-u_n\|\le\dfrac{\nu\rho_0^{3^n}}{\mu}$, $n=0,1,\ldots$;

jv. $\|\bar{u}-u_n\|\le\dfrac{\nu\rho_0^{3^n}}{\mu\left(1-\rho_0^{3^n}\right)}$, $n=0,1,\ldots$

Proof.

Denote by $g:S\to F$ the following mapping:

(10) $g(u)=-\Gamma(u)f(u)-\tfrac{1}{2}\Gamma(u)f''(u)\left(\Gamma(u)f(u)\right)^2,$

where $\Gamma(u)=\left[f'(u)\right]^{-1}$.

It can be easily seen that for all $u\in S$ the following identity holds

$f(u)+f'(u)g(u)+\tfrac{1}{2}f''(u)g^2(u)=\tfrac{1}{2}f''(u)\left[\Gamma(u)f(u),\,\Gamma(u)f''(u)\left(\Gamma(u)f(u)\right)^2\right]+\tfrac{1}{8}f''(u)\left[\Gamma(u)f''(u)\left(\Gamma(u)f(u)\right)^2\right]^2,$

whence we obtain

(11) $\left\|f(u)+f'(u)g(u)+\tfrac{1}{2}f''(u)g^2(u)\right\|\le\tfrac{1}{2}m_2^2b^4\left(1+\tfrac{1}{4}m_0m_2b^2\right)\|f(u)\|^3,$

or

(12) $\left\|f(u)+f'(u)g(u)+\tfrac{1}{2}f''(u)g^2(u)\right\|\le\mu\|f(u)\|^3$, for all $u\in S$.

Similarly, by (10) and taking into account the notations we made, we get

(13) $\|g(u)\|\le\nu\|f(u)\|$, for all $u\in S$.

Using the hypotheses of the theorem, the inequality (12) and the fact that $f'''(u)=\theta_3$, we obtain the following inequality:

$\|f(u_1)\|\le\left\|f(u_1)-f(u_0)-f'(u_0)g(u_0)-\tfrac{1}{2}f''(u_0)g^2(u_0)\right\|+\left\|f(u_0)+f'(u_0)g(u_0)+\tfrac{1}{2}f''(u_0)g^2(u_0)\right\|\le\mu\|f(u_0)\|^3.$

Since $u_1-u_0=g(u_0)$, by (13) we have

$\|u_1-u_0\|\le\nu\|f(u_0)\|=\frac{\nu\rho_0}{\mu}<\frac{\nu\rho_0}{\mu(1-\rho_0)}\le\delta,$

whence it follows that $u_1\in S$.

Suppose now that the following properties hold:

  • a) $u_i\in S$, $i=\overline{0,k}$;

  • b) $\|f(u_i)\|\le\mu\|f(u_{i-1})\|^3$, $i=\overline{1,k}$.

Since $u_k\in S$, using (12) it follows that

(14) $\|f(u_{k+1})\|\le\mu\|f(u_k)\|^3,$

and, from the relation $u_{k+1}-u_k=g(u_k)$ and (13),

(15) $\|u_{k+1}-u_k\|\le\nu\|f(u_k)\|.$

The inequalities b) and (14) lead us to

(16) $\|f(u_i)\|\le\tfrac{1}{\mu}\left(\mu\|f(u_0)\|\right)^{3^i}=\tfrac{1}{\mu}\rho_0^{3^i},\quad i=\overline{1,k+1}.$

We also have $u_{k+1}\in S$:

(17) $\|u_{k+1}-u_0\|\le\sum_{i=1}^{k+1}\|u_i-u_{i-1}\|\le\nu\sum_{i=1}^{k+1}\|f(u_{i-1})\|\le\tfrac{\nu}{\mu}\sum_{i=1}^{k+1}\rho_0^{3^{i-1}}\le\tfrac{\nu\rho_0}{(1-\rho_0)\mu}\le\delta.$

Now we shall prove that the sequence $(u_n)_{n\ge0}$ is Cauchy. Indeed, for all $m,n\in\mathbb{N}$ we have

(18) $\|u_{n+m}-u_n\|\le\sum_{i=0}^{m-1}\|u_{n+i+1}-u_{n+i}\|\le\nu\sum_{i=0}^{m-1}\|f(u_{n+i})\|\le\tfrac{\nu}{\mu}\sum_{i=0}^{m-1}\rho_0^{3^{n+i}}=\tfrac{\nu}{\mu}\,\rho_0^{3^n}\sum_{i=0}^{m-1}\rho_0^{3^{n+i}-3^n}\le\tfrac{\nu\rho_0^{3^n}}{\mu\left(1-\rho_0^{3^n}\right)},$

whence, taking into account that $\rho_0<1$, it follows that $(u_n)_{n\ge0}$ converges. Let $\bar{u}=\lim_{n\to\infty}u_n$. Then, letting $m\to\infty$ in (18), the estimate jv. follows. The estimate jjj. follows from (15) and (16). ∎

3. The approximation of the eigenvalues and eigenvectors of matrices

In the following we shall apply the previously studied method to the approximation of eigenvalues and eigenvectors of matrices with real or complex elements.

Let $p\in\mathbb{N}$ and consider the matrix $A=(a_{ij})_{i,j=\overline{1,p}}$, where $a_{ij}\in\mathbb{K}$, $i,j=\overline{1,p}$.

Using the above notations, we shall consider $E=\mathbb{K}^p$ and $F=\mathbb{K}^p\times\mathbb{K}$. Any solution of the equation

(19) $f(x,\lambda):=\left(Ax-\lambda x,\;x_{i_0}-1\right)=(\theta,0),$

$i_0\in\{1,\ldots,p\}$ being fixed, where $x=(x_1,\ldots,x_p)\in\mathbb{K}^p$ and $\theta=(0,\ldots,0)\in\mathbb{K}^p$, will lead us to an eigenvalue of $A$ and to a corresponding eigenvector. For the sake of simplicity denote $\lambda=x_{p+1}$, so that equation (19) is represented by the system

(20) $f_i(x_1,\ldots,x_p,x_{p+1})=0,\quad i=\overline{1,p+1},$

where

(21) $f_i(x_1,\ldots,x_{p+1})=a_{i1}x_1+\cdots+(a_{ii}-x_{p+1})x_i+\cdots+a_{ip}x_p,\quad i=\overline{1,p},$

and

(22) $f_{p+1}(x)=x_{i_0}-1.$

Denote by $P$ the mapping $P:\mathbb{K}^{p+1}\to\mathbb{K}^{p+1}$ defined by relations (21) and (22).

Let $\bar{x}^n=(x_1^n,\ldots,x_{p+1}^n)\in\mathbb{K}^{p+1}$. Then the first order Fréchet derivative of the operator $P$ at $\bar{x}^n$ has the following form

(23) $P'(\bar{x}^n)\bar{h}=\begin{pmatrix}a_{11}-x_{p+1}^n&a_{12}&\cdots&a_{1i_0}&\cdots&a_{1p}&-x_1^n\\a_{21}&a_{22}-x_{p+1}^n&\cdots&a_{2i_0}&\cdots&a_{2p}&-x_2^n\\\vdots&&&&&&\vdots\\a_{p1}&a_{p2}&\cdots&a_{pi_0}&\cdots&a_{pp}-x_{p+1}^n&-x_p^n\\0&0&\cdots&1&\cdots&0&0\end{pmatrix}\begin{pmatrix}h_1\\h_2\\\vdots\\h_p\\h_{p+1}\end{pmatrix},$

where $\bar{h}=(h_1,\ldots,h_{p+1})\in\mathbb{K}^{p+1}$. If we write $\bar{k}=(k_1,\ldots,k_{p+1})\in\mathbb{K}^{p+1}$, then for the second order Fréchet derivative we get

(24) $P''(\bar{x}^n)\bar{k}\bar{h}=\begin{pmatrix}-k_{p+1}&0&\cdots&0&-k_1\\0&-k_{p+1}&\cdots&0&-k_2\\\vdots&&&&\vdots\\0&0&\cdots&-k_{p+1}&-k_p\\0&0&\cdots&0&0\end{pmatrix}\begin{pmatrix}h_1\\h_2\\\vdots\\h_p\\h_{p+1}\end{pmatrix}.$

Denote by $\Gamma(\bar{x}^n)$ the inverse of the matrix attached to the operator $P'(\bar{x}^n)$ and $\bar{u}^n=\Gamma(\bar{x}^n)P(\bar{x}^n)=(u_1^n,u_2^n,\ldots,u_{p+1}^n)$. Let $\bar{v}^n=P''(\bar{x}^n)\left(\Gamma(\bar{x}^n)P(\bar{x}^n)\right)^2=P''(\bar{x}^n)(\bar{u}^n)^2$. We obtain the following representation

(25) $\bar{v}^n=P''(\bar{x}^n)(\bar{u}^n)^2=\begin{pmatrix}-u_{p+1}^n&0&\cdots&0&-u_1^n\\0&-u_{p+1}^n&\cdots&0&-u_2^n\\\vdots&&&&\vdots\\0&0&\cdots&-u_{p+1}^n&-u_p^n\\0&0&\cdots&0&0\end{pmatrix}\begin{pmatrix}u_1^n\\u_2^n\\\vdots\\u_p^n\\u_{p+1}^n\end{pmatrix}.$

From this relation we can easily deduce the equalities

(26) $v_i^n=-2u_{p+1}^nu_i^n,\ i=\overline{1,p};\qquad v_{p+1}^n=0.$
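The shortcut (26) can be checked against the full bilinear form (24); a small sketch with hypothetical data:

```python
# Check (26): for k = h = u, the vector v = P''(u, u) has components
# -2 * u_{p+1} * u_i for i <= p and 0 in the last position, where by (24)
# (P'' k h)_i = -k_{p+1} h_i - k_i h_{p+1}.
p = 3
u = [0.5, -1.25, 2.0, 0.75]             # hypothetical u^n in K^{p+1}

def bilinear(k, h):                      # P''(x) k h, constant in x
    return [-k[p] * h[i] - k[i] * h[p] for i in range(p)] + [0.0]

v = bilinear(u, u)
assert all(abs(a - b) < 1e-15
           for a, b in zip(v, [-2.0 * u[p] * u[i] for i in range(p)] + [0.0]))
```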

Denoting $\bar{w}^n=(w_1^n,w_2^n,\ldots,w_{p+1}^n)=\Gamma(\bar{x}^n)\bar{v}^n$ and supposing that $\bar{x}^n$ is an approximation of the solution of system (20), the next approximation $\bar{x}^{n+1}$ given by method (8) is obtained by

(27) $\bar{x}^{n+1}=\bar{x}^n-\bar{u}^n-\tfrac{1}{2}\bar{w}^n,\quad n=0,1,\ldots$

Consider $\mathbb{K}^p$ with the norm of an element $x=(x_1,\ldots,x_p)$ given by the equality

(28) $\|x\|=\max_{1\le i\le p}|x_i|,$

and consequently

(29) $\|A\|=\max_{1\le i\le p}\sum_{j=1}^{p}|a_{ij}|.$

It can be easily seen that $\|P''(\bar{x}^n)\|=2$ for all $\bar{x}^n\in\mathbb{K}^{p+1}$. Let $\bar{x}^0\in\mathbb{K}^{p+1}$ be an initial approximation of the solution of system (20). Consider a real number $r>0$ and the set $\bar{S}=\{x\in\mathbb{K}^{p+1}:\|x-\bar{x}^0\|\le r\}$. Denote

$\bar{m}_0=\|P(\bar{x}^0)\|+r\|P'(\bar{x}^0)\|+2r^2,$
$\bar{\mu}=2\bar{b}^4\left(1+\tfrac{1}{2}\bar{m}_0\bar{b}^2\right),$
$\bar{\nu}=\bar{b}\left(1+\bar{m}_0\bar{b}^2\right),$

where $\bar{b}=\sup_{x\in\bar{S}}\|\Gamma(x)\|$, $\Gamma(x)$ being the inverse of the matrix attached to the operator $P'(x)$.

Taking into account the results obtained in Section 2, the following consequence holds:

Corollary 2.

If $\bar{x}^0\in\mathbb{K}^{p+1}$ and $r>0$ are chosen such that the matrix attached to the operator $P'(x)$ is nonsingular for all $x\in\bar{S}$, and the following inequalities hold:

$\bar{\rho}_0=\bar{\mu}\|P(\bar{x}^0)\|<1$
$\frac{\bar{\nu}\bar{\rho}_0}{\bar{\mu}(1-\bar{\rho}_0)}\le r,$

then the following properties are true

  • j1. the sequence $(\bar{x}^n)_{n\ge0}$ generated by (27) is convergent;

  • jj1. if $\bar{x}=\lim_{n\to\infty}\bar{x}^n$, then $P(\bar{x})=\theta_1=(0,\ldots,0)\in\mathbb{K}^{p+1}$;

  • jjj1. $\|\bar{x}^{n+1}-\bar{x}^n\|\le\dfrac{\bar{\nu}\bar{\rho}_0^{3^n}}{\bar{\mu}}$, $n=0,1,\ldots$;

  • jv1. $\|\bar{x}-\bar{x}^n\|\le\dfrac{\bar{\nu}\bar{\rho}_0^{3^n}}{\bar{\mu}\left(1-\bar{\rho}_0^{3^n}\right)}$, $n=0,1,\ldots$

Remark.

If the radius $r$ of the ball $\bar{S}$ is given, $P'(\bar{x}^0)^{-1}$ exists and $2r\|P'(\bar{x}^0)^{-1}\|<1$, then

$\|P'(x)^{-1}\|\le\frac{\|P'(\bar{x}^0)^{-1}\|}{1-2r\|P'(\bar{x}^0)^{-1}\|}$

for all $x\in\bar{S}$, and in the above Corollary, taking into account the proof of Theorem 1, we can take

$\bar{b}=\frac{\|P'(\bar{x}^0)^{-1}\|}{1-2r\|P'(\bar{x}^0)^{-1}\|}.$

4. A comparison to the Newton method

Note that if in (27) we neglect the term $\tfrac{1}{2}\bar{w}^n$, then we get the Newton method:

$\bar{x}^{n+1}=\bar{x}^n-\bar{u}^n=\bar{x}^n-\Gamma(\bar{x}^n)P(\bar{x}^n),\quad n=0,1,\ldots$

In order to compare the Newton method and the Chebyshev method, we shall consider that the linear systems which appear in both methods are solved by the Gauss method.

While the Newton method requires at each step the solution of one linear system $Ax=b$, the Chebyshev method requires the solution of two linear systems $Ax=b$, $Ay=c$ with the same matrix $A$, the vector $c$ depending on the solution $x$ of the first system. We shall therefore adapt the Gauss method so as to perform as few multiplications and divisions as possible. When comparing the two methods we shall neglect the number of additions and subtractions.

Solving a given linear system $Ax=b$, $A\in M_m(\mathbb{K})$, $b,x\in\mathbb{K}^m$ (where we have denoted $m=p+1$), by the Gauss method consists of two stages. In the first stage, the given system is transformed into an equivalent one whose coefficient matrix is upper triangular. In the second stage the unknowns $(x_i)_{i=\overline{1,m}}$ are determined by backward substitution.

The first stage. There are performed $m-1$ steps; at the $k$-th step the elements of the $k$-th column lying below the main diagonal are annihilated.

We write the initial system in the form

$\begin{pmatrix}a_{11}^1&\cdots&a_{1m}^1\\\vdots&&\vdots\\a_{m1}^1&\cdots&a_{mm}^1\end{pmatrix}\begin{pmatrix}x_1\\\vdots\\x_m\end{pmatrix}=\begin{pmatrix}b_1^1\\\vdots\\b_m^1\end{pmatrix}.$

Suppose that $a_{11}^1\ne0$, $a_{11}^1$ being called the first pivot. The first line of the system is multiplied by $\alpha=-a_{21}^1/a_{11}^1$ and added to the second one, which becomes $0,a_{22}^2,a_{23}^2,\ldots,a_{2m}^2,b_2^2$, after performing $m+1$ multiplication or division (M/D) operations. After $m-1$ such transformations the system becomes

$\begin{pmatrix}a_{11}^1&a_{12}^1&\cdots&a_{1m}^1\\0&a_{22}^2&\cdots&a_{2m}^2\\\vdots&\vdots&&\vdots\\0&a_{m2}^2&\cdots&a_{mm}^2\end{pmatrix}\begin{pmatrix}x_1\\x_2\\\vdots\\x_m\end{pmatrix}=\begin{pmatrix}b_1^1\\b_2^2\\\vdots\\b_m^2\end{pmatrix}.$

Hence at the first step there were performed $(m-1)(m+1)$ M/D operations.

In the same manner, at the $k$-th step we have the system

$\begin{pmatrix}a_{11}^1&a_{12}^1&\cdots&a_{1k}^1&a_{1,k+1}^1&\cdots&a_{1m}^1\\0&a_{22}^2&\cdots&a_{2k}^2&a_{2,k+1}^2&\cdots&a_{2m}^2\\\vdots&&&&&&\vdots\\0&0&\cdots&a_{kk}^k&a_{k,k+1}^k&\cdots&a_{km}^k\\0&0&\cdots&a_{k+1,k}^k&a_{k+1,k+1}^k&\cdots&a_{k+1,m}^k\\\vdots&&&&&&\vdots\\0&0&\cdots&a_{mk}^k&a_{m,k+1}^k&\cdots&a_{mm}^k\end{pmatrix}\begin{pmatrix}x_1\\x_2\\\vdots\\x_k\\x_{k+1}\\\vdots\\x_m\end{pmatrix}=\begin{pmatrix}b_1^1\\b_2^2\\\vdots\\b_k^k\\b_{k+1}^k\\\vdots\\b_m^k\end{pmatrix}.$

Supposing the $k$-th pivot $a_{kk}^k\ne0$ and performing $(m-k)(m-k+2)$ M/D operations we get

$\begin{pmatrix}a_{11}^1&a_{12}^1&\cdots&a_{1k}^1&a_{1,k+1}^1&\cdots&a_{1m}^1\\0&a_{22}^2&\cdots&a_{2k}^2&a_{2,k+1}^2&\cdots&a_{2m}^2\\\vdots&&&&&&\vdots\\0&0&\cdots&a_{kk}^k&a_{k,k+1}^k&\cdots&a_{km}^k\\0&0&\cdots&0&a_{k+1,k+1}^{k+1}&\cdots&a_{k+1,m}^{k+1}\\\vdots&&&&&&\vdots\\0&0&\cdots&0&a_{m,k+1}^{k+1}&\cdots&a_{mm}^{k+1}\end{pmatrix}\begin{pmatrix}x_1\\x_2\\\vdots\\x_k\\x_{k+1}\\\vdots\\x_m\end{pmatrix}=\begin{pmatrix}b_1^1\\b_2^2\\\vdots\\b_k^k\\b_{k+1}^{k+1}\\\vdots\\b_m^{k+1}\end{pmatrix}.$

At each step $k$, the elements below the $k$-th pivot vanish, so they are no longer needed in the solving of the system.

The corresponding memory locations are used to store the multipliers

$-\frac{a_{k+1,k}^k}{a_{kk}^k},\ \ldots,\ -\frac{a_{mk}^k}{a_{kk}^k},$

which, of course, will be needed only for solving another system $Ay=c$, with $c$ depending on $x$, the solution of $Ax=b$.

At the first stage there are performed

$(m-1)(m+1)+(m-2)m+\cdots+1\cdot3=\frac{2m^3+3m^2-5m}{6}$ M/D operations.

The second stage. Given the system

$\begin{pmatrix}a_{11}^1&a_{12}^1&\cdots&a_{1m}^1\\0&a_{22}^2&\cdots&a_{2m}^2\\\vdots&&\ddots&\vdots\\0&0&\cdots&a_{mm}^m\end{pmatrix}\begin{pmatrix}x_1\\x_2\\\vdots\\x_m\end{pmatrix}=\begin{pmatrix}b_1^1\\b_2^2\\\vdots\\b_m^m\end{pmatrix},$

the solution $x$ is computed in the following way:

$x_m=b_m^m/a_{mm}^m,$
$x_k=\left(b_k^k-\left(a_{k,k+1}^kx_{k+1}+\cdots+a_{km}^kx_m\right)\right)/a_{kk}^k,\quad k=m-1,\ldots,2,$
$x_1=\left(b_1^1-\left(a_{12}^1x_2+\cdots+a_{1m}^1x_m\right)\right)/a_{11}^1.$

At this stage there are performed $1+2+\cdots+m=\frac{m(m+1)}{2}$ M/D operations. In both stages, there are performed altogether

$\frac{m^3}{3}+m^2-\frac{m}{3}$ M/D operations.
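As a quick check, the two stages can be instrumented and the tally compared with the closed form above (a sketch):

```python
# Tally the multiplications/divisions (M/D) of the Gauss method as counted
# in the text: step k of the first stage costs (m-k)(m-k+2) operations in
# total, and the back substitution costs 1 + 2 + ... + m.
def gauss_md_count(m):
    ops = 0
    for k in range(1, m):              # first stage, steps k = 1..m-1
        for _row in range(m - k):      # rows below the k-th pivot
            ops += 1                   # the multiplier a_ik / a_kk
            ops += (m - k) + 1         # update of a_ij (j > k) and of b_i
    for i in range(1, m + 1):          # second stage: computing each x_i
        ops += (i - 1) + 1             # i-1 products and one division
    return ops

# closed form of the text: m^3/3 + m^2 - m/3
assert all(gauss_md_count(m) == (m**3 + 3 * m**2 - m) // 3
           for m in range(2, 40))
```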

In the case when we solve the systems $Ax=b$, $Ay=c$, where the vector $c$ depends on the solution $x$, we first apply the Gauss method to the system $Ax=b$, and at the first stage we keep below the main diagonal the multipliers by which the pivot lines were multiplied.

Then we apply to the vector c the transformations performed to the vector b when solving Ax=b.

Denote $c=(c_i^1)_{i=\overline{1,m}}$.

At the first step

$c_2^2:=a_{21}c_1^1+c_2^1,\ \ldots,\ c_m^2:=a_{m1}c_1^1+c_m^1.$

At the $k$-th step

$c_{k+1}^{k+1}:=a_{k+1,k}c_k^k+c_{k+1}^k,\ \ldots,\ c_m^{k+1}:=a_{mk}c_k^k+c_m^k.$

At the $(m-1)$-th step

$c_m^m:=a_{m,m-1}c_{m-1}^{m-1}+c_m^{m-1}.$

There were performed $(m-1)+(m-2)+\cdots+1=\frac{m(m-1)}{2}$ M/D operations.

Now the second stage of the Gauss method is applied to

$\begin{pmatrix}a_{11}^1&\cdots&a_{1m}^1\\&\ddots&\vdots\\0&&a_{mm}^m\end{pmatrix}\begin{pmatrix}y_1\\\vdots\\y_m\end{pmatrix}=\begin{pmatrix}c_1^1\\\vdots\\c_m^m\end{pmatrix}.$

In addition to the case of a single linear system, $\frac{m(m-1)}{2}$ more M/D operations were performed in this case, getting

$\frac{m^3}{3}+\frac{3}{2}m^2-\frac{5}{6}m$ M/D operations,

and taking into account (26) we add $m-1$ more M/D operations, obtaining

$\frac{m^3}{3}+\frac{3}{2}m^2+\frac{m}{6}-1$ M/D operations.

Remark. At the first stage, if for some $k$ we have $a_{kk}^k=0$, then an element $a_{i_0k}^k\ne0$, $i_0\in\{k+1,\ldots,m\}$, must be found, and the lines $i_0$ and $k$ in $A$ and $b$ must be swapped.

In order to avoid the accumulation of errors, a partial or total pivoting strategy is recommended even if $a_{kk}^k\ne0$ for $k=\overline{1,m-1}$.

For partial pivoting, the pivot is chosen such that $|a_{i_0k}^k|=\max_{i=\overline{k,m}}|a_{ik}^k|$.

The interchange of lines can be avoided by using a permutation vector $p=(p_i)_{i=\overline{1,m}}$, which is first initialized by $p_i=i$. The elements of $A$ and $b$ are then referred to as $a_{ij}:=a_{p(i),j}$ and $b_i:=b_{p(i)}$, and swapping the lines $k$ and $i_0$ amounts to swapping the $k$-th and $i_0$-th elements of $p$.

For the Chebyshev method, the use of the vector $p$ cannot be avoided by actually interchanging the lines, because we must keep track of the permutations made, in order to apply them in the same order to the vector $c$.

Moreover, we need two extra vectors $t$ and $q$. In $t$ we store the transpositions applied to the lines of $Ax=b$, which are then successively applied to $q$. At the first stage of the Gauss method, when the $k$-th pivot is $a_{i_0k}^k$ and $i_0\ne k$, the $k$-th and $i_0$-th elements of $p$ are swapped, and we assign $t_k:=i_0$ to indicate that at the $k$-th step the transposition $(k,i_0)$ was applied to $p$.

After computing the solution of $Ax=b$, we initialize the vector $c$ by (25), the permutation vector $q$ by $q_i:=i$, $i=\overline{1,m}$, and then we successively apply the transforms operated on $b$, taking into account the eventual transpositions.

The algorithm for solving the second linear system in the Chebyshev method is as follows.

for k := 1 to m-1 do
begin
  if t[k] <> k then          { at the k-th step the transposition }
  begin                      { (k, t[k]) has been applied to p }
    auxi := q[k];
    q[k] := q[t[k]];
    q[t[k]] := auxi;
  end;
  for i := k+1 to m do
    c[q[i]] := c[q[i]] + A[q[i], k] * c[q[k]]
end;

for i := m downto 1 do       { the solution y is now computed }
begin
  sum := 0;
  for j := i+1 to m do
    sum := sum + A[p[i], j] * y[j];    { now p = q }
  y[i] := (c[p[i]] - sum) / A[p[i], i];
end.
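A Python rendering of the same two-system scheme may be easier to follow (a sketch, not the authors' code; by indexing every access through the permutation vector p, the rows are never physically swapped, so the auxiliary vectors t and q of the Pascal fragment become unnecessary):

```python
# Factor A once, storing below the diagonal the multipliers
# alpha_ik = -a_ik/a_kk, then apply the recorded transforms first to b
# and later to c (which depends on the solution x of the first system).

def factor(A):
    """Gauss elimination with partial pivoting; A is overwritten."""
    m = len(A)
    p = list(range(m))                        # permutation vector
    for k in range(m - 1):
        i0 = max(range(k, m), key=lambda i: abs(A[p[i]][k]))
        p[k], p[i0] = p[i0], p[k]             # partial pivoting
        for i in range(k + 1, m):
            alpha = -A[p[i]][k] / A[p[k]][k]
            A[p[i]][k] = alpha                # keep the multiplier
            for j in range(k + 1, m):
                A[p[i]][j] += alpha * A[p[k]][j]
    return p

def transform(A, p, c):
    """Apply the stored elimination transforms to a right-hand side c."""
    m = len(A)
    for k in range(m - 1):
        for i in range(k + 1, m):
            c[p[i]] += A[p[i]][k] * c[p[k]]

def back_subst(A, p, c):
    """Second stage: back substitution on the triangularized system."""
    m = len(A)
    y = [0.0] * m
    for i in range(m - 1, -1, -1):
        s = sum(A[p[i]][j] * y[j] for j in range(i + 1, m))
        y[i] = (c[p[i]] - s) / A[p[i]][i]
    return y

# hypothetical 3x3 example: solve Ax = b, then Ay = c with c built from x
A = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
A0 = [row[:] for row in A]                    # copy for residual checks
b = [4.0, 10.0, 24.0]
p = factor(A)
transform(A, p, b)
x = back_subst(A, p, b)                       # x = (1, 1, 1)
c0 = [2.0 * xi for xi in x]                   # some c depending on x
c = c0[:]
transform(A, p, c)
y = back_subst(A, p, c)
```

The design choice here (indirect row addressing throughout) trades the line swaps of the Pascal version for one extra level of indexing, while performing the same M/D operations.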

We adopt as the efficiency measure of an iterative method $M$ the number

$E(M)=\frac{\ln q}{s},$

where $q$ is the convergence order and $s$ is the number of M/D operations needed at each step.

We obtain

$E(N)=\frac{3\ln 2}{m^3+3m^2-m}$

for the Newton method and

$E(C)=\frac{6\ln 3}{2m^3+9m^2+m-6}$

for the Chebyshev method.

It can be easily seen that $E(C)>E(N)$ for $m\ge2$, i.e. the Chebyshev method is more efficient than the Newton method.
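The comparison can also be verified numerically (a sketch using the two formulas above):

```python
from math import log

# Efficiency indices of Section 4: E = (ln q)/s, with q the convergence
# order and s the number of M/D operations per iteration step.
def E_newton(m):                      # q = 2, s = (m^3 + 3m^2 - m)/3
    return 3 * log(2) / (m**3 + 3 * m**2 - m)

def E_chebyshev(m):                   # q = 3, s = (2m^3 + 9m^2 + m - 6)/6
    return 6 * log(3) / (2 * m**3 + 9 * m**2 + m - 6)

assert all(E_chebyshev(m) > E_newton(m) for m in range(2, 200))
```

For large m the ratio E(C)/E(N) tends to ln 3/ln 2 ≈ 1.585, so the advantage of the Chebyshev method persists for all system sizes.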

5. Numerical example

Consider the real matrix

$A=\begin{pmatrix}1&1&1&1\\1&1&-1&-1\\1&-1&1&-1\\1&-1&-1&1\end{pmatrix}$

which has the following eigenvalues and eigenvectors:

$\lambda_{1,2,3}=2,\quad x^1=(1,1,0,0),\ x^2=(1,0,1,0),\ x^3=(1,0,0,1),\quad\text{and}\quad \lambda_4=-2,\ x^4=(1,-1,-1,-1).$

Taking the initial value $\bar{x}^0=(1,-1.5,-2,-1.5,-1)$ and applying the two methods, we obtain the following results:

Newton method

n x1 x2 x3 x4 x5=λ
0 1.0 -1.500 000 000 0 -2.000 000 000 0 -1.500 000 000 0 -1.000 000 000 0
1 1.0 -0.900 000 000 0 -0.800 000 000 00 -0.900 000 000 00 -1.600 000 000 0
2 1.0 -1.012 500 000 0 -1.025 000 000 0 -1.012 500 000 0 -2.050 000 000 0
3 1.0 -1.000 152 439 0 -1.000 304 878 0 -1.000 152 439 0 -2.000 609 756 1
4 1.0 -1.000 000 023 2 -1.000 000 046 5 -1.000 000 023 2 -2.000 000 092 9
5 1.0 -1.000 000 000 0 -1.000 000 000 0 -1.000 000 000 0 -2.000 000 000 0

Chebyshev method

n x1 x2 x3 x4 x5=λ
0 1.0 -1.500 000 000 0 -2.000 000 000 0 -1.500 000 000 0 -1.000 000 000 0
1 1.0 -0.972 000 000 00 -0.944 000 000 00 -0.972 000 000 00 -1.888 000 000 0
2 1.0 -0.999 950 001 89 -0.999 900 003 77 -0.999 950 001 89 -1.999 800 007 5
3 1.0 -1.000 000 000 0 -1.000 000 000 0 -1.000 000 000 0 -2.000 000 000 0
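The runs above can be reproduced with a short script (a sketch in Python; `solve` is a plain Gauss solver with partial pivoting, not the operation-count-optimized scheme of Section 4):

```python
# Reproduce the Chebyshev run: iteration (27) applied to
# P(x, lambda) = (Ax - lambda x, x_1 - 1) for the 4 x 4 matrix above.

def solve(M, rhs):
    """Gauss elimination with partial pivoting on a copy of M, rhs."""
    n, M, rhs = len(rhs), [row[:] for row in M], rhs[:]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv], rhs[k], rhs[piv] = M[piv], M[k], rhs[piv], rhs[k]
        for i in range(k + 1, n):
            a = M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] -= a * M[k][j]
            rhs[i] -= a * rhs[k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (rhs[i] - s) / M[i][i]
    return z

A = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, 1, -1], [1, -1, -1, 1]]
p = 4

def step(z):                          # one Chebyshev iterate (27)
    x, lam = z[:p], z[p]
    P = [sum(A[i][j] * x[j] for j in range(p)) - lam * x[i]
         for i in range(p)] + [x[0] - 1.0]
    J = [[A[i][j] - (lam if i == j else 0.0) for j in range(p)] + [-x[i]]
         for i in range(p)] + [[1.0, 0.0, 0.0, 0.0, 0.0]]
    u = solve(J, P)
    v = [-2.0 * u[p] * u[i] for i in range(p)] + [0.0]    # relation (26)
    w = solve(J, v)
    return [z[i] - u[i] - 0.5 * w[i] for i in range(p + 1)]

z = [1.0, -1.5, -2.0, -1.5, -1.0]     # initial value of the example
for n in range(4):
    z = step(z)
print(z)   # close to (1, -1, -1, -1) and lambda = -2, as in the table
```

One step of this script from the initial value yields (1, -0.972, -0.944, -0.972, -1.888), matching the first row of the Chebyshev table; dropping the `0.5 * w[i]` term gives the Newton run.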


Received March 7, 1996.

Tiberiu Popoviciu Institute of Numerical Analysis
Romanian Academy
P.O. Box 68, Cluj-Napoca 1
RO-3400 Romania

