On approximating the inverse of a matrix


Abstract

In this note we deal with two problems: the first concerns the efficiency of approximating the inverse of a matrix by Schulz-type methods, and the second is the evaluation of the errors in approximating the inverses of perturbed matrices.

Author

Ion Păvăloiu
(Tiberiu Popoviciu Institute of Numerical Analysis)

Keywords

matrix inverse; iterative methods; Schulz method; perturbations; error evaluation.

Cite this paper as:

I. Păvăloiu, On approximating the inverse of a matrix, Creative Math., 12 (2003), pp. 15-20.

About this paper

Journal

Creative Mathematics

Publisher Name

Universitatea Tehnica din Cluj-Napoca

Article on the journal website

https://www.creative-mathematics.cunbm.utcluj.ro/article/on-approximating-the-inverse-of-matrix/

Print ISSN

1584-286X

Online ISSN

1843-441X

References

[1] J. Herzberger, Explizite Schulz-Verfahren höherer Ordnung zur Approximation der inversen Matrix, Z. Angew. Math. Mech., 68 (1988) no. 5, pp. 494-496.

[2] A.M. Ostrowski, Solution of Equations in Euclidean and Banach Spaces, Academic Press, New York and London (1975).

[3] E. Stickel, On a class of high order methods for inverting matrices, Z. Angew. Math. Mech., 67 (1987) no. 7, pp. 334-336.

Paper (preprint) in HTML form

1. Introduction

In this note we deal with two problems: the first concerns the efficiency of approximating the inverse of a matrix by Schulz-type methods, and the second is the evaluation of the errors in approximating the inverses of perturbed matrices.

As is well known, given a nonsingular matrix $A \in R^{m \times m}$ and a matrix $D_{0} \in R^{m \times m}$ such that

\begin{matrix} (1) & ‖ I - A D_{0} ‖ \leq q < 1 \end{matrix}

with $q \in R$ and $I$ the unit matrix of order $m$, for fixed $k \in N$, $k \geq 2$, the sequence of matrices ${(D_{n})}_{n \geq 0}$ given by

\begin{matrix} (2) & \begin{cases} F_{n} = I - A D_{n} \\ D_{n + 1} = D_{n} (I + F_{n} + F_{n}^{2} + \dots + F_{n}^{k - 1}), \quad n = 0, 1, \dots \end{cases} \end{matrix}

is convergent and $\lim_{n \to \infty} D_{n} = A^{- 1}$. Moreover, ${(F_{n})}_{n \geq 0}$ satisfies

\begin{matrix} (3) & F_{n + 1} = F_{n}^{k}, n = 0, 1, \dots \end{matrix}
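Relation (3) can be checked directly from (2). The short derivation below is a sketch that uses only $A D_{n} = I - F_{n}$ and the telescoping product $(I - F)(I + F + \dots + F^{k - 1}) = I - F^{k}$:

```latex
\begin{aligned}
F_{n+1} &= I - A D_{n+1} \\
        &= I - A D_{n} \, (I + F_{n} + \dots + F_{n}^{k-1}) \\
        &= I - (I - F_{n})(I + F_{n} + \dots + F_{n}^{k-1}) \\
        &= I - (I - F_{n}^{k}) = F_{n}^{k}.
\end{aligned}
```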

The methods of type (2) are generalizations of the well-known Schulz method. Relation (3) shows that the convergence order of the sequence ${(D_{n})}_{n \geq 0}$ is $k$, $k \geq 2$.
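As an illustration, iteration (2) might be sketched as follows in plain Python. The helper names (`mat_mul`, `identity`, `schulz_type_inverse`), the fixed number of steps and the starting matrix $D_{0} = A^{T} / 30$ are assumptions made for this example only; they do not come from the paper. For the chosen $A$, this $D_{0}$ gives $‖ I - A D_{0} ‖ = 28/30 < 1$ in the row-sum norm, so condition (1) holds.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    m = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(m)) for j in range(m)]
            for i in range(m)]


def identity(m):
    """Unit matrix of order m."""
    return [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]


def schulz_type_inverse(A, D0, k=3, steps=10):
    """Run `steps` iterations of method (2) of order k, starting from D0."""
    m = len(A)
    I = identity(m)
    D = [row[:] for row in D0]
    for _ in range(steps):
        AD = mat_mul(A, D)
        # F_n = I - A D_n
        F = [[I[i][j] - AD[i][j] for j in range(m)] for i in range(m)]
        # S = I + F_n + ... + F_n^(k-1), accumulated power by power
        S = identity(m)
        P = identity(m)
        for _ in range(k - 1):
            P = mat_mul(P, F)
            S = [[S[i][j] + P[i][j] for j in range(m)] for i in range(m)]
        D = mat_mul(D, S)  # D_{n+1} = D_n S
    return D


# Hypothetical example: D0 = A^T / 30 satisfies condition (1) for this A.
A = [[4.0, 1.0], [2.0, 3.0]]
D0 = [[4 / 30, 2 / 30], [1 / 30, 3 / 30]]
D = schulz_type_inverse(A, D0, k=3, steps=8)
```

Here `D` approximates the exact inverse of this `A`, whose entries are 0.3, -0.1, -0.2 and 0.4. The sum is built naively in this sketch; the Horner-type scheme (4) discussed next reduces the number of products.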

We introduce the notion of efficiency index of method (2). We notice that at each iteration step the number of matrix sums required equals the number of matrix products appearing in (2). Moreover, for computing the sum $I + F_{n} + \dots + F_{n}^{k - 1}$ we may use a method similar to the Horner scheme, i.e.

\begin{matrix} (4) & I + F_{n} + F_{n}^{2} + \dots + F_{n}^{k - 1} = (\dots ((F_{n} + I) F_{n} + I) F_{n} + \dots + I) F_{n} + I \end{matrix}

In this way the matrix sums required reduce to sums in which one term is the identity matrix. This remark is also valid for the term $F_{n} = I - A D_{n}$. We call a computing unit the operation consisting of one matrix product and one matrix sum (regardless of their order).
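A minimal sketch of the Horner-type evaluation (4), which also counts the matrix products it uses, could look as follows. The function names and the diagonal test matrix are hypothetical and chosen only so that the result is easy to check by hand.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    m = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(m)) for j in range(m)]
            for i in range(m)]


def horner_sum(F, k):
    """Return (I + F + ... + F^(k-1), number of matrix products), for k >= 2."""
    m = len(F)

    def plus_identity(X):
        # Adding I only touches the diagonal entries.
        return [[X[i][j] + (1.0 if i == j else 0.0) for j in range(m)]
                for i in range(m)]

    S = plus_identity(F)  # F + I, no product needed
    products = 0
    for _ in range(k - 2):
        S = plus_identity(mat_mul(S, F))  # one product followed by one sum
        products += 1
    return S, products


# Hypothetical example with a diagonal F, so each entry is a scalar geometric sum.
F = [[0.5, 0.0], [0.0, 0.25]]
S, products = horner_sum(F, 4)
```

For `k = 4` the loop runs twice, which agrees with the count of $k - 2$ products for the sum in (4).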

Definition 1. The efficiency index of method (2) is given by

\begin{matrix} (5) & E_{k} = k^{1 / s}, \end{matrix}

where $s \in N$ represents the number of computing units required at each iteration step of method (2).

This definition is given by analogy to the efficiency index introduced by A.M. Ostrowski in [2]. The definition may also be motivated by the following reasoning.

From (3) it follows

\begin{matrix} (6) & ‖ F_{n + 1} ‖ \leq {‖ F_{n} ‖}^{k}, n = 0, 1, \dots \end{matrix}

whence

\begin{matrix} (7) & ‖ F_{n + 1} ‖ \leq {‖ F_{0} ‖}^{k^{n + 1}}, n = 0, 1, \dots \end{matrix}

The above inequalities lead to the following error bounds:

\begin{matrix} (8) & ‖ A^{- 1} - D_{n} ‖ \leq ‖ A^{- 1} ‖ {‖ F_{0} ‖}^{k^{n}}, n = 0, 1, \dots \end{matrix}

Consider now two methods of type (2), having the convergence orders $k_{1}$ and $k_{2}$ respectively. Assume that, for achieving the same precision, these methods require $n_{1}$ and $n_{2}$ iteration steps, respectively. Then (8) implies

\begin{matrix} (9) & k_{1}^{n_{1}} = k_{2}^{n_{2}} \end{matrix}

The total number of computing units is $n_{1} s_{1}$ in the first case and $n_{2} s_{2}$ in the second case.

It is clear now that the method with convergence order $k_{1}$ is more efficient than the other if

\begin{matrix} (10) & n_{1} s_{1} < n_{2} s_{2} \end{matrix}

Relations (9) and (10) lead us to

\begin{matrix} (11) & k_{1}^{1 / s_{1}} > k_{2}^{1 / s_{2}} \end{matrix}
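The passage from (9) and (10) to (11) can be spelled out by taking logarithms. The following is only a worked sketch of that step, using the fact that $\ln k_{1} > 0$ and $\ln k_{2} > 0$ for $k_{1}, k_{2} \geq 2$:

```latex
\begin{aligned}
k_{1}^{n_{1}} = k_{2}^{n_{2}}
  &\;\Longrightarrow\; n_{1} = n_{2} \, \frac{\ln k_{2}}{\ln k_{1}}, \\
n_{1} s_{1} < n_{2} s_{2}
  &\;\Longleftrightarrow\; s_{1} \, \frac{\ln k_{2}}{\ln k_{1}} < s_{2}
  \;\Longleftrightarrow\; \frac{\ln k_{1}}{s_{1}} > \frac{\ln k_{2}}{s_{2}}
  \;\Longleftrightarrow\; k_{1}^{1/s_{1}} > k_{2}^{1/s_{2}}.
\end{aligned}
```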

Taking into account Definition 1, it follows that among the methods of type (2) for different values of $k$, the most efficient is the one with the highest efficiency index.

In the following section we determine the optimal method, i.e., the one having the highest efficiency index, when $k \in N$, $k \geq 2$.

2. Optimal efficiency index

Assume that we use (4) at each iteration step in (2). It can be easily seen that the sum in (4) requires $k - 2$ matrix products. Relation (2) shows that 2 more matrix products are required at each iteration step, so in total we need $k$ matrix products, that is, $s = k$ computing units.

Taking into account (5), it follows that the efficiency index of method (2) is given by

\begin{matrix} (12) & {\bar{E}}_{k} = k^{1 / k} \end{matrix}

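Considering the function $f : (0, + \infty) \to R$, $f (x) = x^{\frac{1}{x}}$, it can be easily seen that this function attains its maximum value at $x = e$. Since $f$ is increasing on $(0, e)$ and decreasing on $(e, + \infty)$, and $2 < e < 3$, the largest value of ${\bar{E}}_{k}$ for $k \in N$, $k \geq 2$, is either ${\bar{E}}_{2}$ or ${\bar{E}}_{3}$. As $3^{2} = 9 > 8 = 2^{3}$ gives $3^{1 / 3} > 2^{1 / 2}$, it follows that ${\bar{E}}_{k}$ is the largest for $k = 3$.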
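The comparison of the indices (12) is easy to reproduce numerically. The snippet below is a small illustrative check, not part of the original argument; the function name `efficiency_index` and the range of orders examined are arbitrary choices.

```python
def efficiency_index(k):
    """Efficiency index (12) of the method of order k: k**(1/k)."""
    return k ** (1.0 / k)


# Tabulate the index for a few orders and pick the largest one.
indices = {k: efficiency_index(k) for k in range(2, 9)}
best_k = max(indices, key=indices.get)

for k, value in indices.items():
    print(k, round(value, 6))
```

The index is about 1.4142 for $k = 2$ and about 1.4422 for $k = 3$, and it decreases for larger $k$.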

We have proved the following result.
Theorem 2. Among the methods (2) for $k \in N$, $k \geq 2$, the method with the highest efficiency index is given by:

\begin{matrix} (13) & \begin{cases} F_{n} = I - A D_{n} \\ D_{n + 1} = D_{n} (I + F_{n} + F_{n}^{2}), \quad n = 0, 1, \dots \end{cases} \end{matrix}

with $D_{0}$ verifying $‖ I - A D_{0} ‖ \leq q < 1$.
By (4), the above method may be written as

\begin{matrix} (14) & \begin{cases} F_{n} = I - A D_{n} \\ D_{n + 1} = D_{n} [(F_{n} + I) F_{n} + I], \quad n = 0, 1, \dots \end{cases} \end{matrix}

In this case, (7) becomes

\begin{matrix} (15) & ‖ F_{n + 1} ‖ \leq {‖ F_{0} ‖}^{3^{n + 1}}, n = 0, 1, \dots \end{matrix}

and for the error bounds one has

\begin{matrix} (16) & ‖ A^{- 1} - D_{n} ‖ \leq ‖ A^{- 1} ‖ ‖ F_{n} ‖ \leq ‖ A^{- 1} ‖ {‖ F_{0} ‖}^{3^{n}}, n = 0, 1, \dots . \end{matrix}

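Under (1), the relation $A D_{0} = I - F_{0}$ with $‖ F_{0} ‖ < 1$ gives $A^{- 1} = D_{0} {(I - F_{0})}^{- 1}$ and $‖ {(I - F_{0})}^{- 1} ‖ \leq 1 / (1 - ‖ F_{0} ‖)$, so one has the inequality

\begin{matrix} (17) & ‖ A^{- 1} ‖ \leq \frac{‖ D_{0} ‖}{1 - ‖ F_{0} ‖} \end{matrix}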

whence

\begin{matrix} (18) & ‖ A^{- 1} - D_{n} ‖ \leq ‖ D_{0} ‖ \frac{{‖ F_{0} ‖}^{3^{n}}}{1 - ‖ F_{0} ‖}, n = 0, 1, \dots \end{matrix}

Analogously, for any method of type (2) one may deduce the evaluation

\begin{matrix} (19) & ‖ A^{- 1} - D_{n} ‖ \leq ‖ D_{0} ‖ \frac{{‖ F_{0} ‖}^{k^{n}}}{1 - ‖ F_{0} ‖}, n = 0, 1, \dots \end{matrix}
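Bound (19) depends only on $‖ D_{0} ‖$, $‖ F_{0} ‖$, $k$ and $n$, so it can be evaluated before iterating. The sketch below does this and also searches for the first step at which the bound drops below a tolerance. The function names and the sample norms (0.2 and 0.5) are hypothetical values chosen for illustration.

```python
def a_priori_bound(norm_D0, norm_F0, k, n):
    """Right-hand side of (19): norm_D0 * norm_F0**(k**n) / (1 - norm_F0)."""
    if not 0.0 <= norm_F0 < 1.0:
        raise ValueError("condition (1) requires 0 <= norm_F0 < 1")
    return norm_D0 * norm_F0 ** (k ** n) / (1.0 - norm_F0)


def steps_needed(norm_D0, norm_F0, k, tol):
    """Smallest n for which the bound (19) is at most tol (tol > 0)."""
    if tol <= 0.0:
        raise ValueError("tol must be positive")
    n = 0
    while a_priori_bound(norm_D0, norm_F0, k, n) > tol:
        n += 1
    return n


# Hypothetical norms: compare orders 2 and 3 for the same tolerance.
n_order2 = steps_needed(0.2, 0.5, 2, 1e-6)
n_order3 = steps_needed(0.2, 0.5, 3, 1e-6)
```

With these sample values the bound guarantees the tolerance after 5 steps for $k = 2$ and after 3 steps for $k = 3$. Counting $k$ computing units per step, this amounts to 10 units against 9, in line with Theorem 2.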

3. Error bounds in case of perturbed matrices

In practice, the elements of the matrix $A$ are usually obtained as results of certain experiments, measurements, approximations, etc. Therefore their values are affected by errors. Consequently we replace $A$ by the approximation $\tilde{A}$. For a rigorous interpretation of the results, it is necessary to know an error bound $ε > 0$ for which

\begin{matrix} (20) & ‖ A - \tilde{A} ‖ \leq ε \end{matrix}

Instead of the sequence ${(D_{n})}_{n \geq 0}$ we consider ${({\tilde{D}}_{n})}_{n \geq 0}$, generated by

\begin{matrix} (21) & \begin{cases} {\tilde{F}}_{n} = I - \tilde{A} {\tilde{D}}_{n} \\ {\tilde{D}}_{n + 1} = {\tilde{D}}_{n} (I + {\tilde{F}}_{n} + {\tilde{F}}_{n}^{2} + \dots + {\tilde{F}}_{n}^{k - 1}), \quad n = 0, 1, \dots \end{cases} \end{matrix}

We assume that the matrices $\tilde{A}$ and ${\tilde{D}}_{0}$ above obey

\begin{matrix} (22) & ‖ I - \tilde{A} {\tilde{D}}_{0} ‖ \leq \bar{q} < 1 \end{matrix}

It follows that $\tilde{A}$ is invertible and, by (19), we get

\begin{matrix} (23) & ‖ {\tilde{A}}^{- 1} - {\tilde{D}}_{n} ‖ \leq ‖ {\tilde{D}}_{0} ‖ \frac{{‖ {\tilde{F}}_{0} ‖}^{k^{n}}}{1 - ‖ {\tilde{F}}_{0} ‖}, n = 0, 1, \dots \end{matrix}

We are interested in conditions which ensure that $A$ is nonsingular. We consider the identity

I - {\tilde{A}}^{- 1} A = {\tilde{A}}^{- 1} (\tilde{A} - A)

which, by (20), implies

‖ I - {\tilde{A}}^{- 1} A ‖ \leq ‖ {\tilde{A}}^{- 1} ‖ ε

whence, by (17) applied to $\tilde{A}$ and ${\tilde{D}}_{0}$, we get

\begin{matrix} (24) & ‖ I - {\tilde{A}}^{- 1} A ‖ \leq \frac{ε ‖ {\tilde{D}}_{0} ‖}{1 - ‖ {\tilde{F}}_{0} ‖} \end{matrix}

This relation shows that for the existence of the inverse of ${\tilde{A}}^{- 1} A$ it suffices that

\begin{matrix} (25) & r = \frac{ε ‖ {\tilde{D}}_{0} ‖}{1 - ‖ {\tilde{F}}_{0} ‖} < 1 \end{matrix}

whence for $ε$ we get the condition

\begin{matrix} (26) & ε < \frac{1 - ‖ {\tilde{F}}_{0} ‖}{‖ {\tilde{D}}_{0} ‖} \end{matrix}

Further,

A^{- 1} = {({\tilde{A}}^{- 1} A)}^{- 1} {\tilde{A}}^{- 1}

whence

‖ A^{- 1} ‖ \leq ‖ {\tilde{A}}^{- 1} ‖ ‖ {({\tilde{A}}^{- 1} A)}^{- 1} ‖ \leq \frac{‖ {\tilde{D}}_{0} ‖}{1 - ‖ {\tilde{F}}_{0} ‖ - ε ‖ {\tilde{D}}_{0} ‖}

where (26) ensures that $1 - ‖ {\tilde{F}}_{0} ‖ - ε ‖ {\tilde{D}}_{0} ‖ > 0$.

From the identity $A^{- 1} - {\tilde{A}}^{- 1} = A^{- 1} (\tilde{A} - A) {\tilde{A}}^{- 1}$, together with (20) and the bounds on $‖ A^{- 1} ‖$ and $‖ {\tilde{A}}^{- 1} ‖$ above, the following inequality can be easily proved

\begin{matrix} (27) & ‖ A^{- 1} - {\tilde{A}}^{- 1} ‖ \leq \frac{{‖ {\tilde{D}}_{0} ‖}^{2} ε}{(1 - ‖ {\tilde{F}}_{0} ‖) (1 - ‖ {\tilde{F}}_{0} ‖ - ε ‖ {\tilde{D}}_{0} ‖)} \end{matrix}

which, together with (23), leads to

\begin{matrix} (28) & ‖ A^{- 1} - {\tilde{D}}_{n} ‖ \leq \frac{‖ {\tilde{D}}_{0} ‖}{1 - ‖ {\tilde{F}}_{0} ‖} [\frac{‖ {\tilde{D}}_{0} ‖ ε}{1 - ‖ {\tilde{F}}_{0} ‖ - ε ‖ {\tilde{D}}_{0} ‖} + {‖ {\tilde{F}}_{0} ‖}^{k^{n}}], n = 0, 1, \dots \end{matrix}

This inequality provides an a priori evaluation of the error. If we want to stop the iterations at a certain step $\bar{n}$ such that $‖ {\tilde{F}}_{\bar{n}} ‖ \leq ε_{1}$, with $ε_{1} > 0$ given, then by (7) and (17) it follows

‖ {\tilde{A}}^{- 1} - {\tilde{D}}_{\bar{n}} ‖ \leq \frac{‖ {\tilde{D}}_{0} ‖}{1 - ‖ {\tilde{F}}_{0} ‖} ε_{1}

which, together with (27), leads to

‖ A^{- 1} - {\tilde{D}}_{\bar{n}} ‖ \leq \frac{‖ {\tilde{D}}_{0} ‖}{1 - ‖ {\tilde{F}}_{0} ‖} [ε_{1} + \frac{ε ‖ {\tilde{D}}_{0} ‖}{1 - ‖ {\tilde{F}}_{0} ‖ - ε ‖ {\tilde{D}}_{0} ‖}]

which is an a posteriori error bound.
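Both the a priori bound (28) and the a posteriori bound above are plain arithmetic in the quantities $‖ {\tilde{D}}_{0} ‖$, $‖ {\tilde{F}}_{0} ‖$, $ε$, $ε_{1}$, $k$ and $n$. The following sketch evaluates them and rejects inputs that violate condition (26). All function names and numerical values are hypothetical and serve only as an illustration.

```python
def perturbation_term(norm_D0, norm_F0, eps):
    """eps*norm_D0 / (1 - norm_F0 - eps*norm_D0); needs condition (26)."""
    denom = 1.0 - norm_F0 - eps * norm_D0
    if denom <= 0.0:
        raise ValueError("condition (26) is violated")
    return eps * norm_D0 / denom


def a_priori_perturbed(norm_D0, norm_F0, eps, k, n):
    """Right-hand side of (28) after n steps of a method of order k."""
    factor = norm_D0 / (1.0 - norm_F0)
    return factor * (perturbation_term(norm_D0, norm_F0, eps)
                     + norm_F0 ** (k ** n))


def a_posteriori_perturbed(norm_D0, norm_F0, eps, eps1):
    """A posteriori bound once the current residual norm is at most eps1."""
    factor = norm_D0 / (1.0 - norm_F0)
    return factor * (eps1 + perturbation_term(norm_D0, norm_F0, eps))


# Hypothetical values for the norms of the perturbed D_0 and F_0 and for eps.
prior = a_priori_perturbed(0.5, 0.5, 0.25, k=2, n=1)
posterior = a_posteriori_perturbed(0.5, 0.5, 0.25, eps1=0.125)
```

In both bounds the perturbation term does not depend on $n$, so further iterations cannot push the guaranteed error below the level set by $ε$.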

"T. Popoviciu" Institute of Numerical Analysis
Str. Fântânele, nr.57, Bloc B7, sc.II, etaj 5, ap. 67-68
Cluj-Napoca, Romania
E-mail address: pavaloiu@ictp.acad.ro

Received: 28.02.2003; In revised form: 30.10.2003
Key words and phrases. Inverse of a matrix, perturbed matrix, efficiency in approximating the inverse of a matrix.
