SUPPLEMENTARY DIRECTIONAL RELAXATIONS FOR THE ACCELERATION OF KACZMARZ’S PROJECTION METHOD

Abstract. Starting from an extension of Kaczmarz's method, obtained by us in a previous paper, we introduce new directions for the projections. We prove that these supplementary projections do not modify the set of limit points of the original extended Kaczmarz algorithm. For the class of boundary value problems and integral equations of the first kind we describe a method for constructing such new directions, based on coarser levels of discretization of the initial problem. Some numerical experiments are also presented.

Many important "real-world" problems give rise, after a suitable discretization, to linear least-squares formulations: find x* ∈ R^n such that

(1) ‖Ax* − b‖ = min { ‖Ax − b‖ : x ∈ R^n },

where A is an m × n real matrix and b ∈ R^m a given vector (by ‖·‖ we denote the Euclidean norm and by ⟨·, ·⟩ the corresponding scalar product on some space R^q). In many cases the matrix A is large, sparse (sometimes without a regular positioning of its nonzero entries) and ill-conditioned. Thus, trying to solve (1) by "direct" solvers based on Singular Value or QR decompositions (see e.g. [4]) is not indicated; the "iterative" algorithms are more appropriate. In this paper we are concerned with a special class of iterative solvers for (1): successive projection algorithms. S. Kaczmarz described such a method for the first time in [6], so all the further developments and generalizations are usually called "Kaczmarz-like" algorithms (see [1], [2] for an overview in this sense). In [8] (see also [10]) we proposed an extension of the classical Kaczmarz algorithm that covers the general inconsistent case of (1). It will be called the Kaczmarz Extended projection method (KE, for short) and is briefly described in what follows. We denote by B^t, R(B), N(B) the transpose, range and null space of a matrix B, respectively, by P_S the orthogonal projection onto a closed convex set S ⊂ R^q, and by a_i ∈ R^n, α_j ∈ R^m the i-th row and j-th column of A, for which we suppose that

(2) a_i ≠ 0, i = 1, . . ., m,  α_j ≠ 0, j = 1, . . ., n.

Moreover, all the vectors appearing in the paper are considered as column vectors. Let

(3) f_i(b; x) = x − (⟨x, a_i⟩ − b_i) / ‖a_i‖² · a_i,  F(b; x) = (f_1 ◦ f_2 ◦ · · · ◦ f_m)(b; x),

(4) ϕ_j(y) = y − ⟨y, α_j⟩ / ‖α_j‖² · α_j,  Φ(y) = (ϕ_1 ◦ ϕ_2 ◦ · · · ◦ ϕ_n)(y),

for i = 1, . . ., m, j = 1, . . ., n. The algorithm KE is then the following: let y⁰ = b and x⁰ ∈ R^n; for k = 0, 1, 2, . . . do

(5) y^{k+1} = Φ(y^k),  b^{k+1} = b − y^{k+1},  x^{k+1} = F(b^{k+1}; x^k).

The following convergence result was proved in [10].
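In outline, one KE sweep first projects y^k successively onto the hyperplanes orthogonal to the columns of A, then corrects the right-hand side, then performs a classical Kaczmarz sweep over the rows. A minimal sketch in Python (the naming is ours; unrelaxed projections and x⁰ = 0 are assumed):

```python
import numpy as np

def kaczmarz_extended(A, b, iters=1000):
    """One possible sketch of the KE iteration: a column sweep on y,
    a right-hand side correction, then a row sweep on x."""
    m, n = A.shape
    x = np.zeros(n)
    y = b.astype(float).copy()
    col_norms = np.sum(A * A, axis=0)   # ||alpha_j||^2
    row_norms = np.sum(A * A, axis=1)   # ||a_i||^2
    for _ in range(iters):
        # step 1: project y onto the hyperplanes <y, alpha_j> = 0
        for j in range(n):
            y -= (A[:, j] @ y) / col_norms[j] * A[:, j]
        # step 2: correct the right-hand side
        bk = b - y
        # step 3: classical Kaczmarz sweep on <a_i, x> = (bk)_i
        for i in range(m):
            x -= (A[i] @ x - bk[i]) / row_norms[i] * A[i]
    return x
```

Since y^k tends to the component of b in N(A^t), the corrected right-hand side tends to the projection of b onto R(A), so the row sweeps act on an (asymptotically) consistent system even when (1) is inconsistent.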
Theorem 2. For any matrix A satisfying (2) and any vector b ∈ R^m the following are true: (i) the sequence (x^k)_{k≥0} generated by the algorithm KE converges and

(6) lim_{k→∞} x^k = P_{N(A)}(x⁰) + Gb,

where the n × m matrix G is a generalized inverse of A; (ii) we have the equality

(7) { lim_{k→∞} x^k : x⁰ ∈ R^n } = LSS(A; b),

where LSS(A; b) is the set of all solutions of (1).
Remark 1. Although some relaxation parameters can be introduced in (5) (see e.g. [9]), the algorithm KE is not fast enough to become an efficient iterative solver. For this reason, in the next section of the paper we emphasize some possible ways to accelerate it, focusing our attention on a method based on adding supplementary directions for projection in the first and third steps of (5).

EXTENDED KACZMARZ WITH SUPPLEMENTARY PROJECTIONS
A widely used method to accelerate convergence for square nonsingular systems of the form Ax* = b is to (formally) transform the system by means of some invertible matrices Q and P as follows:

(8) (QAP)y* = Qb,  x* = Py*.

In the paper [11] we extended this technique to problems of the form (1), by considering the transformed least-squares formulation: find y* ∈ R^n such that

(9) ‖(QAP)y* − Qb‖ = min { ‖(QAP)y − Qb‖ : y ∈ R^n }.

But, although some choices of Q and P exist for which (9) and (1) are equivalent in some sense, these matrices are forced to satisfy some orthogonality assumptions which restrict the applicability of the method. Another possibility for improving the behavior of algorithm (5) is to use, instead of the classical orthogonal projections f_i and ϕ_j from (3) and (4), oblique or generalized oblique ones, defined with respect to the "energy" seminorms ‖·‖_{G⁻¹} and ‖·‖_{S⁻¹}, where G and S are symmetric positive semidefinite matrices and G⁻¹, S⁻¹ some generalized inverses of them. These techniques are used essentially in the field of image processing, and they usually have a "regularization" effect rather than a convergence speed-up (see in this sense [2] and the references therein).
In the present paper we consider another possibility to improve the convergence properties of the algorithm KE. It is based on the introduction, in steps 1 and 3 of (5), of supplementary directions for projection. If these directions are properly chosen, we can obtain an improvement of the (generalized) condition number of A, which generates a better behavior of KE. But there is a very important point that must be clarified before starting to analyze this technique. Indeed, adding new directions for projection in (5) is (formally) equivalent to adding new rows and/or columns to the matrix A, and such transformations would seem to modify the initial problem (1). The surprising result which we prove in the rest of this section is that this does not happen. More precisely, we shall prove that the new algorithm generates sequences of approximations converging to the solutions of the initial problem (1). To simplify the notations and the proofs, we consider the case of only one direction introduced in steps 1 and 3 of (5); the extension to an arbitrary number of directions is straightforward. Thus, let a₀ ∈ R^n, α₀ ∈ R^m and b₀ ∈ R be a new row, a new column and a new component of the right-hand side b, respectively, defined as linear combinations of the old ones, i.e.

(11) a₀ = Σ_{i=1}^m λ_i a_i,  α₀ = Σ_{j=1}^n γ_j α_j,  b₀ = Σ_{i=1}^m λ_i b_i,

with given λ = (λ₁, . . ., λ_m)^t ∈ R^m and γ = (γ₁, . . ., γ_n)^t ∈ R^n. We define the new projections f₀(b; ·) and ϕ₀ (see (3) and (4)) by

(12) f₀(b; x) = x − (⟨x, a₀⟩ − b₀) / ‖a₀‖² · a₀,  ϕ₀(y) = y − ⟨y, α₀⟩ / ‖α₀‖² · α₀,

and the new applications F₀(b; ·) and Φ₀ as the compositions F(b; ·) and Φ with the supplementary projections f₀(b; ·) and ϕ₀ inserted after f₁ and ϕ₁, respectively.

Remark 2. The fact that f₀ and ϕ₀ are introduced after the projections f₁ and ϕ₁, respectively, is not restrictive, because the set of solutions of (1) and the set of limit points of the algorithm KE are both invariant under (left-right) orthogonal transformations of A.
According to the above definitions and notations, the new algorithm of the form (5), with supplementary directions a 0 and α 0 for projection is the following.
Algorithm 3 (KE₀). Let y⁰ = b and x⁰ ∈ R^n; for k = 0, 1, 2, . . . do

(14) y^{k+1} = Φ₀(y^k),  b^{k+1} = b − y^{k+1},  x^{k+1} = F₀(b^{k+1}; x^k).

Let now T be the (m+1) × m matrix with the rows (in this order) λ^t, ε₁^t, . . ., ε_m^t, where λ = (λ₁, . . ., λ_m)^t ∈ R^m is the vector from (11) and {ε₁, . . ., ε_m} is the canonical basis of R^m. Similarly, let S be the (n+1) × n matrix with the rows (in this order) γ^t, τ₁^t, . . ., τ_n^t, where γ = (γ₁, . . ., γ_n)^t ∈ R^n (see again (11)) and {τ₁, . . ., τ_n} is the canonical basis of R^n. Then we define the matrices Ā, Â and the vector b̄ ∈ R^{m+1} by

(15) Ā = TA,  Â = AS^t,  b̄ = Tb.

Remark 3. The matrices Ā, Â and the vector b̄ are of the form

Ā = (a₀, a₁, . . ., a_m)^t,  Â = (α₀, α₁, . . ., α_n),  b̄ = (b₀, b₁, . . ., b_m)^t.

Let F̄(b̄; ·), Φ̄ be the applications defined as in (3) and (4), but with respect to the matrices Ā and Â instead of A. From (11)-(13) we then have (cf. Remark 2)

(16) F₀(b; x) = F̄(Tb; x),  (17) Φ₀ = Φ̄.

Moreover, the algorithm KE₀ can be written in the following form, similar to (14):

(18) y^{k+1} = Φ̄(y^k),  b^{k+1} = b − y^{k+1},  x^{k+1} = F̄(Tb^{k+1}; x^k).

Let also Q̄, R̄, Ḡ and Φ̄ be the matrices appearing in the convergence analysis of the algorithm KE from [10], but constructed with respect to Ā and Â instead of A.
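The augmentation above can be assembled directly. A sketch of the construction (the names lam and gam stand for the coefficient vectors λ and γ of (11); the new row, column and component are placed first, following the definitions of T and S):

```python
import numpy as np

def augment(A, b, lam, gam):
    """Build A_bar ((m+1) x n), A_hat (m x (n+1)) and b_bar in R^{m+1}
    from one supplementary direction: a_0 = A^t lam is a linear
    combination of the rows of A, alpha_0 = A gam a combination of its
    columns, and b_0 = <lam, b>."""
    a0 = A.T @ lam                          # new row, as a vector in R^n
    alpha0 = A @ gam                        # new column in R^m
    b0 = lam @ b                            # new right-hand side component
    A_bar = np.vstack([a0, A])              # rows: a_0^t, a_1^t, ..., a_m^t
    A_hat = np.column_stack([alpha0, A])    # columns: alpha_0, alpha_1, ...
    b_bar = np.concatenate([[b0], b])
    return A_bar, A_hat, b_bar
```

For a consistent system Ax = b, every solution also satisfies Āx = b̄, since the appended equation is a linear combination of the original ones.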
Lemma 4. The following equalities hold:

(19) N(Â^t) = N(A^t),  lim_{k→∞} y^k = P_{N(A^t)}(b).

Proof. The first equality follows from (15) and the definition of the matrix S. For the second one, we first note that (see [13]) Φ̄ = Φ ⊕ P_{N(Â^t)} = Φ ⊕ P_{N(A^t)}. Then, using the construction of the sequence (y^k)_{k≥0} in (18), we get the second equality in (19).

Now we are able to prove a convergence result, similar to Theorem 2, for the new algorithm KE₀ (see (14) or (18)).
Theorem 5. For any matrix A satisfying (2), any vector b ∈ R^m and any new elements a₀, α₀, b₀ as in (11), the following are true: (i) the sequence (x^k)_{k≥0} generated by the algorithm KE₀ converges and

(20) lim_{k→∞} x^k = P_{N(A)}(x⁰) + Ḡb̄;

(ii) we have the equality

(21) LSS₀(A; b) = LSS(A; b),

where LSS₀(A; b) denotes the set of all limit points of the algorithm KE₀ (see (20)).

Proof. (i) For simplifying the presentation, we use the notation b_A for the vector P_{R(A)}(b). Then, proceeding as in [10], using similar arguments as in the proof of Theorem 3.1 from [10] and observing that N(Ā) = N(A), we get (20). (ii) For a consistent problem of the form (1) we denote by S(A; b) the set of its (classical) solutions. According to the characterization (7) (see also [10]) we know that x_LS = Gb_A, where x_LS is the unique minimal norm solution of (1). Combining this with (20) and the definition of LSS₀(A; b), we obtain (21), and the proof is complete.
Let us now suppose that the problem (1) is consistent, that A has nonzero rows, and that a₀, b₀ are as in (11). We then consider the classical Kaczmarz algorithm with supplementary projections: let x⁰ ∈ R^n; for k = 0, 1, 2, . . . do

(27) x^{k+1} = F̄(b̄; x^k),

with b̄ and F̄(b̄; ·) from (16) and (17). Then, the following consequence of the above Theorem 5 holds.
Corollary 6. Under the above hypotheses, the sequence (x^k)_{k≥0} generated by the algorithm (27) converges and

(28) lim_{k→∞} x^k = P_{N(A)}(x⁰) + Gb ∈ S(A; b).

Moreover, we have the equality

(29) { lim_{k→∞} x^k : x⁰ ∈ R^n } = S(A; b).
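For reference, one sweep of the classical Kaczmarz iteration used in (27) successively projects the current iterate onto the hyperplanes ⟨a_i, x⟩ = b_i. A minimal sketch (our naming; x⁰ = 0 is assumed):

```python
import numpy as np

def kaczmarz(A, b, sweeps=500):
    """Classical Kaczmarz: cyclic orthogonal projections onto the
    hyperplanes <a_i, x> = b_i; converges to a solution when the
    system is consistent and all rows are nonzero."""
    m, n = A.shape
    x = np.zeros(n)
    row_norms = np.sum(A * A, axis=1)      # ||a_i||^2
    for _ in range(sweeps):
        for i in range(m):
            x -= (A[i] @ x - b[i]) / row_norms[i] * A[i]
    return x
```

Each projection is a rank-one update, so only one row of A is touched at a time, which is the "row-action" property exploited in the next section.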

THE CONSTRUCTION OF SUPPLEMENTARY DIRECTIONS
Although it is possible to construct supplementary directions for projection in the general case of an inconsistent problem like (1) (see [12]), in this section we refer to the consistent case, in which the classical Kaczmarz algorithm (27) applies. We have chosen this case for presentation because it covers a very important class of problems: discretizations of boundary value problems and integral equations of the first kind. In both cases we start from an initial linear operator equation

(30) Lu = f,

formulated on a (real) Hilbert space H of functions defined on a given domain Ω ⊂ R^q (usually q = 1, 2, 3). If (30) is a boundary value problem, we refer to a classical finite element or finite difference discretization, whereas if it is a first kind integral equation, we refer to Galerkin-like discretization techniques such as those described in [7] (see also [3]). The common idea in both cases is to consider an initial set of linearly independent functions {φ₁, . . ., φ_n} ⊂ H and to "project" the (continuous) problem (30) onto the vector space generated by {φ₁, . . ., φ_n}, denoted here by G_n. Then we get the "discrete" formulation of (30): find u_n ∈ G_n such that

(31) (Lu_n, φ_i) = (f, φ_i), i = 1, . . ., n,

where (·, ·) is the scalar product on H. From (31) we get the associated linear system

(32) Ax = b,

where

(33) A_{ij} = (Lφ_j, φ_i),  b_i = (f, φ_i), i, j = 1, . . ., n,

and x = (x₁, . . ., x_n)^t ∈ R^n contains the (unknown) components of the approximate solution u_n from (31) with respect to the basis {φ₁, . . ., φ_n}. In most cases the system (32) is big, sparse and ill-conditioned. This determines a "bad" behavior of both direct and iterative solvers. Kaczmarz's iteration (27) being a "row-action method" (i.e.
a method that uses only one row of A at a time, see e.g. [1]), it is well suited to big and sparse matrices A, but the ill-conditioning of A will slow down its convergence. In order to improve it, we can consider new directions for projection, which in our (consistent) case (32) is equivalent to "adding" new rows to A and corresponding new components to b, according to (11). In this sense we can use the following idea, coming from the field of "multigrid algorithms" (see [5]): we consider a new set of functions {φ_{n+1}, . . ., φ_m} which are linear combinations of the initial ones, i.e.

(34) φ_p = Σ_{j=1}^n c_{pj} φ_j, p = n + 1, . . ., m.

These functions are obtained by using "coarser" levels of discretization of Ω (see the next section). Then we "extend" the n × n matrix A and the vector b ∈ R^n from (33) to the m × n matrix Ā and the vector b̄ ∈ R^m, defined by

(35) Ā_{pj} = Σ_{i=1}^n c_{pi} A_{ij},  b̄_p = Σ_{i=1}^n c_{pi} b_i, p = n + 1, . . ., m,

with Ā_{ij} = A_{ij} and b̄_i = b_i for i = 1, . . ., n, and we consider the (full-rank, "overdetermined") system

(36) Āx = b̄.
Because of (34), (35) and the invertibility of A, it follows that the systems (32) and (36) have the same unique solution. Unfortunately, we do not yet have a systematic proof of the fact that the matrix Ā is "better conditioned" than the initial matrix A, but we refer the reader to [5] for at least some comments in this sense. We shall see in the next section of the paper that this construction yields a very good improvement of the convergence properties of Kaczmarz's algorithm.
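The solution-preserving property just stated is easy to verify numerically. In the sketch below, C is our name for the coefficient matrix (c_{pj}) of (34), and the model matrix and coefficients are arbitrary illustrative choices:

```python
import numpy as np

# A small invertible model matrix, a known solution, and the extension (35):
# each new row of A_bar is the combination C of the rows of A.
A = np.array([[4., 1., 0., 0.],
              [1., 4., 1., 0.],
              [0., 1., 4., 1.],
              [0., 0., 1., 4.]])
x_true = np.array([1., 2., 3., 4.])
b = A @ x_true
C = np.array([[1., 1., 0., 0.],
              [0., 0.5, 1., 0.5]])          # combination coefficients c_pj
A_bar = np.vstack([A, C @ A])               # new rows appended to A
b_bar = np.concatenate([b, C @ b])          # matching right-hand side
# (36) is overdetermined but consistent, with the same unique solution
x_bar, *_ = np.linalg.lstsq(A_bar, b_bar, rcond=None)
```

Since the appended equations are consequences of the original ones, the extended system stays consistent and x_bar coincides with the solution of (32).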

NUMERICAL EXPERIMENTS
We made our experiments on the following one dimensional convection-diffusion equation:

(37) −u″(t) + αu′(t) = f(t), t ∈ Ω = (0, 1),  u(0) = u(1) = 0,

where α ≥ 0 is the convection coefficient. For n = 2^q, q ≥ 1, we considered a uniform discretization of Ω = (0, 1) with the mesh size h = 1/n and the discretization points

(38) t_i = ih, i = 0, 1, . . ., n.

As discretization functions we used the one dimensional piecewise linear polynomial ("hat") functions defined by

(39) φ_i(t_j) = δ_{ij}, φ_i linear on each interval [t_{j−1}, t_j], i = 1, . . ., n − 1.

Then we considered the variational formulation of (37) on the Sobolev space H¹₀(Ω):

(40) find u ∈ H¹₀(Ω) such that (u′, v′)₂ + α(u′, v)₂ = (f, v)₂ for all v ∈ H¹₀(Ω),

where by (·, ·)₂ we denote the scalar product in L²(Ω). By "projecting" (40) onto the subspace G_{n−1} ⊂ H¹₀(Ω) generated by the basis functions φ_i from (39), we obtained the linear system (32), with A a square tridiagonal matrix of dimension n − 1 and b ∈ R^{n−1} given by

(41) A_{ij} = (φ′_j, φ′_i)₂ + α(φ′_j, φ_i)₂,  b_i = (f, φ_i)₂.

We solved this system for n = 64 and α = 0 (i.e. the one dimensional Dirichlet problem) with the classical Kaczmarz algorithm (without supplementary projections) and a residual-based stopping rule (42). The number of iterations needed to fulfil (42) was greater than 10 000. Moreover, the spectral condition number of A was k₂(A) = 1659.4. Then we constructed the new directions for projection as follows: we considered a "coarser" discretization of Ω = (0, 1) with the mesh size 2h, whose piecewise linear basis functions are linear combinations of the functions from (39), according to (34).
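The matrix of this experiment is easy to reproduce; since f is not specified above, f = 1 below is our own choice (it does not affect the condition number):

```python
import numpy as np

n = 64
h = 1.0 / n
# stiffness matrix of -u'' = f with linear elements and alpha = 0:
# (1/h) * tridiag(-1, 2, -1), of dimension n - 1
A = (2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h
b = h * np.ones(n - 1)            # load vector (f, phi_i)_2 for f = 1
cond = np.linalg.cond(A)          # spectral condition number k_2(A)
```

For n = 64 this gives k₂(A) ≈ 1659.4, matching the value reported above.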
Coming back to our particular case n = 64, we considered three coarser levels (2h, 4h and 8h) and added to A and b the corresponding new rows and components, constructed as before. In this way we obtained a matrix Ā of dimensions 116 × 64 and a corresponding right-hand side b̄. We solved the system (36) with the Kaczmarz algorithm (27). The number of iterations needed to fulfil (42) was 317, which is a much better value than the one obtained before. Moreover, the "generalized" spectral condition number of Ā (defined as the square root of the ratio between the biggest and smallest nonzero eigenvalues of Ā^tĀ, see e.g. [4]) was k₂(Ā) = 431.7, which is also much better than the previous k₂(A).
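The whole construction can be sketched as follows. For hat functions on a uniform mesh, each coarse basis function is ½φ_{2j−1} + φ_{2j} + ½φ_{2j+1} in terms of the finer ones, so each coarse Galerkin row (and right-hand side component) is the same combination of the finer rows; as above, α = 0 and f = 1 (our choice) are assumed:

```python
import numpy as np

def coarsen(rows, rhs):
    """One coarser level: new row j = 1/2 * (row 2j-1) + (row 2j)
    + 1/2 * (row 2j+1), and likewise for the right-hand side."""
    p = len(rows) // 2                     # number of coarse interior nodes
    new_rows = np.array([0.5 * rows[2*j - 2] + rows[2*j - 1] + 0.5 * rows[2*j]
                         for j in range(1, p + 1)])
    new_rhs = np.array([0.5 * rhs[2*j - 2] + rhs[2*j - 1] + 0.5 * rhs[2*j]
                        for j in range(1, p + 1)])
    return new_rows, new_rhs

def gen_cond(M):
    """Generalized spectral condition number: ratio of the extreme
    nonzero singular values of M."""
    s = np.linalg.svd(M, compute_uv=False)
    s = s[s > 1e-12 * s[0]]                # discard numerically zero values
    return s[0] / s[-1]

n = 64
h = 1.0 / n
A = (2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h
b = h * np.ones(n - 1)
blocks, rhs_blocks, rows, rhs = [A], [b], A, b
for _ in range(3):                         # three coarser levels: 2h, 4h, 8h
    rows, rhs = coarsen(rows, rhs)
    blocks.append(rows)
    rhs_blocks.append(rhs)
A_bar, b_bar = np.vstack(blocks), np.concatenate(rhs_blocks)
```

This yields 63 + 31 + 15 + 7 = 116 rows in total, the extended system keeps the solution of the original one, and gen_cond(A_bar) lies far below k₂(A), in line with the values reported above.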
Remark 5. Although these results are very promising, we have to observe that the improvement obtained for Kaczmarz's algorithm is not mesh-independent, i.e. the number of iterations for solving the extended system (36) still depends on the dimension n of the initial system (32) (see also [5]). Moreover, it also depends on the number of "coarser" levels used for the construction of the new directions. Work is in progress on eliminating these drawbacks.
Note. All the numerical experiments were performed with the numerical linear algebra software package OCTAVE (freely available on the Internet).