Enlarging the convergence ball
of Newton’s method on Lie groups

Ioannis K. Argyros\(^\ast \) Saïd Hilout\(^\S \)

February 16, 2013.

\(^\ast \)Department of Mathematical Sciences, Cameron University, Lawton, OK 73505, USA, e-mail: ioannisa@cameron.edu.

\(^\S \)Poitiers University, Laboratoire de Mathématiques et Applications, Bd. Pierre et Marie Curie, Téléport 2, B.P. 30179, 86962 Futuroscope Chasseneuil Cedex, France, e-mail: said.hilout@math.univ-poitiers.fr.

Dedicated to Prof. I. Păvăloiu on the occasion of his 75th anniversary

We present a local convergence analysis of Newton’s method for approximating a zero of a mapping from a Lie group into its Lie algebra. Using more precise estimates than in earlier studies [ 55 , 56 ] , and under the same computational cost, we obtain a larger convergence ball and more precise error bounds on the distances involved. Some examples are presented to further validate the theoretical results.

MSC. 65G99, 65J15, 47H17, 49M15, 90C90.

Keywords. Newton’s method, Lie group, Lie algebra, Riemannian manifold, convergence ball, Kantorovich hypothesis.

1 Introduction

In this paper we are concerned with the problem of approximating a zero \( x^\star \) of a \(\mathcal{C}^1\)-mapping \(F \, : \, G \longrightarrow Q\), where \(G\) is a Lie group and \(Q\) is the Lie algebra of \(G\), that is, the tangent space \(T_e \, G\) of \(G\) at \(e\), equipped with the Lie bracket \([\cdot ,\cdot ] \, : \, Q\times Q \longrightarrow Q\) [ 7 , 21 , 28 , 29 , 53 ] .

The study of numerical algorithms on manifolds for solving eigenvalue or optimization problems on Lie groups [1]–[12], [33]–[36], [54]–[57] is very important in computational mathematics. Newton-type methods are the most popular iterative procedures for solving equations involving differentiable operators. Convergence studies of Newton’s method are usually of two types: semilocal and local. A semilocal convergence analysis gives criteria, based on information around an initial point, ensuring the convergence of Newton’s method, whereas a local convergence analysis estimates, based on information around a solution, the radii of convergence balls. There is a plethora of studies on weakening and/or extending the hypotheses made on the underlying operators; see for example [ 7 , 11 , 16 ] and the references therein. Local as well as semilocal convergence of Newton-type methods has been established by several authors under various conditions [2]–[58]. Recently, in [ 15 ] , we presented a finer semilocal convergence analysis for Newton’s method than in earlier studies [ 7 , 32 , 34 , 36 , 54 , 55 , 56 ] .

Newton’s method with initial point \(x_0 \in G\) was first introduced by Owren and Welfert [ 43 ] in the form

\begin{equation} \label{1.1} \tag{1.1} x_{n+1} = x_n \cdot {\rm exp} \, (-d F_{x_n}^{-1} \, F(x_n)) \quad {\rm for \, \, each} \quad n=0,1,\ldots . \end{equation}
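For intuition, iteration (1.1) can be run on the simplest nontrivial Lie group \(G = (\mathbb{R}_{{\gt}0}, \cdot )\), whose Lie algebra is \(\mathbb{R}\) and whose exponential map is the usual one. The following minimal Python sketch is an illustration only; the mapping \(F(x) = x - 2\) and the starting point are assumptions, not taken from this paper.

```python
import math

# Illustrative sketch of iteration (1.1) on the Lie group G = (R_{>0}, *),
# whose Lie algebra is (R, +) with exp the usual exponential.  The mapping
# F and the starting point below are assumptions made for illustration only.

def newton_lie(F, dF, x0, tol=1e-12, max_iter=50):
    """Run x_{n+1} = x_n * exp(-dF_{x_n}^{-1} F(x_n)) until |F(x_n)| <= tol."""
    x = x0
    for _ in range(max_iter):
        if abs(F(x)) <= tol:
            break
        # dF_x is the differential of F along the exponential:
        # dF_x(u) = d/dt F(x * exp(t u)) at t = 0; here it is u -> dF(x) * u.
        x = x * math.exp(-F(x) / dF(x))
    return x

# F(x) = x - 2 has the zero x* = 2; d/dt (x e^{tu}) at t = 0 equals x u,
# so the differential acts by multiplication with dF(x) = x.
F = lambda x: x - 2.0
dF = lambda x: x
x_star = newton_lie(F, dF, x0=1.0)
```

The iterates stay in \(\mathbb{R}_{{\gt}0}\) automatically, since each step multiplies by a positive exponential.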

Newton’s method is undoubtedly the most popular method for generating a sequence \(\{ x_n \} \) approximating \(x^\star \) [ 7 , 11 , 15 , 16 , 32 , 34 , 36 ] , [54]–[56]. In the present paper we establish a finer local convergence analysis with the following advantages (\(\mathcal{A}_l\)):

  1. larger convergence ball;

and

  2. tighter error bounds on the distances involved.

The necessary background on Lie groups can be found in [ 7 , 11 , 15 , 16 ] and the references therein. The paper is organized as follows. In Section 2, we present the local convergence analysis of Newton’s method. Finally, numerical examples are given in the concluding Section 3.

2 Local convergence analysis for Newton’s method

We shall study the local convergence of Newton’s method. In the rest of the paper, \(\langle \cdot , \cdot \rangle \) denotes the inner product and \(\parallel \cdot \parallel \) the corresponding norm on \(Q\). As in [ 56 ] , we define a distance on \(G\) for \(x, y \in G\) as follows:

\begin{align} \label{3.1} m(x,y) & = \inf \bigg\{ \textstyle \sum \limits _{i=1}^{k} \parallel z_i\parallel \, : \, {\rm there \, \, exist } \, \, k \geq 1 \, \, {\rm and } \, \, z_1,\cdots , z_k \in Q \\ & \quad {\rm such \, \, that} \, \, y= x \cdot \exp \, z_1 \cdots \exp \, z_k \bigg\} \nonumber . \end{align}

By convention, \(\inf \, \emptyset = +\infty \). It is easy to see that \(m(\cdot , \cdot )\) is a distance on \(G\) and that the topology it induces is equivalent to the original one on \(G\). For \(w \in G\) and \(r {\gt}0\), we denote by

\[ U(w,r)= \{ y \in G \, : \, m(w,y) {\lt} r \} \]

the open ball centered at \(w\) of radius \(r\). Moreover, we denote the closure of \(U(w,r)\) by \(\overline{U} (w, r) \). Let also \(L(Q)\) denote the set of all linear operators on \(Q\).
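On the Abelian group \(G = (\mathbb{R}_{{\gt}0}, \cdot )\) with \(Q = \mathbb{R}\), the distance above reduces to \(m(x,y) = |\log (y/x)|\): every factorization \(y = x \cdot \exp z_1 \cdots \exp z_k\) satisfies \(z_1 + \cdots + z_k = \log (y/x)\), so \(\sum |z_i| \geq |\log (y/x)|\), with equality for the single-factor choice \(k=1\). The sketch below (the points \(x, y\) and the sampling are illustrative assumptions) checks this lower bound over random factorizations.

```python
import math
import random

# On G = (R_{>0}, *) with Lie algebra R, the infimum defining m(x, y)
# is attained at k = 1, giving m(x, y) = |log(y / x)|.  The random
# factorizations below are an illustrative check, not part of the paper.

def m(x, y):
    return abs(math.log(y / x))

random.seed(0)
x, y = 1.5, 6.0
target = math.log(y / x)
for _ in range(1000):
    k = random.randint(1, 5)
    zs = [random.uniform(-2.0, 2.0) for _ in range(k - 1)]
    zs.append(target - sum(zs))  # force exp(z_1) ... exp(z_k) = y / x
    assert math.isclose(x * math.exp(sum(zs)), y, rel_tol=1e-9)
    # every factorization is at least as long as the one-factor one:
    assert sum(abs(z) for z in zs) >= m(x, y) - 1e-12
```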

We need the definition of Lipschitz continuity for a mapping.

Definition 2.1

Let \(M \, : \, G \longrightarrow L (Q)\), \(x^\star \in G\) and \(r{\gt}0\). We say that \(M\) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , r)\) if

\begin{equation} \label{3.2} \tag{2.3} \parallel M (x \cdot \exp \, u) - M(x) \parallel \leq L \, \parallel u \parallel \end{equation}

holds for any \(u \in Q\) and \(x \in U (x^\star ,r)\) such that \(\parallel u \parallel + m (x , x^\star ) {\lt} r\) .

It follows from (2.3) that there exists \(L_{\star }\) such that

\begin{equation} \label{3.3} \tag{2.4} \parallel M (x \cdot \exp \, u) - M(x^\star ) \parallel \leq L _{\star } \, ( \parallel u \parallel + m(x ,x^\star )) \end{equation}

holds for any \(u\in Q\) and \(x \in U (x^\star ,r)\) such that \(\parallel u\parallel + m(x, x^\star ) {\lt} r\).

We then say \(M\) satisfies the \(L_{\star }\)-center Lipschitz condition at \(x^\star \in G\) on \(U(x^\star ,r)\). Note that if \(M\) satisfies the \(L\)-Lipschitz condition on \(U(x^\star ,r)\), then it also satisfies the \(L _{\star }\)-center Lipschitz condition at \(x^\star \in G\) on \(U (x^\star ,r)\). Clearly,

\begin{equation} \label{3.4} \tag{2.5} L_{\star } \leq L \end{equation}

holds in general and \({L}/{L_{\star }}\) can be arbitrarily large [4]–[16].

Let us show that (2.3) indeed implies (2.4). If (2.3) holds, then for \(x=x^\star \) there exists \(K \in (0,L]\) such that

\[ \parallel M (x^\star \cdot \exp \, u) - M(x^\star ) \parallel \leq K \, \parallel u \parallel \quad {\rm for \, \, any} \quad u \in Q \quad {\rm such \, \, that} \quad \parallel u \parallel {\lt} r . \]

There exist \(k\geq 1\) and \(u_0, u_1, \cdots , u_k \in Q\) such that \(x=x^\star \cdot \exp \, u_0 \cdots \exp \, u_k \). Let \(y_0=x^\star \), \(y_{i+1} = y_i \cdot \exp \, u_i\), \(i=0,1, \cdots , k\). Then, we have that \(x=y_{k+1} \). We can write the identity

\[ \begin{array}{l} M (x \cdot \exp \, u) - M(x^\star ) =\\ =(M (x \cdot \exp \, u) - M(x )) + (M (x ) - M(x^\star )) \\ = (M (x \cdot \exp \, u) - M(x )) + (M (y_k \cdot \exp \, u_k ) - M(y_k )) + (M(y_k) - M(x^\star )) \\ = (M (x \cdot \exp \, u) - M(x )) + (M (y_k \cdot \exp \, u_k ) - M(y_k )) \\ \quad + (M (y_{k-1} \cdot \exp \, u_{k-1}) - M(x^\star )) \\ = (M (x \cdot \exp \, u) - M(x )) + (M (y_k \cdot \exp \, u_k ) - M(y_k )) + \cdots + \\ \quad +(M (y_{1} \cdot \exp \, u_{1}) - M(y_1 )) + (M(x^\star \cdot \exp \, u_0 ) - M(x^\star )) . \end{array} \]

Using (2.3) and the triangle inequality, we obtain in turn that

\begin{align*} & \parallel M (x \cdot \exp \, u) - M(x^\star ) \parallel \leq \\ & \leq \parallel M (x \cdot \exp \, u) - M(x )\parallel + \parallel M (y_k \cdot \exp \, u_k ) - M(y_k ) \parallel + \cdots \\ & \quad + \parallel M (y_{1} \cdot \exp \, u_{1}) - M(y_1 ) \parallel + \parallel M(x^\star \cdot \exp \, u_0 ) - M(x^\star )\parallel \\ & \leq L \, (\parallel u \parallel + \parallel u_k \parallel + \cdots + \parallel u_1 \parallel ) + K \, \parallel u_0 \parallel \\ & = L \, (\parallel u \parallel + \parallel u_k \parallel + \cdots + \parallel u_0 \parallel ) + ( K - L ) \, \parallel u_0 \parallel \\ & \leq L \, (\parallel u \parallel + \parallel u_k \parallel + \cdots + \parallel u_0 \parallel ), \end{align*}

which implies (2.4).

We need a Banach type lemma on invertible mappings.

Lemma 2.2

Suppose that \(dF_{x^\star }^{-1}\) exists and let \(0 {\lt} r \leq \tfrac {1}{L_{\star }}\). Suppose \(dF_{x^\star }^{-1} \, dF\) satisfies the \(L_{\star }\)-center Lipschitz condition at \(x^\star \in G\) on \(U (x^\star ,r)\); for \(x \in U (x^\star ,r)\), there exist \(k \geq 1\) and \(z_0, z_1, \cdots , z_k \in Q\), such that \(x= x^\star \cdot {\rm exp} \, z_0 \cdots {\rm exp} \, z_k\) and \(\textstyle \sum \limits _{i=0}^{k} \parallel z_i \parallel {\lt} r\). Then, linear mapping \(dF_{x}^{-1}\) exists and

\begin{equation} \label{3.5} \tag{2.6} \parallel dF_{x}^{-1} \, dF_{x^\star } \parallel \leq {\bigg(} 1- L_{\star } \, \textstyle \sum \limits _{i=0}^{k} \parallel z_i \parallel {\bigg)} ^{-1}. \end{equation}
Proof.
Let \(y_{k} = x^\star \cdot \exp \, z_0 \cdot \exp \, z_1 \, \cdots \, \exp \, z_{k-1} \). Then, we deduce \(y_k \in U (x^\star , r)\), since \(\textstyle \sum \limits _{i=0} ^{k-1} \parallel z_i \parallel {\lt} r\). Note that \(x= y_k \cdot \exp \, z_k\). Using (2.4) for \(M= dF_{x^\star }^{-1} \, dF \), we get in turn

\begin{equation} \label{3.6} \tag{2.7} \parallel dF_{x^\star }^{-1} \, (dF_{x} - dF_{x^\star }) \parallel \leq L_{\star } \, (\parallel z _k \parallel + m(y_k , x^\star )) \leq L_{\star } \, \textstyle \sum \limits _{i=0}^{k} \parallel z_i \parallel < L_{\star } \, r \leq 1 . \end{equation}

It follows from (2.7) and the Banach lemma on invertible operators [ 7 ] , [ 32 ] that \(dF_x^{-1}\) exists and (2.6) holds. That completes the proof of Lemma 2.2.


Next, we study the convergence domain of Newton’s method around a zero \(x^\star \) of mapping \(F\). First, from Lemma 2.3 through Corollary 2.9, we present the local results for Newton’s method when \(G\) is an Abelian group. Then, the corresponding local results follow when \(G\) is not necessarily Abelian.

Lemma 2.3

Let \(G\) be an Abelian group, \(L_{\star } {\gt} 0\) and \(0{\lt} r \leq \tfrac {1}{L_{\star }}\). Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose that \(dF_{x^\star }^{-1}\) exists; there exist \(j \geq 1\) and \(z_1, z_2, \cdots , z_j \in Q\) such that

\begin{equation} \label{3.19} \tag{2.8} x_0 = x^\star \cdot {\rm exp} \, z_1 \cdots {\rm exp} \, z_j \quad {\rm for} \quad \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel < r ; \end{equation}

\(dF_{x^\star }^{-1} \, dF\) satisfies the \(L_{\star }\)-center Lipschitz condition at \( x^\star \) on \(U (x^\star ,r )\). Then, linear mapping \(dF _{x_0} ^{-1}\) exists and

\begin{equation} \label{3.20} \tag{2.9} \parallel dF _{x_0} ^{-1} \, F(x_0) \parallel \leq \displaystyle \tfrac { \bigg(2 + L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg) \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel }{2 \, \bigg(1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg)} . \end{equation}
Proof.
Using the hypothesis that \(dF_{x^\star }^{-1} \, dF\) satisfies the \(L_{\star }\)-center Lipschitz condition at \(x^\star \) on \(U (x^\star , r)\), we have that

\begin{equation} \label{3.21-nov} \tag{2.10} \parallel dF_{x^\star }^{-1} \, (dF_{x \cdot {\rm exp} \, u} - dF_{x^\star } ) \parallel \leq L_{\star } \, ( \parallel u \parallel + m(x, x^\star )) \end{equation}

for all \(u \in Q\), \( x \in U (x^\star , r)\) such that \(\parallel u \parallel + m(x, x^\star ) {\lt} r\). Let \(z=z_1+z_2+ \cdots + z_j\). Then, since \(G\) is an Abelian group, \({\rm exp} \, z_1 \cdot {\rm exp} \, z_2 \cdots {\rm exp} \, z_j = {\rm exp} \, (z_1+z_2+ \cdots +z_j) = {\rm exp} \, z\), so, we can write \(x_0 = x^\star \cdot {\rm exp} \, z\). It then follows from Lemma 2.2 that \(dF_{x_0}^{-1}\) exists and

\begin{equation} \label{3.21} \tag{2.11} \parallel dF _{x_0} ^{-1} \, dF_{x^\star } \parallel \leq \bigg(1- L_{\star } \, \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg) ^{-1} . \end{equation}

We also have the following identity

\begin{equation} \label{3.23-nov} \tag{2.12} \begin{array}{lll} F(x_0)& =& F(x^\star \cdot {\rm exp} \, z) - F(x^\star ) \\ & =& \displaystyle \int _0 ^1 dF _{x^\star \cdot {\rm exp} \, (tz)} \, z\, dt = \displaystyle \int _0 ^1 (dF _{x^\star \cdot {\rm exp} \, (t z)} - dF_{x^\star } ) \, z \, dt + dF_{x^\star } \, z . \end{array} \end{equation}

In view of (2.10) and (2.12), we get that

\begin{equation} \label{3.22} \tag{2.13} \begin{array}{lll} \parallel dF _{x^\star } ^{-1} \, F (x_0) \parallel & \leq & \displaystyle \int _0 ^1 \parallel dF _{x^\star } ^{-1} \, (dF_{x^\star \cdot {\rm exp} \, (t\, z)} - dF _{x^\star } )\parallel \, \, \parallel z \parallel \, dt + \parallel z \parallel \\ & \leq & \displaystyle \int _0 ^1 L_{\star } \, t \, \parallel z \parallel ^2 \, dt + \parallel z \parallel \\ & \leq & {\bigg(} \displaystyle \tfrac {L_{\star } }{2} \, \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel +1 {\bigg)} \, \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel . \end{array} \end{equation}

Moreover, by (2.11) and (2.13), we obtain that

\[ \parallel dF _{x_0} ^{-1} \, F (x_0) \parallel \leq \parallel dF _{x_0} ^{-1} \, dF _{x^\star } \parallel \, \parallel dF _{x^\star }^{-1} \, F (x_0) \parallel \leq \displaystyle \tfrac {\bigg(2 + L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg) \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel }{2 \, \bigg(1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg)} . \]

That completes the proof of Lemma 2.3.

Remark 2.4

The proof of Lemma 2.3 reduces to [ 56 , Lemma 3.2 ] if \(L_{\star } = L\). Otherwise (i.e., if \( L_{\star } {\lt} L\)) it constitutes an improvement. We have included the proof, although it is similar to the corresponding one in [ 56 ] , because it is not straightforward to see that \(L_{\star }\) can replace \(L\) in the derivation of the crucial estimate (2.9). Note also that (2.10) can hold for some \(L_{\star } ^1 \in (0, L_{\star } ]\). If \(L_{\star } ^1 {\lt} L_{\star } \), then according to the proof of Lemma 2.3, \(L_{\star } ^1\) can replace \(L_{\star } \) in (2.9).

Let us define the parameter \(\alpha \), where \(\beta = \displaystyle \tfrac {L _{\star }}{L}\), by

\begin{equation} \label{3.25-nov} \tag{2.14} \alpha = \left\{ \begin{array}{lll} 4 \, \displaystyle \tfrac {\sqrt{1+3 \, \beta } - (1+\beta )}{\beta \, (1- \beta )}, & {\rm if}& L_{\star } \neq L \\ 1, & {\rm if}& L_{\star } = L . \end{array} \right. \end{equation}

It can easily be seen that

\begin{equation} \label{3.26-nov} \tag{2.15} \alpha \geq 1 . \end{equation}
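The inequality above can be checked numerically from (2.14). The following sketch (the grid and tolerances are illustrative choices) verifies that \(\alpha \geq 1\) and that \(\alpha \) grows as \(\beta = L_{\star }/L\) decreases, i.e., the convergence ball of radius \(\alpha /(4\, L)\) below enlarges as the center-Lipschitz constant improves on the Lipschitz constant.

```python
import math

# A quick numerical check of (2.14): for beta = L_*/L in (0, 1),
# alpha(beta) = 4 (sqrt(1 + 3 beta) - (1 + beta)) / (beta (1 - beta)),
# and alpha = 1 when beta = 1.  The grid below is an illustrative choice.

def alpha(beta):
    if beta == 1.0:
        return 1.0
    return 4.0 * (math.sqrt(1.0 + 3.0 * beta) - (1.0 + beta)) / (beta * (1.0 - beta))

betas = [i / 100.0 for i in range(1, 100)]
values = [alpha(b) for b in betas]
assert all(v >= 1.0 for v in values)        # alpha >= 1, as in (2.15)
# alpha grows as beta = L_*/L decreases (larger ball alpha/(4 L)):
assert all(values[i] >= values[i + 1] for i in range(len(values) - 1))
# the two branches of (2.14) fit together continuously at beta = 1:
assert abs(alpha(0.999) - 1.0) < 1e-2
```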

We can now state the local convergence result for Newton’s method.

Theorem 2.5

Let \(G\) be an Abelian group. Let \(0 {\lt} r \leq \displaystyle \tfrac {\alpha }{4 \, L}\), where \(\alpha \) is given in (2.14). Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Suppose there exists \(x^\star \in G\) such that \(F (x^\star )=0\), \(dF_{x^\star }^{-1}\) exists and \(dF_{x^\star }^{-1} \, dF\) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {3 \, r}{1 - L_{\star } \, r} )\). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \in U (x^\star , r)\) is well defined, remains in \(U (x^\star , \displaystyle \tfrac {3 \, r}{1 - L_{\star } \, r} )\) for all \(n\geq 0\) and converges to a zero \(y^\star \) of mapping \(F\) such that

\[ m( x^\star , y^\star ) {\lt} \tfrac {3 \, r}{1 - L_{\star } \, r} . \]
Proof.
It follows from hypothesis \(x_0 \in U (x^\star , r)\) that there exist \(j \geq 1\) and \(z_1, z_2, \cdots , z_j \in Q\) such that (2.8) holds. If \(dF_{x^\star } ^{-1} \, dF\) is \(L\)-Lipschitz on \(U (x^\star , \displaystyle \tfrac {3 \, r}{1 - L_{\star } \, r} )\), then it is \(L_{\star }\)-center Lipschitz at \(x^\star \) on \(U (x^\star , \displaystyle \tfrac {3 \, r}{1 - L_{\star } \, r} )\). Note also that \(\displaystyle \tfrac {\alpha }{4 \, L} \leq \displaystyle \tfrac {1}{L_{\star }}\) by the choice of \(\alpha \). It follows from Lemma 2.3 that linear mapping \(dF _{x_0}^{-1}\) exists and

\begin{equation} \label{3.23} \tag{2.16} \eta =\parallel dF _{x_0} ^{-1} \, F(x_0) \parallel \leq \displaystyle \tfrac {\bigg(2 + L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg) \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel } {2 \, \bigg(1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg)} . \end{equation}

Set

\[ \overline{L} = \displaystyle \tfrac {L} {1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel } \quad {\rm and} \quad \overline{r} = \displaystyle \tfrac {(2 + L_{\star } \, r) \, r} {1- L_{\star } \, r } . \]

We shall show mapping \(dF_{x_0}^{-1} \, dF\) satisfies the \(\overline{L} \)-Lipschitz condition on \(U (x_0 , \overline{r} )\). Indeed, let \(x \in U (x_0 , \overline{r} )\) and \(z \in Q\) be such that \(\parallel z \parallel + m (x_0 , x) {\lt} \overline{r}\). Then, we get that

\[ \parallel z \parallel + m (x , x^\star ) \leq \parallel z \parallel + m (x , x_0) + m(x_0 , x^\star ) {\lt} \overline{r} + r = \displaystyle \tfrac {3\, r}{1 - L_{\star } \, r } . \]

Using Lemma 2.2, we have that

\[ \begin{array}{lll} \parallel dF _{x_0} ^{-1}\, (dF _{x \cdot {\rm exp} \, z} - dF_x) \parallel & \leq & \parallel dF _{x_0} ^{-1}\, dF_{x^\star } \parallel \, \parallel dF_{x^\star }^{-1} \, (dF _{x \cdot {\rm exp} \, z} - dF_x) \parallel \\ & \leq & \displaystyle \tfrac {L \, \parallel z \parallel } {1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel } = \overline{L } \, \parallel z \parallel . \end{array} \]

Set

\begin{equation} \label{3.24} \tag{2.17} \overline{h} = \overline{ L} \, \eta \quad {\rm and } \quad \overline{r _1} = \displaystyle \tfrac {2\, \eta } {1+ \sqrt{1- 2 \, \overline{h} }} \leq 2 \, \eta . \end{equation}

Then, by (2.16), we obtain that

\begin{equation} \label{3.25} \tag{2.18} \overline{r _1} \leq \displaystyle \tfrac {\bigg(2 + L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg)\, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel } {1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel } \leq \displaystyle \tfrac {(2+ L_{\star } \, r) \, r}{1- L_{\star } \, r} = \overline{r} \end{equation}

and

\begin{equation} \label{3.26} \tag{2.19} \overline{h } \leq \displaystyle \tfrac {(2 + L_{\star } \, r )\, {L} \, r } {2 \, (1- L_{\star } \, r )^2 } \leq \displaystyle \tfrac {1}{2} , \end{equation}

by the choice of \(r\) and \(\alpha \). By standard majorization techniques [ 7 ] , \(\{ x_n \} \) converges to some zero \(y^\star \) of mapping \(F\) and \(m(x_0 , y^\star ) \leq \overline{r_1}\). Furthermore, we obtain that

\[ m (x^\star , y^\star ) \leq m(x^\star , x_0) + m(x_0 , y^\star ) \leq r + \overline{r_1} \leq r + \overline{r} \leq \displaystyle \tfrac {3 \, r}{1- L_{\star } \, r}. \]

That completes the proof of Theorem 2.5.

The proofs of the remaining results in this section are omitted, since they can be obtained from the corresponding ones in [ 56 ] .

Corollary 2.6

Let \(G\) be an Abelian group. Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {\alpha } {4\, L } )\). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \) is well defined, remains in \(U (x^\star , \displaystyle \tfrac {\alpha } {4 \, L} )\) for all \(n\geq 0\) and converges to a zero \(y^\star \) of mapping \(F\) such that \( m( x^\star , y^\star ) {\lt} \displaystyle \tfrac {1}{L_{\star } } . \)

We denote in the following corollary
\[ B (0 ,r) = \{ z \in Q \, : \, \parallel z \parallel {\lt} r \} . \]

Corollary 2.7

Let \(G\) be an Abelian group. Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {1} {L_{\star } } )\). Let \(\sigma \) be the maximal number such that \(U (e, \sigma ) \subseteq {\rm exp} \, (B (0 , \displaystyle \tfrac {1}{L_{\star }} ))\). Set \( r = \min {\Big\{ } \displaystyle \tfrac {\sigma }{3 + L_{\star } \, \sigma }, \, \displaystyle \tfrac {\alpha }{4 \, L}{\Big\} } \) and \( N ( x^\star , r) = x^\star \, {\rm exp} (B (0, r)). \) Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \in N (x^\star ,r) \) converges to \(x^\star \).

Corollary 2.8

Let \(G\) be an Abelian group. Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping, where \(G\) is a compact connected Lie group equipped with a bi-invariant Riemannian metric. Let \(x^\star \in G\) be a zero of mapping \(F\) and \(0 {\lt} r {\lt} \displaystyle \tfrac {\alpha }{4 \, L}\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {3 \, r}{1- L_{\star } \, r } )\). Let also \(x_0 \in U (x^\star , r )\). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \), remains in \(\overline{U} (x^\star , \displaystyle \tfrac {3 \, r}{1- L_{\star } \, r } )\) for all \(n \geq 0\) and converges to \(x^\star \).

Corollary 2.9

Let \(G\) be an Abelian group. Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping, where \(G\) is a compact connected Lie group equipped with a bi-invariant Riemannian metric. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {\alpha }{4 \, L} )\). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \), remains in \(\overline{U} (x^\star , \displaystyle \tfrac {\alpha }{4 \, L } )\) for all \(n \geq 0\) and converges to \(x^\star \).

Lemma 2.10

Let \(0{\lt} r \leq \displaystyle \tfrac {1}{L}\). Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1}\) exists; there exist \(j\geq 1\) and \(z_1, z_2, \cdots , z_j \in Q\) such that \( x_0 = x^\star \cdot {\rm exp} \, z_1 \cdots {\rm exp} \, z_j \) for \( \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel {\lt} r \) and \(dF_{x^\star }^{-1} \, dF\) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , r )\). Then, the following assertions hold:

  1. \[ \parallel dF_{x^\star }^{-1} \, (dF_{x^\star \cdot {\rm exp} \, z }- dF_{x^\star } ) \parallel \leq K_{\star } \, \parallel z \parallel \]

    for each \(z\in Q\) with \(\parallel z \parallel {\lt} r\) and some \(K_{\star } \in (0, L]\).

  2. \[ \parallel dF_{x^\star }^{-1} \, (dF_{x_0} - dF_{x^\star } ) \parallel \leq L_{\star } \, \textstyle \sum \limits _{i=1}^{j} \parallel z _i \parallel \quad {\rm for \, \, some} \quad L_\star \in (0, L]. \]
  3. Linear mapping \(dF_{x_0}^{-1}\) exists and

    \[ \parallel dF _{x_0} ^{-1} \, F(x_0) \parallel \leq \displaystyle \tfrac {\bigg(2 + L \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg) \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel }{2 \, \bigg(1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg)} . \]
Proof.
  1. This assertion follows from the hypothesis that \(dF_{x^\star }^{-1} \, dF\) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , r )\).

  2. Let \(w_0= x^\star \), \(w_{i+1} = w_i \cdot {\rm exp} \, z_{i+1}\), \(i=0,1, \cdots , j-1\). Then, we have \(w_j = w_{j-1} \cdot {\rm exp} \, z_j= x_0\). Using the \(L\)-Lipschitz condition, we get in turn that

    \[ \begin{array}{l} \parallel dF_{x^\star } ^{-1} \, (dF_{w_j} - dF_{x^\star } ) \parallel \\ \leq \parallel dF_{x^\star } ^{-1} \, (dF_{w_{j-1} \cdot {\rm exp} \, z_j} - dF_{w_{j-1}} ) \parallel + \cdots + \parallel dF_{x^\star } ^{-1} \, (dF_{w_{1} \cdot {\rm exp} \, z_2} - dF_{w_{1}} ) \parallel + \\ \parallel dF_{x^\star } ^{-1} \, (dF_{x^\star \cdot {\rm exp} \, z_1} - dF_{x^\star } ) \parallel \\ \leq L (\parallel z_j \parallel + \cdots + \parallel z_2 \parallel ) + K_{\star } \, \parallel z_1 \parallel \\ = L \, (\parallel z_j \parallel + \cdots + \parallel z_1 \parallel ) + (K_{\star } - L) \, \parallel z_1 \parallel \leq L \, (\parallel z_j \parallel + \cdots + \parallel z_1 \parallel ) \end{array} \]

    which shows (ii).

  3. We have by (ii) that

    \[ \parallel dF_{x^\star } ^{-1} \, (dF_{x_0} - dF_{x^\star } ) \parallel \leq L _{\star }\, (\parallel z_j \parallel + \cdots + \parallel z_1 \parallel ) {\lt} L_{\star } \, r \leq L\, r \leq 1 . \]

    Hence, \(dF_{x_0}^{-1}\) exists and

    \begin{equation} \label{3.31} \tag{2.20} \parallel dF _{x_0} ^{-1} \, dF_{x^\star } \parallel \leq \bigg(1- L_{\star } \, \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg) ^{-1} . \end{equation}

    We have that

    \[ \begin{array}{l} dF _{x^\star } ^{-1} \, (F(w_i) - F(w_{i-1})) = dF _{x^\star } ^{-1} \, \displaystyle \int _0 ^1 dF _{w_{i-1} \cdot {\rm exp} \, (t\, z_i)} \, z_i \, dt \\ = \displaystyle \int _0 ^1 dF _{x^\star } ^{-1} \, (dF _{w_{i-1} \cdot {\rm exp} \, (t\, z_i) } - dF_{x^\star } ) \, z _i \, dt + z_i. \end{array} \]

    The \(L\)-Lipschitz condition gives

    \[ \parallel dF _{x^\star } ^{-1} \, (dF _{w_{k-1} \cdot {\rm exp} \, z_k } - dF_{w_{k-1} } ) \parallel \leq L\, \parallel z_k \parallel \quad {\rm for \, \, each} \quad k=1,2, \cdots , j . \]

    Therefore, we obtain that

    \[ \begin{array}{lll} \| dF _{x^\star } ^{-1} \, (dF _{w_{i-1} \cdot {\rm exp} \, (t\, z_i) } - dF_{x^\star } ) \| \leq \\ \leq \displaystyle \textstyle \sum \limits _{k=1}^{i-1} \| dF _{x^\star } ^{-1} (dF _{w_{k-1} \cdot {\rm exp} ( z_k) } - dF_{ w_{k-1} } ) \| + \| dF _{x^\star } ^{-1} (dF _{w_{i-1} \cdot {\rm exp} (t \, z_i) } - dF_{ w_{i-1} } ) \| \\ \leq L \displaystyle \textstyle \sum \limits _{k=1}^{i-1} \| z_k \| + L \, t \, \| z_i \| . \end{array} \]

    We have that \(F(x_0) =\textstyle \sum \limits _{i=1}^{j} (F(w_i) - F(w_{i-1} )) \), since \(F(x^\star )=0\). Hence, we obtain that

    \begin{equation} \label{3.32} \tag{2.21} \begin{array}{l} \parallel dF _{x^\star } ^{-1} \, F(x_0) \parallel \\ \leq \displaystyle \textstyle \sum \limits _{i=1}^{j} {\bigg(} \displaystyle \int _0 ^1 \parallel dF _{x^\star } ^{-1} \, (dF _{w_{i-1} \cdot {\rm exp} \, (t\, z_i) } - dF_{ x^\star } ) \parallel \, \parallel z_i \parallel \, dt + \parallel z_i \parallel {\bigg)} \\ \leq \displaystyle \textstyle \sum \limits _{i=1}^{j} {\bigg(} \displaystyle \int _0 ^1 L\, {\bigg(} \displaystyle \textstyle \sum \limits _{k=1}^{i-1} \parallel z_k \parallel + t\, \parallel z_i \parallel {\bigg)} \, \parallel z_i \parallel \, dt + \parallel z_i \parallel {\bigg)} \\ \leq {\bigg(} \displaystyle \tfrac {L}{2} \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel +1 {\bigg)} \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel . \end{array} \end{equation}

    The result now follows from (2.20), (2.21) and

    \[ \parallel dF _{x_0} ^{-1} \, F (x_0) \parallel \leq \parallel dF _{x_0} ^{-1} \, dF _{x^\star } \parallel \, \parallel dF _{x^\star }^{-1} \, F (x_0) \parallel \leq \displaystyle \tfrac {\bigg(2 + L \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg) \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel }{2 \, \bigg(1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg)} . \]

The proof of Lemma 2.10 is complete.

Let us define the parameter \(\delta \) by

\begin{equation} \label{3.33} \tag{2.22} \delta = \left\{ \begin{array}{lll} 4 \, \displaystyle \tfrac {\sqrt{2\, (1 + \beta )} - (1+\beta )}{1- \beta ^2}, & {\rm if}& L_{\star } \neq L \\ 1, & {\rm if}& L_{\star } = L . \end{array} \right. \end{equation}

We have that

\[ \delta \geq 1 . \]
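As with \(\alpha \), the bound \(\delta \geq 1\) can be checked numerically from (2.22). The sketch below (with an illustrative grid of values of \(\beta = L_{\star }/L\)) also verifies that \(\delta {\lt} \alpha \) for \(L_{\star } {\lt} L\), i.e., the Abelian-group radius \(\alpha /(4\, L)\) is at least as large as the general radius \(\delta /(4\, L)\), in line with Remark 2.16 (b).

```python
import math

# Numerical comparison of (2.14) and (2.22): for beta = L_*/L in (0, 1),
# alpha(beta) = 4 (sqrt(1 + 3 beta) - (1 + beta)) / (beta (1 - beta)),
# delta(beta) = 4 (sqrt(2 (1 + beta)) - (1 + beta)) / (1 - beta^2).
# The grid below is an illustrative choice, not data from the paper.

def alpha(beta):
    if beta == 1.0:
        return 1.0
    return 4.0 * (math.sqrt(1.0 + 3.0 * beta) - (1.0 + beta)) / (beta * (1.0 - beta))

def delta(beta):
    if beta == 1.0:
        return 1.0
    return 4.0 * (math.sqrt(2.0 * (1.0 + beta)) - (1.0 + beta)) / (1.0 - beta ** 2)

for i in range(1, 100):
    b = i / 100.0
    # delta >= 1, and the Abelian radius alpha/(4L) dominates delta/(4L):
    assert 1.0 <= delta(b) < alpha(b)
```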

Theorem 2.11

Let \(0 {\lt} r \leq \displaystyle \tfrac {\delta }{4 \, L}\), where \(\delta \) is given in (2.22). Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Suppose there exists \(x^\star \in G\) such that \(F (x^\star )=0\), \(dF_{x^\star }^{-1}\) exists and \(dF_{x^\star }^{-1} \, dF\) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , R )\), where \(R= \displaystyle \tfrac {3 + (L- L_{\star } ) \, r}{1 - L_{\star } \, r} \, r \). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \in U (x^\star , r)\) is well defined, remains in \(U (x^\star ,R )\) for all \(n\geq 0\) and converges to a zero \(y^\star \) of mapping \(F\) such that \(m( x^\star , y^\star ) {\lt} R\).

Proof.
The proof is similar to the proof of Theorem 2.5. Note that \(\overline{r}\) in the proof of Theorem 2.5 is replaced by
\[ \overline{\overline{r}} = \displaystyle \tfrac {(2+L\, r) \, r}{1- L_{\star } \, r} \]

and \(\eta \) satisfies the new condition

\[ \eta \leq \displaystyle \tfrac {\bigg(2 + L \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg) \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel }{2 \, \bigg(1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg)} . \]

The proof of Theorem 2.11 is complete.

Corollary 2.12

Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {\delta } {4\, L } )\). Then, the conclusions of Corollary 2.6 hold.

Corollary 2.13

Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {1} {L_{\star } } )\). Let \(\sigma \) and \(N(x^\star , r)\) be as in Corollary 2.7. Set \( r = \min {\Big\{ } r_0 , \, r_1 , \, \displaystyle \tfrac {\delta }{4 \, L} {\Big\} } \), where

\[ r_0 = \left\{ \begin{array}{lll} \displaystyle \tfrac {-(L_{\star } + 3 \, L ) + \sqrt{ (L_{\star }+3\, L)^2 + 4\, L \, (L-L_{\star })} } { 2 \, L \, (L- L_{\star } ) }, & {\rm if}& L_{\star } \neq L \\ \displaystyle \tfrac {1}{4\, L}, & {\rm if}& L_{\star } = L \end{array} \right. \]

and

\[ r_1 = \left\{ \begin{array}{lll} \displaystyle \tfrac {-(3 + \sigma \, L_{\star } ) + \sqrt{ (3+\sigma \, L_{\star })^2 + 4\, \sigma \, (L-L_{\star })} } { 2 \, (L- L_{\star } ) }, & {\rm if}& L_{\star } \neq L \\ \displaystyle \tfrac {\sigma }{3+ \sigma \, L}, & {\rm if}& L_{\star } = L . \end{array} \right. \]

Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \in N (x^\star ,r) \) converges to \(x^\star \).
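The two branches in the definitions of \(r_0\) and \(r_1\) fit together: letting \(L_{\star } \longrightarrow L\) in the \(L_{\star } \neq L\) branch recovers \(\tfrac {1}{4\, L}\) and \(\tfrac {\sigma }{3+ \sigma \, L}\), respectively. The sketch below checks this numerically; the values \(L=2\) and \(\sigma =1\) are illustrative assumptions, not data from the paper.

```python
import math

# The radii r_0 and r_1 of Corollary 2.13 are given by different formulas
# for L_* < L and L_* = L; the check below confirms numerically that the
# L_* < L branch tends to the L_* = L branch as L_* -> L (here with the
# illustrative values L = 2 and sigma = 1).

def r0(L, Ls):
    if Ls == L:
        return 1.0 / (4.0 * L)
    a = Ls + 3.0 * L
    return (-a + math.sqrt(a * a + 4.0 * L * (L - Ls))) / (2.0 * L * (L - Ls))

def r1(L, Ls, sigma):
    if Ls == L:
        return sigma / (3.0 + sigma * L)
    a = 3.0 + sigma * Ls
    return (-a + math.sqrt(a * a + 4.0 * sigma * (L - Ls))) / (2.0 * (L - Ls))

L, sigma = 2.0, 1.0
for eps in [1e-3, 1e-5, 1e-7]:
    Ls = L * (1.0 - eps)           # center-Lipschitz constant slightly below L
    assert abs(r0(L, Ls) - r0(L, L)) < 10.0 * eps
    assert abs(r1(L, Ls, sigma) - r1(L, L, sigma)) < 10.0 * eps
```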

Corollary 2.14

Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping, where \(G\) is a compact connected Lie group equipped with a bi-invariant Riemannian metric. Let \(x^\star \in G\) be a zero of the mapping \(F\) and let \(0 {\lt} r {\lt} \displaystyle \tfrac {\delta }{4 \, L}\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , R )\), where \(R \) is defined in Theorem 2.11. Let also \(x_0 \in U (x^\star , r )\). Then, the sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \) remains in \(\overline{U} (x^\star , R )\) for all \(n \geq 0\) and converges to \(x^\star \).

Corollary 2.15

Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping, where \(G\) is a compact connected Lie group equipped with a bi-invariant Riemannian metric. Let \(x^\star \in G\) be a zero of the mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {\delta }{4 \, L} )\). Then, the sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \) remains in \(\overline{U} (x^\star , \displaystyle \tfrac {\delta }{4 \, L } )\) for all \(n \geq 0\) and converges to \(x^\star \).

Remark 2.16

(a) The local results reduce to the corresponding ones in [ 56 ] if \(L = L_{\star }\). Otherwise (i.e., if \(L_{\star } {\lt} L\)), they constitute an improvement under the same computational cost, with the advantages already stated in the Introduction of this study. Note also that \(\alpha {\gt}1\) and \(\delta {\gt}1\) if \(L_{\star } {\lt}L\), and that \(\alpha \longrightarrow \infty \) and \(\delta \longrightarrow \infty \) as \(\displaystyle \tfrac {L}{L_{\star }}\longrightarrow \infty \).

(b) If \(G\) is an Abelian group, the local results hold under weaker conditions and yield tighter estimates than when \(G\) is not necessarily Abelian. We have, for example, that \(\overline{r} {\lt} \overline{\overline{r}} \), \(\delta {\lt} \alpha \), \(\displaystyle \tfrac {3 \, r}{1- L_{\star } \, r} {\lt} R\), and the upper bound on \(\eta \) is smaller if \(L_{\star } {\lt} L\).

(c) Clearly, finer results can be obtained immediately if conditions similar to those of the semilocal case (see [ 15 , Section 1 ] , with \(L\) instead of \(L_0\)) are used instead of the Kantorovich condition for \(L_{\star } {\lt} L\). However, we leave this part of the analysis to the motivated reader. We refer the reader to [ 14 ] for such results involving nonlinear equations in a Banach space setting, and to [ 7 , 11 , 15 , 16 ] for examples.

3 Numerical examples

In this section, we present two numerical examples in the more general setting of a nonlinear equation in a Hilbert space, where \(L_\star {\lt} L\).

Example 3.1

Let \(\mathcal X= \mathcal Y= \mathbb {R}\) and \(x^\star =0\). Define \(F\) by \(F(x)=-d_2 \sin 1 + d_1 \, x + d_2 \, \sin {\rm e}^{d_3 \, x} \), where \(d_1\), \(d_2\), \(d_3\) are given real numbers. Then, \(F(x^\star ) =0\). Moreover, if \(d_3\) is sufficiently large and \(d_2\) is sufficiently small, then \(L/ L_\star \) can be arbitrarily large.
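To see this effect numerically, the following Python sketch (our own illustration; the sample coefficients \(d_1 = 1\), \(d_2 = 0.5\), \(d_3 = 10\) and the interval \([-1,1]\) are arbitrary choices) estimates the center-Lipschitz constant at \(x^\star = 0\) and a grid lower bound for the full Lipschitz constant of \(F'\). The common normalizing factor \(|F'(0)^{-1}|\) cancels in the ratio \(L/L_\star\), so it is omitted.

```python
import math

# Sample coefficients (illustrative; the paper only requires d3 large
# and d2 small to make L / L_star large).
d1, d2, d3 = 1.0, 0.5, 10.0

def F(x):
    return -d2 * math.sin(1.0) + d1 * x + d2 * math.sin(math.exp(d3 * x))

def dF(x):
    # F'(x) = d1 + d2 * d3 * e^(d3 x) * cos(e^(d3 x))
    return d1 + d2 * d3 * math.exp(d3 * x) * math.cos(math.exp(d3 * x))

# Uniform grid on [-1, 1]; the zero is x* = 0.
n = 2000
xs = [-1.0 + 2.0 * i / n for i in range(n + 1)]

# Center-Lipschitz estimate at x* = 0: sup |F'(x) - F'(0)| / |x|.
Lstar_hat = max(abs(dF(x) - dF(0.0)) / abs(x) for x in xs if x != 0.0)

# Lower bound for the full Lipschitz constant from consecutive grid points.
L_hat = max(abs(dF(xs[i + 1]) - dF(xs[i])) / (xs[i + 1] - xs[i])
            for i in range(n))
L_hat = max(L_hat, Lstar_hat)  # the full constant dominates the center one

print(F(0.0))             # 0.0: x* = 0 is a zero of F
print(L_hat / Lstar_hat)  # large for these coefficients; grows with d3
```

Since \(F'\) oscillates ever faster as \(x\) grows (through the factor \(\mathrm{e}^{d_3 x} \cos \mathrm{e}^{d_3 x}\)), nearby points can have wildly different derivatives while the comparison against the single point \(x^\star = 0\) stays moderate, which is exactly what drives \(L/L_\star\) up.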

Example 3.2

Let \(\mathcal X= \mathcal Y= \mathbb {R} ^3\), \(\mathcal D= \overline{U} (0,1)\) and \(x^\star = (0,0,0)\). Define the function \(F\) on \(\mathcal D\) for \(w= (x,y,z)\) by

\begin{equation} \label{3.1-e} F(w)= ( {\rm e}^x - 1 , \displaystyle \tfrac {{\rm e}-1}{2} \, y^2 +y , z) . \end{equation}

Then, the Fréchet derivative of \(F\) is given by

\[ F'(w)= \begin{pmatrix} {\rm e}^x & 0 & 0 \\ 0 & ({\rm e}-1) \, y + 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} . \]

Notice that we have \(F(x^\star ) = 0\), \(F'( x^\star ) = F ' (x^\star ) ^{-1} = {\rm diag } \, \{ 1 , 1 , 1\} \) and \(L_{\star } = {\rm e}-1 {\lt} L= {\rm e}\).
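Because the Fréchet derivative above is diagonal, Newton’s method decouples componentwise, which makes this example easy to verify directly. The following Python sketch (our own check; the starting point \(w_0 = (0.5, 0.5, 0.5) \in \mathcal D\) is an arbitrary choice) confirms that \(F(x^\star) = 0\), that \(F'(x^\star)\) is the identity, and that the Newton iterates converge rapidly to \(x^\star = (0,0,0)\).

```python
import math

def F(w):
    # F(w) from Example 3.2, w = (x, y, z).
    x, y, z = w
    return (math.exp(x) - 1.0,
            (math.e - 1.0) / 2.0 * y * y + y,
            z)

def dF_diag(w):
    # Diagonal of the (diagonal) Frechet derivative F'(w).
    x, y, _ = w
    return (math.exp(x), (math.e - 1.0) * y + 1.0, 1.0)

print(F((0.0, 0.0, 0.0)))        # (0.0, 0.0, 0.0): F(x*) = 0
print(dF_diag((0.0, 0.0, 0.0)))  # (1.0, 1.0, 1.0): F'(x*) = diag{1,1,1}

# Newton's method w_{n+1} = w_n - F'(w_n)^{-1} F(w_n), componentwise here.
w = (0.5, 0.5, 0.5)
for _ in range(10):
    fw, dw = F(w), dF_diag(w)
    w = tuple(wi - fi / di for wi, fi, di in zip(w, fw, dw))

err = max(abs(wi) for wi in w)
print(err)  # distance to x*, far below 1e-8 after 10 steps
```

The third component is solved exactly in one step (its equation is linear), while the first two exhibit the usual quadratic contraction of Newton’s method near the zero.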

Bibliography

1. Absil, P.-A., Baker, C.G. and Gallivan, K.A., Trust-region methods on Riemannian manifolds, Found. Comput. Math., 7 (2007), pp. 303–330.

2. Adler, R.L., Dedieu, J.P., Margulies, J.Y., Martens, M. and Shub, M., Newton’s method on Riemannian manifolds and a geometric model for the human spine, IMA J. Numer. Anal., 22 (2002), pp. 359–390.

3. Alvarez, F., Bolte, J. and Munier, J., A unifying local convergence result for Newton’s method in Riemannian manifolds, Found. Comput. Math., 8 (2008), pp. 197–226.

4. Argyros, I.K., An improved unifying convergence analysis of Newton’s method in Riemannian manifolds, J. Appl. Math. Comput., 25 (2007), pp. 345–351.

5. Argyros, I.K., On the local convergence of Newton’s method on Lie groups, Pan Amer. Math. J., 17 (2007), pp. 101–109.

6. Argyros, I.K., A Kantorovich analysis of Newton’s method on Lie groups, J. Concrete Appl. Anal., 6 (2008), pp. 21–32.

7. Argyros, I.K., Convergence and Applications of Newton-type Iterations, Springer-Verlag, New York, 2008.

8. Argyros, I.K., Newton’s method in Riemannian manifolds, Rev. Anal. Numér. Théor. Approx., 37 (2008), pp. 119–125.

9. Argyros, I.K., Newton’s method on Lie groups, J. Appl. Math. Comput., 31 (2009), pp. 217–228.

10. Argyros, I.K., A semilocal convergence analysis for directional Newton methods, Math. Comp., 80 (2011), pp. 327–343.

11. Argyros, I.K., Cho, Y.J. and Hilout, S., Numerical methods for equations and its applications, CRC Press/Taylor and Francis Group, New York, 2012.

12. Argyros, I.K. and Hilout, S., Newton’s method for approximating zeros of vector fields on Riemannian manifolds, J. Appl. Math. Comput., 29 (2009), pp. 417–427.

13. Argyros, I.K. and Hilout, S., Extending the Newton-Kantorovich hypothesis for solving equations, J. Comput. Appl. Math., 234 (2010), pp. 2993–3006.

14. Argyros, I.K. and Hilout, S., Weaker conditions for the convergence of Newton’s method, J. Complexity, 28 (2012), pp. 364–387.

15. Argyros, I.K. and Hilout, S., Extending the applicability of Newton’s method on Lie groups, Appl. Math. Comput., to appear.

16. Argyros, I.K. and Hilout, S., Numerical methods in nonlinear analysis, World Scientific Publ., ISBN-13: 978-9814405829, to appear.

17. Bayer, D.A. and Lagarias, J.C., The nonlinear geometry of linear programming. I. Affine and projective scaling trajectories, Trans. Amer. Math. Soc., 314 (1989), pp. 499–526.

18. Brockett, R.W., Differential geometry and the design of gradient algorithms, Differential geometry: partial differential equations on manifolds (Los Angeles, CA, 1990), pp. 69–92, Proc. Sympos. Pure Math., 54, Part 1, Amer. Math. Soc., Providence, RI, 1993.

19. Chu, M.T. and Driessel, K.R., The projected gradient method for least squares matrix approximations with spectral constraints, SIAM J. Numer. Anal., 27 (1990), pp. 1050–1060.

20. Dedieu, J.P., Priouret, P. and Malajovich, G., Newton’s method on Riemannian manifolds: Covariant \(\alpha \) theory, IMA J. Numer. Anal., 23 (2003), pp. 395–419.

21. Do Carmo, M.P., Riemannian Geometry, Birkhäuser, Boston, 1992.

22. Edelman, A., Arias, T.A. and Smith, S.T., The geometry of algorithms with orthogonality constraints, SIAM J. Matrix Anal. Appl., 20 (1999), pp. 303–353.

23. Ezquerro, J.A. and Hernández, M.A., Generalized differentiability conditions for Newton’s method, IMA J. Numer. Anal., 22 (2002), pp. 187–205.

24. Ezquerro, J.A. and Hernández, M.A., On an application of Newton’s method to nonlinear operators with \(\omega \)-conditioned second derivative, BIT, 42 (2002), pp. 519–530.

25. Ferreira, O.P. and Svaiter, B.F., Kantorovich’s theorem on Newton’s method in Riemannian manifolds, J. Complexity, 18 (2002), pp. 304–329.

26. Gabay, D., Minimizing a differentiable function over a differential manifold, J. Optim. Theory Appl., 37 (1982), pp. 177–219.

27. Gutiérrez, J.M. and Hernández, M.A., Newton’s method under weak Kantorovich conditions, IMA J. Numer. Anal., 20 (2000), pp. 521–532.

28. Hall, B.C., Lie groups, Lie algebras and representations. An elementary introduction, Graduate Texts in Mathematics, 222, Springer-Verlag, New York, 2003.

29. Helgason, S., Differential geometry, Lie groups and symmetric spaces, Pure and Applied Mathematics, 80, Academic Press, New York–London, 1978.

30. Helmke, U. and Moore, J.B., Singular-value decomposition via gradient and self-equivalent flows, Linear Algebra Appl., 169 (1992), pp. 223–248.

31. Helmke, U. and Moore, J.B., Optimization and dynamical systems. With a foreword by R. Brockett, Communications and Control Engineering Series, Springer-Verlag London, Ltd., London, 1994.

32. Kantorovich, L.V. and Akilov, G.P., Functional analysis in normed spaces, Pergamon Press, New York, 1982.

33. Li, C., López, G. and Martín-Márquez, V., Monotone vector fields and the proximal point algorithm on Hadamard manifolds, J. Lond. Math. Soc. (2), 79 (2009), pp. 663–683.

34. Li, C. and Wang, J.H., Convergence of the Newton method and uniqueness of zeros of vector fields on Riemannian manifolds, Sci. China Ser. A, 48 (2005), pp. 1465–1478.

35. Li, C. and Wang, J.H., Newton’s method on Riemannian manifolds: Smale’s point estimate theory under the \(\gamma \)-condition, IMA J. Numer. Anal., 26 (2006), pp. 228–251.

36. Li, C., Wang, J.H. and Dedieu, J.P., Smale’s point estimate theory for Newton’s method on Lie groups, J. Complexity, 25 (2009), pp. 128–151.

37. Proinov, P.D., General local convergence theory for a class of iterative processes and its applications to Newton’s method, J. Complexity, 25 (2009), pp. 38–62.

38. Luenberger, D.G., The gradient projection method along geodesics, Management Sci., 18 (1972), pp. 620–631.

39. Mahony, R.E., Optimization algorithms on homogeneous spaces: with applications in linear systems theory, Ph.D. Thesis, Department of Systems Engineering, University of Canberra, Australia, 1994.

40. Mahony, R.E., The constrained Newton method on a Lie group and the symmetric eigenvalue problem, Linear Algebra Appl., 248 (1996), pp. 67–89.

41. Mahony, R.E., Helmke, U. and Moore, J.B., Pole placement algorithms for symmetric realisations, Proceedings of the 32nd IEEE Conference on Decision and Control, San Antonio, TX, 1993, pp. 1355–1358.

42. Moore, J.B., Mahony, R.E. and Helmke, U., Numerical gradient algorithms for eigenvalue and singular value calculations, SIAM J. Matrix Anal. Appl., 15 (1994), pp. 881–902.

43. Owren, B. and Welfert, B., The Newton iteration on Lie groups, BIT, 40 (2000), pp. 121–145.

44. Proinov, P.D., New general convergence theory for iterative processes and its applications to Newton-Kantorovich type theorems, J. Complexity, 26 (2010), pp. 3–42.

45. Rheinboldt, W.C., An adaptive continuation process for solving systems of nonlinear equations, Banach Center Publ., 3 (1977), pp. 129–142.

46. Shub, M. and Smale, S., Complexity of Bézout’s theorem. I. Geometric aspects, J. Amer. Math. Soc., 6 (1993), pp. 459–501.

47. Smale, S., Newton’s method estimates from data at one point, The merging of disciplines: New directions in Pure, Applied and Computational Mathematics, R. Ewing, K. Gross and C. Martin eds, Springer-Verlag, New York, 1986, pp. 185–196.

48. Smith, S.T., Dynamical systems that perform the singular value decomposition, Systems Control Lett., 16 (1991), pp. 319–327.

49. Smith, S.T., Geometric optimization methods for adaptive filtering, Ph.D. Thesis, Harvard University, ProQuest LLC, Ann Arbor, MI, 1993.

50. Smith, S.T., Optimization techniques on Riemannian manifolds, Hamiltonian and gradient flows, algorithms and control, Fields Inst. Commun., 3, Amer. Math. Soc., Providence, RI, 1994, pp. 113–136.

51. Traub, J.F. and Woźniakowski, H., Convergence and complexity of Newton iteration for operator equations, J. Assoc. Comput. Mach., 26 (1979), pp. 250–258.

52. Udrişte, C., Convex functions and optimization methods on Riemannian manifolds, Mathematics and its Applications, 297, Kluwer Academic Publishers, Dordrecht, 1994.

53. Varadarajan, V.S., Lie groups, Lie algebras and their representations, Reprint of the 1974 edition, Graduate Texts in Mathematics, 102, Springer-Verlag, New York, 1984.

54. Wang, J.H. and Li, C., Uniqueness of the singular points of vector fields on Riemannian manifolds under the \(\gamma \)-condition, J. Complexity, 22 (2006), pp. 533–548.

55. Wang, J.H. and Li, C., Kantorovich’s theorem for Newton’s method on Lie groups, J. Zhejiang Univ. Sci. A, 8 (2007), pp. 978–986.

56. Wang, J.H. and Li, C., Kantorovich’s theorems for Newton’s method for mappings and optimization problems on Lie groups, IMA J. Numer. Anal., 31 (2011), pp. 322–347.

57. Wang, X.H., Convergence of Newton’s method and uniqueness of the solution of equations in Banach space, IMA J. Numer. Anal., 20 (2000), pp. 123–134.

58. Zabrejko, P.P. and Nguen, D.F., The majorant method in the theory of Newton-Kantorovich approximations and the Pták error estimates, Numer. Funct. Anal. Optim., 9 (1987), pp. 671–684.