Enlarging the convergence ball
of Newton’s method on Lie groups
February 16, 2013.
\(^\ast \)Department of Mathematical Sciences, Cameron University, Lawton, OK 73505, USA, e-mail: ioannisa@cameron.edu.
\(^\S \)Poitiers University, Laboratoire de Mathématiques et Applications, Bd. Pierre et Marie Curie, Téléport 2, B.P. 30179, 86962 Futuroscope Chasseneuil Cedex, France, e-mail: said.hilout@math.univ-poitiers.fr.
Dedicated to Prof. I. Păvăloiu on the occasion of his 75th anniversary
We present a local convergence analysis of Newton’s method for approximating a zero of a mapping from a Lie group into its Lie algebra. Using more precise estimates than in earlier studies [55, 56], and under the same computational cost, we obtain a larger convergence ball and more precise error bounds on the distances involved. Some examples are presented to further validate the theoretical results.
MSC. 65G99, 65J15, 47H17, 49M15, 90C90.
Keywords. Newton’s method, Lie group, Lie algebra, Riemannian manifold, convergence ball, Kantorovich hypothesis.
1 Introduction
In this paper, we are concerned with the problem of approximating a zero \( x^\star \) of a \(\mathcal{C}^1\)-mapping \(F \, : \, G \longrightarrow Q\), where \(G\) is a Lie group and \(Q\) is the Lie algebra of \(G\), that is, the tangent space \(T_e \, G\) of \(G\) at the identity \(e\), equipped with the Lie bracket \([\cdot ,\cdot ] \, : \, Q\times Q \longrightarrow Q\) [7, 21, 28, 29, 53].
The study of numerical algorithms on manifolds for solving eigenvalue or optimization problems on Lie groups [1]–[12], [33]–[36], [54]–[57] is very important in Computational Mathematics. Newton-type methods are the most popular iterative procedures for solving equations involving differentiable operators. The convergence analysis of Newton’s method is usually of two types: semilocal and local. The semilocal analysis uses information around an initial point to give criteria ensuring the convergence of Newton’s method, while the local analysis uses information around a solution to estimate the radii of convergence balls. There is a plethora of studies on weakening and/or extending the hypotheses made on the underlying operators; see, for example, [7, 11, 16] and the references therein. Local as well as semilocal convergence results for Newton-type methods have been given by several authors under various conditions [2]–[58]. Recently, in [15], we presented a finer semilocal convergence analysis for Newton’s method than in earlier studies [7, 32, 34, 36, 54, 55, 56].
Newton’s method with initial point \(x_0 \in G\) was first introduced by Owren and Welfert [43] in the form
\[ x_{n+1} = x_n \cdot {\rm exp} \big( - dF_{x_n}^{-1} \, F(x_n) \big), \qquad n = 0, 1, 2, \ldots
\]
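For matrix Lie groups this iteration admits a direct numerical realization. The following minimal sketch is only illustrative and is not part of the analysis below: the finite-difference approximation of \(dF_{x_n}\), the use of a Frobenius-orthonormal basis of \(Q\) and all function names are our own choices.
\begin{verbatim}
import numpy as np
from scipy.linalg import expm


def newton_lie(F, x0, basis, tol=1e-12, max_iter=20, h=1e-7):
    # Newton's method x_{n+1} = x_n * exp(-dF_{x_n}^{-1} F(x_n)) on a matrix
    # Lie group; `basis` is a Frobenius-orthonormal basis of the Lie algebra Q
    # and dF_{x_n} is replaced by a finite-difference approximation.
    x = x0
    dim = len(basis)
    for _ in range(max_iter):
        Fx = F(x)                                        # an element of Q
        if np.linalg.norm(Fx) < tol:
            break
        b = np.array([np.sum(Fx * E) for E in basis])    # coordinates of F(x)
        A = np.zeros((dim, dim))
        for i, E in enumerate(basis):
            dFi = (F(x @ expm(h * E)) - Fx) / h          # approximates dF_x(E_i)
            A[:, i] = [np.sum(dFi * Ej) for Ej in basis]
        u = sum(c * E for c, E in zip(np.linalg.solve(A, -b), basis))
        x = x @ expm(u)                                  # update through exp
    return x
\end{verbatim}
For instance, one may take \(G = SO(3)\) and, for the basis, a Frobenius-orthonormal basis of the \(3 \times 3\) skew-symmetric matrices.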
Newton’s method is undoubtedly the most popular method for generating a sequence \(\{ x_n \} \) approximating \(x^\star \) [ 7 , 11 , 15 , 16 , 32 , 34 , 36 ] , [ 54 ] – [ 56 ] . In the present paper we establish a finer local convergence analysis with the advantages (\(\mathcal{A}_l\)):
Larger convergence ball;
and
Tighter error bounds on the distances involved.
The necessary background on Lie groups can be found in [ 7 , 11 , 15 , 16 ] and the references therein. The paper is organized as follows. In Section 2, we present the local convergence analysis of Newton’s method. Finally, numerical examples are given in the concluding Section 3.
2 Local convergence analysis for Newton’s method
We shall study the local convergence of Newton’s method. In the rest of the paper, \(\langle \cdot , \cdot \rangle \) denotes the inner product on \(Q\) and \(\parallel \cdot \parallel \) the corresponding norm. As in [56], we define a distance on \(G\) for \(x, y \in G\) as follows:
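A definition consistent with the usage below and, presumably, with [56], is
\[ m(x,y) := \inf \Big\{ \textstyle \sum \limits _{i=1}^{k} \parallel z_i \parallel \; : \; k \geq 1, \; z_1, \ldots , z_k \in Q, \; y = x \cdot {\rm exp} \, z_1 \cdots {\rm exp} \, z_k \Big\} .
\]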
By convention, \(\inf \, \emptyset = +\infty \). It is easy to see that \(m(\cdot , \cdot )\) is a distance on \(G\) and that the topology it induces is equivalent to the original one on \(G\). Let \(w \in G\) and \(r {\gt} 0\); we denote by
\[ U(w,r) = \{ \, y \in G \; : \; m(w,y) {\lt} r \, \}
\]
the open ball centered at \(w\) and of radius \(r\). Moreover, we denote the closure of \(U(w,r)\) by \(\overline{U} (w, r) \). Let also \(L(Q)\) denote the set of all linear operators on \(Q\).
We need the definition of Lipschitz continuity for a mapping.
Let \(M \, : \, G \longrightarrow L (Q)\), \(x^\star \in G\) and \(r{\gt}0\). We say that \(M\) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , r)\) if
holds for any \(u \in Q\) and \(x \in U (x^\star ,r)\) such that \(\parallel u \parallel + m (x , x^\star ) {\lt} r\) .
holds for any \(u\in Q\) and \(x \in U (x^\star ,r)\) such that \(\parallel u\parallel + m(x, x^\star ) {\lt} r\).
We then say \(M\) satisfies the \(L_{\star }\)-center Lipschitz condition at \(x^\star \in G\) on \(U(x^\star ,r)\). Note that if \(M\) satisfies the \(L\)-Lipschitz condition on \(U(x^\star ,r)\), then it also satisfies the \(L _{\star }\)-center Lipschitz condition at \(x^\star \in G\) on \(U (x^\star ,r)\). Clearly,
\(L_{\star } \leq L\) holds in general and \({L}/{L_{\star }}\) can be arbitrarily large [4]–[16].
Let us show that (2.3) indeed implies (2.4). If (2.3) holds, then for \(x=x^\star \) there exists \(K \in (0,L]\) such that
There exist \(k\geq 1\) and \(z_0, z_1, \cdots , z_k \in Q\) such that \(x=x^\star \cdot \exp \, z_0 \cdots \exp \, z_k \). Let \(y_0=x^\star \), \(y_{i+1} = y_i \cdot \exp \, z_i\), \(i=0,1, \cdots , k\). Then, we have that \(x=y_{k+1} \). We can write the identity
Using (2.3) and the triangle inequality, we obtain in turn that
which implies (2.4).
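In detail, the omitted identity and estimate are presumably
\[ M(x) - M(x^\star ) = \sum _{i=0}^{k} \big( M(y_{i+1}) - M(y_i) \big), \qquad \parallel M(x) - M(x^\star ) \parallel \leq L \sum _{i=1}^{k} \parallel z_i \parallel + K \, \parallel z_0 \parallel \leq L \sum _{i=0}^{k} \parallel z_i \parallel ;
\]
the same telescoping argument is displayed in the proof of Lemma 2.10 below.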
We need a Banach type lemma on invertible mappings.
Suppose that \(dF_{x^\star }^{-1}\) exists and let \(0 {\lt} r \leq \tfrac {1}{L_{\star }}\). Suppose \(dF_{x^\star }^{-1} \, dF\) satisfies the \(L_{\star }\)-center Lipschitz condition at \(x^\star \in G\) on \(U (x^\star ,r)\); for \(x \in U (x^\star ,r)\), there exist \(k \geq 1\) and \(z_0, z_1, \cdots , z_k \in Q\), such that \(x= x^\star \cdot {\rm exp} \, z_0 \cdots {\rm exp} \, z_k\) and \(\textstyle \sum \limits _{i=0}^{k} \parallel z_i \parallel {\lt} r\). Then, linear mapping \(dF_{x}^{-1}\) exists and
It follows from (2.7) and the Banach lemma on invertible operators [ 7 ] , [ 32 ] that \(dF_x^{-1}\) exists and (2.6) holds. That completes the proof of Lemma 2.2.
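In particular, the omitted estimate (2.6) can be expected to be of the same form as (2.20) below, namely
\[ \parallel dF_{x}^{-1} \, dF_{x^\star } \parallel \leq \bigg( 1 - L_{\star } \, \sum _{i=0}^{k} \parallel z_i \parallel \bigg)^{-1} .
\]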
Next, we study the convergence domain of Newton’s method around a zero \(x^\star \) of the mapping \(F\). First, from Lemma 2.3 through Corollary 2.9, we present the local results for Newton’s method when \(G\) is an Abelian group. Then, the corresponding local results follow when \(G\) is not necessarily an Abelian group.
Let \(G\) be an Abelian group and \(0{\lt} r \leq \tfrac {1}{L_{\star }}\). Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1}\) exists and let \( L_{\star }{\gt}0\); there exist \(j \geq 1\) and \(z_1, z_2, \cdots , z_j \in Q\) such that
\(dF_{x^\star }^{-1} \, dF\) satisfies the \(L_{\star }\)-center Lipschitz condition at \( x^\star \) on \(U (x^\star ,r )\). Then, linear mapping \(dF _{x_0} ^{-1}\) exists and
for all \(u \in Q\), \( x \in U (x^\star , r)\) such that \(\parallel u \parallel + m(x, x^\star ) {\lt} r\). Let \(z=z_1+z_2+ \cdots + z_j\). Then, since \(G\) is an Abelian group, \({\rm exp} \, z_1 \cdot {\rm exp} \, z_2 \cdots {\rm exp} \, z_j = {\rm exp} \, (z_1+z_2+ \cdots +z_j) = {\rm exp} \, z\), so, we can write \(x_0 = x^\star \cdot {\rm exp} \, z\). It then follows from Lemma 2.2 that \(dF_{x_0}^{-1}\) exists and
We also have the following identity
In view of (2.10) and (2.12), we get that
Moreover, by (2.11) and (2.13), we obtain that
That completes the proof of Lemma 2.3.
The proof of Lemma 2.3 reduces to [56, Lemma 3.2] if \(L_{\star } = L\). Otherwise (i.e., if \( L_{\star } {\lt} L\)), it constitutes an improvement. We have also included the proof, although it is similar to the corresponding one in [56], because it is not straightforward to see that \(L_{\star }\) can replace \(L\) in the derivation of the crucial estimate (2.9). Note also that (2.10) can hold for some \(L_{\star } ^1 \in (0, L_{\star } ]\). If \(L_{\star } ^1 {\lt} L_{\star } \), then, according to the proof of Lemma 2.3, \(L_{\star } ^1\) can replace \(L_{\star } \) in (2.9).
Then, it can easily be seen that
We have the following local convergence result for Newton’s method.
Let \(G\) be an Abelian group. Let \(0 {\lt} r \leq \displaystyle \tfrac {\alpha }{4 \, L}\), where \(\alpha \) is given in (2.14). Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Suppose there exists \(x^\star \in G\) such that \(F (x^\star )=0\), \(dF_{x^\star }^{-1}\) exists and \(dF_{x^\star }^{-1} \, dF\) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {3 \, r}{1 - L_{\star } \, r} )\). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \in U (x^\star , r)\) is well defined, remains in \(U (x^\star , \displaystyle \tfrac {3 \, r}{1 - L_{\star } \, r} )\) for all \(n\geq 0\) and converges to a zero \(y^\star \) of mapping \(F\) such that
Set
We shall show that the mapping \(dF_{x_0}^{-1} \, dF\) satisfies the \(\overline{L} \)-Lipschitz condition on \(U (x_0 , \overline{r} )\). Indeed, let \(x \in U (x_0 , \overline{r} )\) and \(z \in Q\) be such that \(\parallel z \parallel + m (x_0 , x) {\lt} \overline{r}\). Then, we get that
Using Lemma 2.2, we have that
Set
Then, by (2.16), we obtain that
and
by the choice of \(r\) and \(\alpha \). By standard majorization techniques [ 7 ] , \(\{ x_k \} \) converges to some zero \(y^\star \) of mapping \(F\) and \(m(x_0 , y^\star ) \leq \overline{r_1}\). Furthermore, we obtain that
That completes the proof of Theorem 2.5.
The proofs of the remaining results in this section are omitted, since they can be obtained from the corresponding ones in [56].
Let \(G\) be an Abelian group. Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {\alpha } {4\, L } )\). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \) is well defined, remains in \(U (x^\star , \displaystyle \tfrac {\alpha } {4 \, L} )\) for all \(n\geq 0\) and converges to a zero \(y^\star \) of mapping \(F\) such that \( m( x^\star , y^\star ) {\lt} \displaystyle \tfrac {1}{L_{\star } } . \)
Let \(G\) be an Abelian group. Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {1} {L_{\star } } )\). Let \(\sigma \) be the maximal number such that \(U (e, \sigma ) \subseteq {\rm exp} \, (B (0 , \displaystyle \tfrac {1}{L_{\star }} ))\). Set \( r = \min {\Big\{ } \displaystyle \tfrac {\sigma }{3 + L_{\star } \, \sigma }, \, \displaystyle \tfrac {\alpha }{4 \, L}{\Big\} } \) and \( N ( x^\star , r) = x^\star \, {\rm exp} (B (0, r)). \) Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \in N (x^\star ,r) \) converges to \(x^\star \).
Let \(G\) be a compact connected Abelian Lie group equipped with a bi-invariant Riemannian metric and let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\) and \(0 {\lt} r {\lt} \displaystyle \tfrac {\alpha }{4 \, L}\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {3 \, r}{1- L_{\star } \, r } )\). Let also \(x_0 \in U (x^\star , r )\). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \), remains in \(\overline{U} (x^\star , \displaystyle \tfrac {3 \, r}{1- L_{\star } \, r } )\) for all \(n \geq 0\) and converges to \(x^\star \).
Let \(G\) be a compact connected Abelian Lie group equipped with a bi-invariant Riemannian metric and let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {\alpha }{4 \, L} )\). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \), remains in \(\overline{U} (x^\star , \displaystyle \tfrac {\alpha }{4 \, L } )\) for all \(n \geq 0\) and converges to \(x^\star \).
Let \(0{\lt} r \leq \displaystyle \tfrac {1}{L}\). Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1}\) exists; there exist \(j\geq 1\) and \(z_1, z_2, \cdots , z_j \in Q\) such that \( x_0 = x^\star \cdot {\rm exp} \, z_1 \cdots {\rm exp} \, z_j \) for \( \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel {\lt} r \) and \(dF_{x^\star }^{-1} \, dF\) satisfies the \(L\)-Lipschitz condition at \( x^\star \) on \(U (x^\star , r )\). Then, the following assertions hold
- \[ \parallel dF_{x^\star }^{-1} \, (dF_{x^\star \cdot {\rm exp} \, z }- dF_{x^\star } ) \parallel \leq K_{\star } \, \parallel z \parallel \]
for each \(z\in Q\) with \(\parallel z \parallel {\lt} r\) and some \(K_{\star } \in (0, L]\).
- \[ \parallel dF_{x^\star }^{-1} \, (dF_{x_0} - dF_{x^\star } ) \parallel \leq L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z _i \parallel \quad \text{for some} \quad L_\star \in (0, L]. \]
Linear mapping \(dF_{x_0}^{-1}\) exists and
\[ \parallel dF _{x_0} ^{-1} \, F(x_0) \parallel \leq \displaystyle \tfrac {\bigg(2 + L \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg) \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel }{2 \, \bigg(1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg)} . \]
This assertion follows from the hypothesis that \(dF_{x^\star }^{-1} \, dF\) satisfies the \(L\)-Lipschitz condition at \( x^\star \) on \(U (x^\star , r )\).
Let \(w_0= x^\star \), \(w_{i+1} = w_i \cdot {\rm exp} \, z_{i+1}\), \(i=0,1, \cdots , j-1\). Then, we have \(w_j = w_{j-1} \cdot {\rm exp} \, z_j= x_0\). Using the \(L\)-Lipschitz condition, we get in turn that
\[ \begin{array}{l} \parallel dF_{x^\star } ^{-1} \, (dF_{w_j} - dF_{x^\star } ) \parallel \\ \leq \parallel dF_{x^\star } ^{-1} \, (dF_{w_{j-1} \cdot {\rm exp} \, z_j} - dF_{w_{j-1}} ) \parallel + \cdots + \parallel dF_{x^\star } ^{-1} \, (dF_{w_{1} \cdot {\rm exp} \, z_2} - dF_{w_{1}} ) \parallel + \\ \parallel dF_{x^\star } ^{-1} \, (dF_{x^\star \cdot {\rm exp} \, z_1} - dF_{x^\star } ) \parallel \\ \leq L (\parallel z_j \parallel + \cdots + \parallel z_2 \parallel ) + K_{\star } \, \parallel z_1 \parallel \\ = L \, (\parallel z_j \parallel + \cdots + \parallel z_1 \parallel ) + (K_{\star } - L) \, \parallel z_1 \parallel \leq L \, (\parallel z_j \parallel + \cdots + \parallel z_1 \parallel ) \end{array} \]which shows (ii).
We have by (ii) that
\[ \parallel dF_{x^\star } ^{-1} \, (dF_{x_0} - dF_{x^\star } ) \parallel \leq L _{\star }\, (\parallel z_j \parallel + \cdots + \parallel z_1 \parallel ) {\lt} L_{\star } \, r \leq L\, r \leq 1 . \]
Hence, \(dF_{x_0}^{-1}\) exists and
\begin{equation} \label{3.31} \parallel dF _{x_0} ^{-1} \, dF_{x^\star } \parallel \leq \bigg(1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg) ^{-1} . \tag{2.20}\end{equation}
We have that
\[ \begin{array}{l} dF _{x^\star } ^{-1} \, (F(w_i) - F(w_{i-1})) = dF _{x^\star } ^{-1} \, \displaystyle \int _0 ^1 dF _{w_{i-1} \cdot {\rm exp} \, (t\, z_i)} \, z_i \, dt \\ = \displaystyle \int _0 ^1 dF _{x^\star } ^{-1} \, (dF _{w_{i-1} \cdot {\rm exp} \, (t\, z_i) } - dF_{x^\star } ) \, z _i \, dt + z_i. \end{array} \]Hence, we get that
\[ \parallel dF _{x^\star } ^{-1} \, (dF _{w_{k-1} \cdot {\rm exp} \, (z_k) } - dF_{w_{k-1} } ) \parallel \leq L\, \parallel z_k \parallel \quad {\rm for \, \, each} \quad k=1,2, \cdots , j . \]
Therefore, we obtain that
\[ \begin{array}{l} \| dF _{x^\star } ^{-1} \, (dF _{w_{i-1} \cdot {\rm exp} \, (t\, z_i) } - dF_{x^\star } ) \| \\ \leq \displaystyle \textstyle \sum \limits _{k=1}^{i-1} \| dF _{x^\star } ^{-1} (dF _{w_{k-1} \cdot {\rm exp} ( z_k) } - dF_{ w_{k-1} } ) \| + \| dF _{x^\star } ^{-1} (dF _{w_{i-1} \cdot {\rm exp} (t \, z_i) } - dF_{ w_{i-1} } ) \| \\ \leq L \displaystyle \textstyle \sum \limits _{k=1}^{i-1} \| z_k \| + L \, t \, \| z_i \| . \end{array} \]
We have that \(F(x_0) =\displaystyle \textstyle \sum \limits _{i=1}^{j} (F(w_i) - F(w_{i-1} )) \), since \(F(w_0) = F(x^\star ) = 0\). That is, we can get
\begin{equation} \label{3.32} \begin{array}{l} \parallel dF _{x^\star } ^{-1} \, F(x_0) \parallel \\ \leq \displaystyle \textstyle \sum \limits _{i=1}^{j} {\bigg(} \displaystyle \int _0 ^1 \parallel dF _{x^\star } ^{-1} \, (dF _{w_{i-1} \cdot {\rm exp} \, (t\, z_i) } - dF_{ x^\star } ) \parallel \, \parallel z_i \parallel \, dt + \parallel z_i \parallel {\bigg)} \\ \leq \displaystyle \textstyle \sum \limits _{i=1}^{j} {\bigg(} \displaystyle \int _0 ^1 L\, {\bigg(} \displaystyle \textstyle \sum \limits _{k=1}^{i-1} \parallel z_k \parallel + t\, \parallel z_i \parallel {\bigg)} \, \parallel z_i \parallel \, dt + \parallel z_i \parallel {\bigg)} \\ \leq {\bigg(} \displaystyle \tfrac {L}{2} \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel +1 {\bigg)} \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel . \end{array} \tag{2.21}\end{equation}
The result now follows from (2.20), (2.21) and
\[ \parallel dF _{x_0} ^{-1} \, F (x_0) \parallel \leq \parallel dF _{x_0} ^{-1} \, dF _{x^\star } \parallel \, \parallel dF _{x^\star }^{-1} \, F (x_0) \parallel \leq \displaystyle \tfrac {\bigg(2 + L \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg) \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel }{2 \, \bigg(1- L_{\star } \, \displaystyle \textstyle \sum \limits _{i=1}^{j} \parallel z_i \parallel \bigg)} . \]
The proof of Lemma 2.10 is complete.
Let us define parameter \(\delta \) by
We have that
Let \(0 {\lt} r \leq \displaystyle \tfrac {\delta }{4 \, L}\), where \(\delta \) is given in (2.22). Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Suppose there exists \(x^\star \in G\) such that \(F (x^\star )=0\), \(dF_{x^\star }^{-1}\) exists and \(dF_{x^\star }^{-1} \, dF\) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , R= \displaystyle \tfrac {3 + (L- L_{\star } ) \, r}{1 - L_{\star } \, r} \, r )\). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \in U (x^\star , r)\) is well defined, remains in \(U (x^\star ,R )\) for all \(n\geq 0\) and converges to a zero \(y^\star \) of mapping \(F\) such that \(m( x^\star , y^\star ) {\lt} R\).
and \(\eta \) satisfies the new condition
The proof of Theorem 2.11 is complete.
Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {\delta } {4\, L } )\). Then, the conclusions of Corollary 2.6 hold.
Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {1} {L_{\star } } )\). Let \(\sigma \) and \(N(x^\star , r)\) be as in Corollary 2.7. Set \( r = \min {\Big\{ } r_0 , \, r_1 , \, \displaystyle \tfrac {\delta }{4 \, L} {\Big\} } \), where
and
Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \in N (x^\star ,r) \) converges to \(x^\star \).
Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping, where \(G\) is a compact connected Lie group equipped with a bi-invariant Riemannian metric. Let \(x^\star \in G\) be a zero of mapping \(F\) and \(0 {\lt} r {\lt} \displaystyle \tfrac {\delta }{4 \, L}\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , R )\), where \(R \) is defined in Theorem 2.11. Let also \(x_0 \in U (x^\star , r )\). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \), remains in \(\overline{U} (x^\star , R )\) for all \(n \geq 0\) and converges to \(x^\star \).
Let \(F \, : \, G \longrightarrow Q\) be a \(\mathcal{C}^1 \)-mapping, where \(G\) is a compact connected Lie group equipped with a bi-invariant Riemannian metric. Let \(x^\star \in G\) be a zero of mapping \(F\). Suppose \(dF_{x^\star }^{-1} \) exists and \(dF_{x^\star }^{-1} \, dF \) satisfies the \(L\)-Lipschitz condition on \(U (x^\star , \displaystyle \tfrac {1}{L_{\star } } )\). Let also \(x_0 \in U (x^\star , \displaystyle \tfrac {\delta }{4 \, L} )\). Then, sequence \(\{ x _n\} \) (\(n \geq 0\)) generated by Newton’s method starting at \(x_0 \), remains in \(\overline{U} (x^\star , \displaystyle \tfrac {\delta }{4 \, L } )\) for all \(n \geq 0\) and converges to \(x^\star \).
(a) The local results reduce to the corresponding ones in [ 56 ] if \(L = L_{\star }\). Otherwise (i.e., if \(L_{\star } {\lt} L\)), they constitute an improvement under the same computational cost with advantages as already stated in the Introduction of this study. Note also that \(\alpha {\gt}1\), \(\delta {\gt}1\) if \(L_{\star } {\lt}L\) and \(\alpha \longrightarrow \infty \), \(\delta \longrightarrow \infty \) if \(\displaystyle \tfrac {L}{L_{\star }}\longrightarrow \infty \).
(b) The local results in the case where \(G\) is an Abelian group are obtained under weaker hypotheses and provide tighter estimates than the ones obtained when \(G\) is not necessarily an Abelian group. We have, for example, that \(\overline{r} {\lt} \overline{\overline{r}} \), \(\delta {\lt} \alpha \), \(\displaystyle \tfrac {3 \, r}{1- L_{\star } \, r} {\lt} R\) and the upper bound on \(\eta \) is smaller if \(L_{\star } {\lt} L\).
(c) It is obvious that finer results can immediately be obtained if conditions similar to those of the semilocal case (see [15, Section 1], with \(L_{\star }\) instead of \(L_0\)) are used instead of the Kantorovich condition for \(L_{\star } {\lt} L\). However, we decided to leave this part of the analysis to the motivated reader. We refer the reader to [14] for such results involving nonlinear equations in a Banach space setting. We also refer the reader to [7, 11, 15, 16] for examples.
3 Numerical examples
In this Section we present two numerical examples in the more general setting of a nonlinear equation on a Hilbert space where \(L_\star {\lt} L\).
Let \(\mathcal X= \mathcal Y= \mathbb {R}\), \(x^\star =0\). Define \(F\) by \(F(x)=-d_2 \sin 1 + d_1 \, x + d_2 \, \sin {\rm e}^{d_3 \, x} \), where \(d_1\), \(d_2\), \(d_3\) are given real numbers. We have that \(F(x^\star ) =0\). Moreover, if \(d_3\) is sufficiently large and \(d_2\) sufficiently small, \(L/ L_\star \) can be arbitrarily large.
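A direct computation gives
\[ F'(x) = d_1 + d_2 \, d_3 \, {\rm e}^{d_3 \, x} \, \cos \big( {\rm e}^{d_3 \, x} \big) ,
\]
so the variation of \(F'\) away from \(x^\star = 0\) is driven by the factor \(d_2 \, d_3 \, {\rm e}^{d_3 \, x}\). The following short script is only illustrative: the parameter values and the starting point are our own choices, not taken from the paper.
\begin{verbatim}
import math

d1, d2, d3 = 1.0, 0.1, 5.0    # illustrative values (assumptions)
F  = lambda x: -d2*math.sin(1.0) + d1*x + d2*math.sin(math.exp(d3*x))
dF = lambda x: d1 + d2*d3*math.exp(d3*x)*math.cos(math.exp(d3*x))

print(F(0.0))                 # the zero x* = 0
x = 0.1                       # starting point near x*
for n in range(6):
    x = x - F(x)/dF(x)        # classical Newton step
    print(n, x)
\end{verbatim}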
Let \(\mathcal X= \mathcal Y= \mathbb {R} ^3\), \(\mathcal D= \overline{U} (0,1)\) and \(x^\star = (0,0,0)\). Define function \(F\) on \(\mathcal D\) for \(w= (x,y,z)\) by
Then, the Fréchet derivative of \(F\) is given by
Notice that we have \(F(x^\star ) = 0\), \(F'( x^\star ) = F ' (x^\star ) ^{-1} = {\rm diag } \, \{ 1 , 1 , 1\} \) and \(L_{\star } = {\rm e}-1 {\lt} L= {\rm e}\).
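For the reader’s convenience, a choice of \(F\) consistent with all of the stated values is, presumably,
\[ F(w) = \Big( {\rm e}^{x} - 1, \; \tfrac {{\rm e}-1}{2} \, y^2 + y, \; z \Big)^{T}, \qquad F'(w) = \left( \begin{array}{ccc} {\rm e}^{x} & 0 & 0 \\ 0 & ({\rm e}-1)\, y + 1 & 0 \\ 0 & 0 & 1 \end{array} \right),
\]
which indeed satisfies \(F(x^\star ) = 0\), \(F'(x^\star ) = {\rm diag} \, \{ 1, 1, 1 \}\) and, on \(\mathcal{D} = \overline{U}(0,1)\), yields the values \(L_{\star } = {\rm e}-1\) and \(L = {\rm e}\) stated above.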
Bibliography
- 1
- 2
- 3
- 4
- 5
Argyros, I.K., On the local convergence of Newton’s method on Lie groups, Pan Amer. Math. J., 17 (2007), pp. 101–109.
- 6
Argyros, I.K., A Kantorovich analysis of Newton’s method on Lie groups, J. Concrete Appl. Anal., 6 (2008), pp. 21–32.
- 7
- 8
- 9
Argyros, I.K., Newton’s method on Lie groups, J. Appl. Math. Comput., 31 (2009), pp. 217–228.
- 10
- 11
Argyros, I.K., Cho, Y.J. and Hilout, S., Numerical methods for equations and its applications, CRC Press/Taylor and Francis Group, New-York, 2012.
- 12
- 13
- 14
- 15
Argyros, I.K. and Hilout, S., Extending the applicability of Newton’s method on Lie groups, Appl. Math. Comput., to appear.
- 16
Argyros, I.K. and Hilout, S., Numerical methods in nonlinear analysis, World Scientific Publ., ISBN-13: 978-9814405829, to appear.
- 17
- 18
Brockett, R.W., Differential geometry and the design of gradient algorithms. Differential geometry: partial differential equations on manifolds (Los Angeles, CA, 1990), pp. 69–92, Proc. Sympos. Pure Math., 54, Part 1, Amer. Math. Soc., Providence, RI, 1993.
- 19
- 20
- 21
Do Carmo, M.P., Riemannian Geometry, Boston, Birkhauser, 1992.
- 22
- 23
- 24
Ezquerro, J.A. and Hernández, M.A., On an application of Newton’s method to nonlinear operators with \(\omega \)-conditioned second derivative, BIT, 42 (2002), pp. 519–530.
- 25
- 26
- 27
- 28
Hall, B.C., Lie groups, Lie algebras and representations. An elementary introduction, Graduate Texts in Mathematics, 222, Springer-Verlag, New York, 2003.
- 29
Helgason, S., Differential geometry, Lie groups and symmetric spaces, Pure and Applied Mathematics, 80, Academic Press, Inc. Harcourt Brace Jovanovich, Publishers, New York-London, 1978.
- 30
- 31
Helmke, U. and Moore, J.B., Optimization and dynamical systems. With a foreword by R. Brockett, Communications and Control Engineering Series. Springer-Verlag London, Ltd., London, 1994.
- 32
Kantorovich, L.V. and Akilov, G.P., Functional analysis in normed spaces, Pergamon Press, New York, 1982.
- 33
- 34
- 35
- 36
- 37
- 38
Luenberger, D.G., The gradient projection method along geodesics, Management Sci., 18 (1972), pp. 620–631.
- 39
Mahony, R.E., Optimization algorithms on homogeneous spaces: with applications in linear systems theory, Ph.D. Thesis, Department of Systems Engineering, University of Canberra, Australia, 1994.
- 40
- 41
Mahony, R.E., Helmke, U. and Moore, J.B., Pole placement algorithms for symmetric realisations, Proceedings of 32nd IEEE Conference on Decision and Control, San Antonio, TX, 1993, pp. 1355–1358.
- 42
- 43
Owren, B. and Welfert, B., The Newton iteration on Lie groups, BIT, 40 (2000), pp. 121–145.
- 44
- 45
Rheinboldt, W.C., An adaptive continuation process for solving system of nonlinear equations, Banach Center Publ., 3 (1977), pp. 129–142.
- 46
- 47
Smale, S., Newton’s method estimates from data at a point, The merging of disciplines: New directions in Pure, Applied and Computational Mathematics, R. Ewing, K. Gross and C. Martin eds, New-York, Springer Verlag Publ., (1986), pp. 185–196.
- 48
- 49
Smith, S.T., Geometric optimization methods for adaptive filtering, Thesis (Ph.D.)–Harvard University, ProQuest LLC, Ann Arbor, MI, 1993.
- 50
Smith, S.T., Optimization techniques on Riemannian manifolds, Hamiltonian and gradient flows, algorithms and control, Fields Inst. Commun., 3, Amer. Math. Soc., Providence, RI, 1994, pp. 113–136.
- 51
- 52
Udrişte, C., Convex functions and optimization methods on Riemannian manifolds, Mathematics and its Applications, 297, Kluwer Academic Publishers Group, Dordrecht, 1994.
- 53
Varadarajan, V.S., Lie groups, Lie algebras and their representations, Reprint of the 1974 edition. Graduate Texts in Mathematics, 102, Springer-Verlag, New York, 1984.
- 54
- 55
- 56
- 57
- 58