<!DOCTYPE html>
<html lang="en">
<head>
<script>
  MathJax = { 
    tex: {
		    inlineMath: [['\\(','\\)']]
	} }
</script>
<script type="text/javascript" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js">
</script>
<meta name="generator" content="plasTeX" />
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Low-rank matrix approximations over canonical subspaces</title>
<link rel="stylesheet" href="/var/www/clients/client1/web1/web/files/jnaat-files/journals/1/articles/styles/theme-white.css" />
</head>

<body>

<div class="wrapper">

<div class="content">
<div class="content-wrapper">


<div class="main-text">



<div class="titlepage">
<h1>Low-rank matrix approximations over canonical subspaces</h1>
<p class="authors">
<span class="author">Achiya Dax\(^\ast \)</span>
</p>
<p class="date">Received: September 15, 2019; accepted: December 7, 2019; published online: August 11, 2020.</p>
</div>
<div class="abstract"><p> In this paper we derive closed form expressions for the nearest rank-\(k\) matrix on canonical subspaces. We start by studying three kinds of subspaces. Let \(X\) and \(Y\) be a pair of given matrices. The first subspace contains all the \(m\times n\) matrices \(A\) that satisfy \(AX=O\). The second subspace contains all the \(m \times n\) matrices \(A\) that satisfy \(Y^TA = O\), while the matrices in the third subspace satisfy both \(AX =O\) and \(Y^TA = O\). </p>
<p>The second part of the paper considers a subspace that contains all the symmetric matrices \(S\) that satisfy \(SX =O\). In this case, in addition to the nearest rank-\(k\) matrix we also provide the nearest rank-\(k\) positive approximant on that subspace. A further insight is gained by showing that the related cones of positive semidefinite matrices, and negative semidefinite matrices, constitute a polar decomposition of this subspace. </p>
<p>The paper ends with two examples of applications. The first one regards the problem of computing the nearest rank-\(k\) centered matrix, and adds new insight into the PCA of a matrix. </p>
<p>The second application comes from the field of Euclidean distance matrices. The new results on low-rank positive approximants are used to derive an explicit expression for the nearest source matrix. This opens a direct way for computing the related positions matrix. </p>
<p><b class="bf">MSC.</b> 15A03, 15A18, 15A21, 15A42, 15A60, 65F99. </p>
<p><b class="bf">Keywords.</b> Canonical subspaces, Low-rank positive approximants, The nearest rank-\(k\) centered matrix, The nearest source matrix. </p>
</div>
<p>\(^\ast \)Hydrological Service, P.O.B. 36118, Jerusalem 91360, Israel, e-mail: <span class="tt">dax20@water.gov.il</span>. </p>
<h1 id="a0000000002">1 Introduction</h1>
<p>In this paper we study matrix approximation problems that involve subspaces of matrices. Let \(X \in \mathbb {R}^{n\times p}\) and \(Y\in \mathbb {R}^{m\times q}\) be a pair of given matrices. Then the term “canonical subspace” refers to the following types of matrix subspaces. </p>
<div class="equation" id="1.1">
<p>
  <div class="equation_content">
    \begin{equation} \label{1.1} \mathbb {X}= \{  B \,  | \,  B \in \mathbb {R}^{m\times n} \;  \,  \hbox{and\  } \,  BX = O\} ,\end{equation}
  </div>
  <span class="equation_label">1.1</span>
</p>
</div>
<div class="equation" id="1.2">
<p>
  <div class="equation_content">
    \begin{equation} \label{1.2} \mathbb {Y}= \{  B\,  | \,  B \in \mathbb {R}^{m\times n} \;  \,  \hbox{and\  } \,  Y^TB = O\} ,\end{equation}
  </div>
  <span class="equation_label">1.2</span>
</p>
</div>
<div class="equation" id="1.3">
<p>
  <div class="equation_content">
    \begin{equation} \label{1.3} \mathbb {Z}= \{  B\,  | \,  B \in \mathbb {R}^{m\times n}, \;  Y^T B = O \;  \,  \hbox{and\  } \,  BX = O\} ,\end{equation}
  </div>
  <span class="equation_label">1.3</span>
</p>
</div>
<p>and </p>
<div class="equation" id="1.4">
<p>
  <div class="equation_content">
    \begin{equation} \label{1.4} \mathbb {S}= \{ S \,  | \,  S \in \mathbb {R}^{n \times n}, \;  S = S^T \;  \,  \hbox{and\  } \,  S X = O\} .\end{equation}
  </div>
  <span class="equation_label">1.4</span>
</p>
</div>
<p>Note that \(S \in \mathbb {S}\) implies \(X^TS = O\). The matrix \(O\) denotes a null matrix with appropriate dimensions. </p>
<p>The plan of the paper is as follows. It starts by deriving explicit solutions to matrix nearness problems of the form </p>
<div class="equation" id="1.5">
<p>
  <div class="equation_content">
    \begin{equation} \label{1.5} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(B) = \|  A - B\| \\ & {\hbox{\rm subject to \  }}\;  \;  \;  B \in \mathbb {B}, \end{aligned} \end{equation}
  </div>
  <span class="equation_label">1.5</span>
</p>
</div>
<p> where \(A \in \mathbb {R}^{m\times n}\) is a given matrix, \(\mathbb {B}\) denotes one of the subspaces \(\mathbb {X}, \mathbb {Y}\) or \(\mathbb {Z}\), and \(\|  \cdot \| \) denotes a unitarily invariant norm. In the third section we show that the SVD of a matrix \(B \in \mathbb {B}\) has a special structure. This observation paves the way for solving low-rank approximation problems of the form </p>
<div class="equation" id="1.6">
<p>
  <div class="equation_content">
    \begin{equation} \label{1.6} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f (B) = \|  A - B \| \\ & {\hbox{\rm subject to \  }}\;  \;  \;  B \in \mathbb {B}\;  \,  \hbox{and\  } \,  {\rm rank}(B) \le k, \end{aligned} \end{equation}
  </div>
  <span class="equation_label">1.6</span>
</p>
</div>
<p> where \(k\) denotes the desired matrix rank. </p>
<p>The second part of the paper concentrates on symmetric matrices. It starts by solving <a href="#1.5" class="eqref">1.5</a> and <a href="#1.6" class="eqref">1.6</a> when \(\mathbb {B}= \mathbb {S}\). Then we move to matrix nearness problems that seek (low-rank) positive semidefinite matrices in \(\mathbb {S}\). It is shown that the sets </p>
<div class="equation" id="1.7">
<p>
  <div class="equation_content">
    \begin{equation} \label{1.7} \mathbb {S}_+ = \{  S \,  | \,  S \in \mathbb {S}\;  \;  \hbox{and \  } \;  S \ge 0\} \end{equation}
  </div>
  <span class="equation_label">1.7</span>
</p>
</div>
<div class="equation" id="1.8">
<p>
  <div class="equation_content">
    \begin{equation} \label{1.8} \mathbb {S}_- = \{  S \,  | \,  S \in \mathbb {S}\;  \;  \hbox{and \  } \;  S \le 0\} \end{equation}
  </div>
  <span class="equation_label">1.8</span>
</p>
</div>
<p>are convex cones in \(\mathbb {S}\). The notation \(S \ge 0\) means that the symmetric matrix \(S\) is positive semidefinite. Similarly \(S \le 0\) means that \(S\) is negative semidefinite. With these notations at hand the problems that we solve are </p>
<div class="equation" id="1.9">
<p>
  <div class="equation_content">
    \begin{equation} \label{1.9} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(S) = \|  A - S\| _F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  S \in \mathbb {S}_+, \end{aligned} \end{equation}
  </div>
  <span class="equation_label">1.9</span>
</p>
</div>
<p> and </p>
<div class="equation" id="1.10">
<p>
  <div class="equation_content">
    \begin{equation} \label{1.10} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f (S) = \|  A - S\| _F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  S \in \mathbb {S}_+ \;  \,  \hbox{and\  } \,  {\rm rank}(S) \le k, \end{aligned} \end{equation}
  </div>
  <span class="equation_label">1.10</span>
</p>
</div>
<p> where \(A\) is an arbitrary matrix from \(\mathbb {R}^{n\times n}\). It is shown that these problems have closed form solutions, and if \(A\) belongs to \(\mathbb {S}\) then the derived solution remains valid in any unitarily invariant norm. The role of \(\mathbb {S}_+\) and \(\mathbb {S}_-\) is clarified by showing that this pair of convex cones constitutes a polar decomposition of \(\mathbb {S}\). </p>
<p>The paper ends with two examples of applications. In Section 7 the properties of canonical subspaces are used to compute the nearest rank-\(k\) centered matrix. The results add new insight into the PCA of a matrix. </p>
<p>In Section 8 we solve matrix nearness problems from the field of <i class="itshape">Euclidean Distance</i> (ED) matrices. Given a “predistance” matrix, \(A\), it is desired to compute the nearest “source” matrix and the related “positions” matrix. As we shall see, the results on positive approximants enable us to derive closed form solutions to these problems. </p>
<h1 id="a0000000003">2 The nearest matrix on a subspace</h1>
<p>We shall start by deriving equivalent presentations of the subspaces \(\mathbb {X}\), \(\mathbb {Y}\), and \(\mathbb {Z}\), which are defined in <a href="#1.1" class="eqref">1.1</a>–<a href="#1.3" class="eqref">1.3</a>. Using an SVD of \(X\), or a QR factorization, it is possible to compute a matrix with orthonormal columns, \(\hat X\), that has the following property: a vector \({\boldsymbol {v}}\in \mathbb {R}^n\) satisfies \({\boldsymbol {v}}^TX = {\boldsymbol {o}}\) if and only if \({\boldsymbol {v}}^T\hat X = {\boldsymbol {o}}\). The number of columns in \(\hat X\) equals rank\((X)\). Yet, for the sake of simplicity, we may assume that rank\((X) = p\). Replacing \(X\) with \(\hat X\) turns <a href="#1.1" class="eqref">1.1</a> into the form </p>
<div class="equation" id="2.1">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.1} \mathbb {X}= \{  B \,  | \,  B \in \mathbb {R}^{m \times n}\;  \,  \hbox{ and\  }\,  B\hat X = O\} .\end{equation}
  </div>
  <span class="equation_label">2.1</span>
</p>
</div>
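<p>The construction of \(\hat X\) can be sketched numerically. The following illustration is our own (assuming numpy; the names <span class="tt">X_hat</span> and <span class="tt">r</span> are not from the paper): it builds \(\hat X\) from a compact SVD of \(X\) and checks the defining property \({\boldsymbol {v}}^TX = {\boldsymbol {o}}\) if and only if \({\boldsymbol {v}}^T\hat X = {\boldsymbol {o}}\).</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 3
X = rng.standard_normal((n, p))

# The left singular vectors of X that belong to nonzero singular
# values form an orthonormal basis of Range(X).
U, s, _ = np.linalg.svd(X, full_matrices=False)
r = int(np.sum(s > 1e-12 * s[0]))   # numerical rank(X)
X_hat = U[:, :r]                    # orthonormal columns, Range(X_hat) = Range(X)

# v^T X = o  iff  v^T X_hat = o: both say that v is orthogonal to Range(X).
v = rng.standard_normal(n)
v -= X_hat @ (X_hat.T @ v)          # project v onto the orthogonal complement
assert np.allclose(v @ X, 0) and np.allclose(v @ X_hat, 0)
```

<p>A QR factorization of \(X\) would serve equally well when \(X\) has full column rank.</p>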
<p>Similar arguments allow us to replace \(Y\) with a matrix \(\hat Y \in \mathbb {R}^{m\times q}\) that has orthonormal columns. This turns <a href="#1.2" class="eqref">1.2</a> and <a href="#1.3" class="eqref">1.3</a> into the forms </p>
<div class="equation" id="2.2">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.2} \mathbb {Y}= \{  B \,  | \,  B \in \mathbb {R}^{m \times n} \;  \;  {\hbox{\rm and \  }}\hat Y^TB = O \}  \end{equation}
  </div>
  <span class="equation_label">2.2</span>
</p>
</div>
<p>and </p>
<div class="equation" id="2.3">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.3} \mathbb {Z}= \{  B \,  | \,  B \in \mathbb {R}^{m \times n}, \;  \,  \hat Y^TB = O \;  \;  {\hbox{\rm and \  }}B \hat X = O \} , \end{equation}
  </div>
  <span class="equation_label">2.3</span>
</p>
</div>
<p>respectively. </p>
<p>Another equivalent presentation is derived in the following way. Let the \(n\times (n-p)\) matrix \(\tilde X\) be obtained by completing the columns of \(\hat X\) to be an orthonormal basis of \(\mathbb {R}^n\). Then the \(n\times n\) matrix \([\tilde X, \hat X]\) satisfies </p>
<div class="equation" id="2.4">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.4} [\tilde X, \hat X]^T [\tilde X, \hat X] = I = [\tilde X, \hat X] [\tilde X, \hat X]^T.\end{equation}
  </div>
  <span class="equation_label">2.4</span>
</p>
</div>
<p>Observe that the first equality in <a href="#2.4" class="eqref">2.4</a> means </p>
<div class="equation" id="2.5">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.5} \tilde X^T \tilde X = I,\,  \;  \hat X^T\hat X= I, \;  \;  \tilde X^T\hat X = O, \; \,  {\hbox{\rm and \  }}\;  \hat X^T \tilde X = O;\end{equation}
  </div>
  <span class="equation_label">2.5</span>
</p>
</div>
<p>while the second equality implies </p>
<div class="equation" id="2.6">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.6} \tilde X \tilde X^T + \hat X \hat X^T = I\end{equation}
  </div>
  <span class="equation_label">2.6</span>
</p>
</div>
<p>and </p>
<div class="equation" id="2.7">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.7} \tilde X \tilde X^T = I - \hat X \hat X^T.\end{equation}
  </div>
  <span class="equation_label">2.7</span>
</p>
</div>
<p>(The dimensions of the unit matrix, \(I\), and the null matrix, \(O\), depend on the context.) Now let \(B\) be some matrix in \(\mathbb {X}\). Then from <a href="#2.6" class="eqref">2.6</a> we see that </p>
<div class="displaymath" id="a0000000004">
  \[  B = B (\tilde X \tilde X^T + \hat X \hat X^T) = B\tilde X \tilde X^T = R\tilde X^T, \]
</div>
<p> where \(R = B\tilde X\in \mathbb {R}^{m \times (n-p)}\). Conversely, given a matrix \(R \in \mathbb {R}^{m \times (n-p)}\), the equality \(\tilde X^T \hat X = O\) implies that the matrix \(R\tilde X^T\) belongs to \(\mathbb {X}\). This enables us to rewrite \(\mathbb {X}\) in the form </p>
<div class="equation" id="2.8">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.8} \mathbb {X}= \{  R\tilde X^T \,  | \,  R \in \mathbb {R}^{m \times (n-p)} \} .\end{equation}
  </div>
  <span class="equation_label">2.8</span>
</p>
</div>
<p>That is, \(\mathbb {X}\) is essentially a subspace which contains all the \(m\times n\) matrices whose rows belong to \(\hbox{\rm Range}(\tilde X)\). </p>
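<p>This presentation is easy to check in floating point. The sketch below is ours (assuming numpy); it completes \(\hat X\) to \([\tilde X, \hat X]\) via a full SVD, one of several possible choices, and verifies <a href="#2.6" class="eqref">2.6</a> together with the membership \(R\tilde X^T \in \mathbb {X}\).</p>

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 5, 6, 2
X_hat = np.linalg.qr(rng.standard_normal((n, p)))[0]     # orthonormal columns

# Complete X_hat to an orthonormal basis [X_tilde, X_hat] of R^n: the last
# n - p left singular vectors of X_hat span the orthogonal complement.
X_tilde = np.linalg.svd(X_hat, full_matrices=True)[0][:, p:]

assert np.allclose(X_tilde.T @ X_hat, 0)                              # (2.5)
assert np.allclose(X_tilde @ X_tilde.T + X_hat @ X_hat.T, np.eye(n))  # (2.6)

# Every matrix of the form R X_tilde^T annihilates X_hat, so it lies in X.
R = rng.standard_normal((m, n - p))
B = R @ X_tilde.T
assert np.allclose(B @ X_hat, 0)
```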
<p>Similarly, let the \(m \times (m-q)\) matrix \(\tilde Y\) be obtained by completing the columns of \(\hat Y\) to be an orthonormal basis of \(\mathbb {R}^m\). Then the subspaces \(\mathbb {Y}\) and \(\mathbb {Z}\) have equivalent presentations in the forms </p>
<div class="equation" id="2.9">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.9} \mathbb {Y}= \{  \tilde Y R \,  | \,  R\in \mathbb {R}^{(m-q)\times n}\} \end{equation}
  </div>
  <span class="equation_label">2.9</span>
</p>
</div>
<p>and </p>
<div class="equation" id="2.10">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.10} \mathbb {Z}= \{  \tilde Y R \tilde X^T\,  | \,  R\in \mathbb {R}^{(m-q)\times (n-p)}\} .\end{equation}
  </div>
  <span class="equation_label">2.10</span>
</p>
</div>
<p>Let \(\|  \cdot \|  \) denote any unitarily invariant norm on \(\mathbb {R}^{m \times n}\). To find the nearest matrix on \(\mathbb {X}\), or \(\mathbb {Y}\), we need the following observation. <div class="lem_thmwrapper " id="lm.1">
  <div class="lem_thmheading">
    <span class="lem_thmcaption">
    Lemma
    </span>
    <span class="lem_thmlabel">2.1</span>
  </div>
  <div class="lem_thmcontent">
  <p> Let \(H\in \mathbb {R}^{m \times n}\) be a given matrix, and let the matrix \(H_\ell \in \mathbb {R}^{m \times n}\) be obtained from \(H\) by replacing the last \(n-\ell \) columns of \(H\) with zero columns. Then </p>
<div class="equation" id="2.11">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.11} \|  H_\ell \|  \le \|  H \| .\end{equation}
  </div>
  <span class="equation_label">2.11</span>
</p>
</div>

  </div>
</div> <div class="proof_wrapper" id="a0000000005">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> Applying the Cauchy interlacing theorem to the matrices \(H^T H\) and \(H^T_\ell H_\ell \) shows that the singular values of \(H_\ell \) are weakly majorized by those of \(H\). Therefore <a href="#2.11" class="eqref">2.11</a> is a consequence of the Ky Fan dominance theorem <span class="cite">
	[
	<a href="#15" >15</a>
	]
</span>. <div class="proof_wrapper" id="a0000000006">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> </p>
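<p>The content of <a href="#lm.1">lemma 2.1</a> can be observed numerically. The sketch below is our own (assuming numpy): it zeroes the trailing columns of a random matrix and checks the weak majorization of the singular values, hence the norm inequality for, e.g., the spectral and trace norms.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
H = rng.standard_normal((4, 5))
H_l = H.copy()
H_l[:, 3:] = 0.0                       # replace the last n - l columns by zeros

s   = np.linalg.svd(H,   compute_uv=False)
s_l = np.linalg.svd(H_l, compute_uv=False)

# Interlacing gives sigma_i(H_l) <= sigma_i(H), so every partial sum shrinks.
assert np.all(np.cumsum(s_l) <= np.cumsum(s) + 1e-12)
assert s_l[0] <= s[0] + 1e-12          # spectral norm: ||H_l||_2 <= ||H||_2
assert s_l.sum() <= s.sum() + 1e-12    # trace norm: sum of singular values
```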
<p><div class="thm_thmwrapper " id="th.2">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">2.2</span>
    <span class="thm_thmtitle">The nearest matrix on \(\mathbb {X}\)</span>
  </div>
  <div class="thm_thmcontent">
  <p>  Let \(A\) be a given matrix in \(\mathbb {R}^{m \times n}\) and consider the problem </p>
<div class="equation" id="2.12">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.12} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(B) = \|  A - B\| \\ & {\hbox{\rm subject to \  }}\;  \;  \;  B \in \mathbb {X}, \end{aligned} \end{equation}
  </div>
  <span class="equation_label">2.12</span>
</p>
</div>
<p> where \(\|  \cdot \|  \) is a unitarily invariant norm on \(\mathbb {R}^{m \times n}\). Then the matrix </p>
<div class="equation" id="2.13">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.13} A \tilde X \tilde X^T = A (I - \hat X \hat X^T)\end{equation}
  </div>
  <span class="equation_label">2.13</span>
</p>
</div>
<p>solves this problem. </p>

  </div>
</div> <div class="proof_wrapper" id="a0000000007">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> The equality in <a href="#2.13" class="eqref">2.13</a> is a direct consequence of <a href="#2.7" class="eqref">2.7</a>. Using <a href="#2.8" class="eqref">2.8</a> we can write \(B = R\tilde X^T\) for some \(R \in \mathbb {R}^{m \times (n-p)}\). Then, since \(\|  \cdot \| \) is unitarily invariant, </p>
<div class="displaymath" id="a0000000008">
  \begin{align*}  \|  A - R\tilde X^T\|  & = \|  (A - R\tilde X^T) [\tilde X, \hat X]\| \\ & =\|  [A\tilde X, A\hat X] - [R, O] \|  \ge \|  [O, A\hat X]\| ,\end{align*}
</div>
<p> where the inequality follows from <a href="#lm.1">lemma 2.1</a>. Consequently the choice \(R = A\tilde X\) gives the minimal value. <div class="proof_wrapper" id="a0000000009">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> </p>
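<p>A quick numerical check of <a href="#th.2">theorem 2.2</a> in the Frobenius norm (a sketch under our own naming, assuming numpy; random members of \(\mathbb {X}\) are generated through the presentation <a href="#2.8" class="eqref">2.8</a>):</p>

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, p = 5, 6, 2
A = rng.standard_normal((m, n))
X_hat = np.linalg.qr(rng.standard_normal((n, p)))[0]

B_star = A @ (np.eye(n) - X_hat @ X_hat.T)     # candidate from (2.13)
assert np.allclose(B_star @ X_hat, 0)          # B_star indeed lies in X

# No other member of X (sampled at random) gets closer in the Frobenius norm.
X_tilde = np.linalg.svd(X_hat, full_matrices=True)[0][:, p:]
best = np.linalg.norm(A - B_star, 'fro')
for _ in range(100):
    B = rng.standard_normal((m, n - p)) @ X_tilde.T
    assert np.linalg.norm(A - B, 'fro') >= best - 1e-12
```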
<p>The proof of the next theorem is derived in a similar way. </p>
<p><div class="thm_thmwrapper " id="th.3">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">2.3</span>
    <span class="thm_thmtitle">The nearest matrix on \(\mathbb {Y}\)</span>
  </div>
  <div class="thm_thmcontent">
  <p>  Let \(A\) and \(\|  \cdot \| \) be as above and consider the problem </p>
<div class="equation" id="2.14">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.14} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(B) = \|  A - B\| \\ & {\hbox{\rm subject to \  }}\;  \;  \;  B \in \mathbb {Y}. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">2.14</span>
</p>
</div>
<p> Then the matrix </p>
<div class="equation" id="2.15">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.15} \tilde Y \tilde Y^T A = (I-\hat Y \hat Y^T)A\end{equation}
  </div>
  <span class="equation_label">2.15</span>
</p>
</div>
<p>solves this problem. </p>

  </div>
</div> </p>
<p>The structure of \(\mathbb {Z}\) is slightly more complicated and, therefore, the nearest matrix is computed with respect to one norm, the Frobenius matrix norm \(\| \cdot \| _F\). (The reasons are explained after the proof of <a href="#th.4">theorem 2.4</a>.) For the sake of clarity we mention that for any matrix \(A = (a_{ij}) \in \mathbb {R}^{m \times n}\) the Frobenius norm of \(A\) is defined as </p>
<div class="displaymath" id="a0000000010">
  \[  \|  A\| _F = \left( \sum \limits ^m_{i = 1} \sum \limits ^n_{j = 1} (a_{ij})^2\right)^{1/2}, \]
</div>
<p> and </p>
<div class="displaymath" id="a0000000011">
  \[  \|  A\| ^2_F = \hbox{\rm trace}(A^TA) = \hbox{\rm trace}(AA^T). \]
</div>
<p><div class="thm_thmwrapper " id="th.4">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">2.4</span>
    <span class="thm_thmtitle">The nearest matrix on \(\mathbb {Z}\)</span>
  </div>
  <div class="thm_thmcontent">
  <p>  Let \(A\) be a given matrix in \(\mathbb {R}^{m \times n}\) and consider the problem </p>
<div class="equation" id="2.16">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.16} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(Z) = \|  A - Z\| _F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  Z \in \mathbb {Z}. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">2.16</span>
</p>
</div>
<p> Then the matrix </p>
<div class="equation" id="2.17">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.17} \tilde Y \tilde Y^T A\tilde X \tilde X^T = (I - \hat Y \hat Y^T) A (I - \hat X \hat X^T)\end{equation}
  </div>
  <span class="equation_label">2.17</span>
</p>
</div>
<p>solves this problem. </p>

  </div>
</div> <div class="proof_wrapper" id="a0000000012">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div>Using <a href="#2.10" class="eqref">2.10</a> it is possible to rewrite <a href="#2.16" class="eqref">2.16</a> in the form </p>
<div class="equation" id="2.18">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.18} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  g(R) = \|  A - \tilde Y R\tilde X^T\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  R \in \mathbb {R}^{(m - q)\times (n-p)}. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">2.18</span>
</p>
</div>
<p> Then, since the Frobenius norm is unitarily invariant, </p>
<div class="displaymath" id="a0000000013">
  \begin{align*}  \|  A -\tilde Y R\tilde X^T\| ^2_F & = \Big\|  [\tilde Y, \hat Y]^T (A - \tilde Y R\tilde X^T) [\tilde X, \hat X] \Big\| ^2_F \\ & = \Big\|  [\tilde Y, \hat Y]^T A [\tilde X, \hat X] - [\tilde Y, \hat Y]^T\tilde Y R\tilde X^T [\tilde X, \hat X] \Big\| ^2_F. \end{align*}
</div>
<p> Now comparing the matrices </p>
<div class="displaymath" id="a0000000014">
  \[ [\tilde Y, \hat Y]^T A [\tilde X, \hat X] = \left[\begin{array}{c@{}c@{}c} \tilde Y^T A \tilde X & \;  |\;  &  \tilde Y^T A \hat X\\ \hline \hat Y^T A \tilde X & \;  |\;  &  \hat Y^T A \hat X \end{array}\right] \]
</div>
<p> and </p>
<div class="displaymath" id="a0000000015">
  \[  [\tilde Y, \hat Y]^T \tilde Y R\tilde X^T [\tilde X, \hat X] = \left[\begin{array}{c@{}c@{}c} R &  \;  |\;  &  O\\ \hline O &  \;  |\;  &  O\end{array}\right]  \]
</div>
<p> shows that the optimal choice of \(R\) is </p>
<div class="equation" id="2.19">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.19} R = \tilde Y^T A\tilde X.\end{equation}
  </div>
  <span class="equation_label">2.19</span>
</p>
</div>
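<p>As with <a href="#th.2">theorem 2.2</a>, the formula <a href="#2.17" class="eqref">2.17</a> admits a direct numerical check. The sketch is ours (assuming numpy); competitors in \(\mathbb {Z}\) are drawn via <a href="#2.10" class="eqref">2.10</a>.</p>

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, p, q = 6, 5, 2, 2
A = rng.standard_normal((m, n))
X_hat = np.linalg.qr(rng.standard_normal((n, p)))[0]
Y_hat = np.linalg.qr(rng.standard_normal((m, q)))[0]

# Candidate from (2.17): project rows and columns onto the complements.
Z_star = (np.eye(m) - Y_hat @ Y_hat.T) @ A @ (np.eye(n) - X_hat @ X_hat.T)
assert np.allclose(Y_hat.T @ Z_star, 0) and np.allclose(Z_star @ X_hat, 0)

# Random competitors from Z are never closer in the Frobenius norm.
X_tilde = np.linalg.svd(X_hat, full_matrices=True)[0][:, p:]
Y_tilde = np.linalg.svd(Y_hat, full_matrices=True)[0][:, q:]
best = np.linalg.norm(A - Z_star, 'fro')
for _ in range(100):
    Z = Y_tilde @ rng.standard_normal((m - q, n - p)) @ X_tilde.T
    assert np.linalg.norm(A - Z, 'fro') >= best - 1e-12
```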
<p><div class="rem*_thmwrapper " id="a0000000016">
  <div class="rem*_thmheading">
    <span class="rem*_thmcaption">
    Remark
    </span>
  </div>
  <div class="rem*_thmcontent">
  <p>It is tempting to think that the choice <a href="#2.19" class="eqref">2.19</a> is also optimal for any other unitarily invariant norm. However, as the following example shows, setting to zero the north-west corner of a matrix does not necessarily reduce the matrix norm. To see this point we consider the matrices </p>
<div class="displaymath" id="a0000000017">
  \[  G = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\;  \;  {\hbox{\rm and \  }}\;  \,  H = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}  \]
</div>
<p> Then the trace norm of \(G\) is 2 while the trace norm of \(H\) is \(\sqrt{5}\). That is, “punching” \(G\) increases its trace norm. (Recall that the trace norm of a matrix is the sum of its singular values.) <span class="qed">□</span></p>

  </div>
</div> </p>
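<p>The two trace norms in the remark are easy to reproduce (a numpy sketch of ours; <span class="tt">trace_norm</span> is our helper name):</p>

```python
import numpy as np

G = np.array([[1.0, 1.0], [1.0, 1.0]])
H = np.array([[0.0, 1.0], [1.0, 1.0]])    # G with its north-west entry set to zero

trace_norm = lambda M: np.linalg.svd(M, compute_uv=False).sum()
assert np.isclose(trace_norm(G), 2.0)            # singular values of G are 2 and 0
assert np.isclose(trace_norm(H), np.sqrt(5.0))   # sqrt(5) > 2: the norm grew
```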
<p>We shall finish this section by showing that the nearest matrices that we have found are essentially orthogonal projections. Let \(A = (a_{ij})\) and \(B = (b_{ij})\) be two matrices in \(\mathbb {R}^{m\times n}\). Then it is well known that </p>
<div class="equation" id="2.20">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.20} \langle A, B\rangle = \sum \limits ^m_{i = 1} \sum \limits ^n_{j = 1} a_{ij} b_{ij}\end{equation}
  </div>
  <span class="equation_label">2.20</span>
</p>
</div>
<p>is an inner product on \(\mathbb {R}^{m \times n}\), and the related matrix norm is the Frobenius norm </p>
<div class="equation" id="2.21">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.21} \|  A \| _F = ( \langle A, A\rangle )^{1/2}.\end{equation}
  </div>
  <span class="equation_label">2.21</span>
</p>
</div>
<p>Recall also that a matrix \(A\) is “orthogonal” to \(B\) (and vice versa) if </p>
<div class="equation" id="2.22">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.22} \langle A , B\rangle = 0.\end{equation}
  </div>
  <span class="equation_label">2.22</span>
</p>
</div>
<p>The inner product <a href="#2.20" class="eqref">2.20</a> and the Frobenius norm turn \(\mathbb {R}^{m \times n}\) into a Hilbert space, in which the solutions that we have found are called “orthogonal projections”. The next lemma clarifies this feature. It is a general property that holds for any subspace of a Hilbert space. </p>
<p><div class="lem_thmwrapper " id="lm.5">
  <div class="lem_thmheading">
    <span class="lem_thmcaption">
    Lemma
    </span>
    <span class="lem_thmlabel">2.5</span>
  </div>
  <div class="lem_thmcontent">
  <p> Let \(\mathbb {B}\) denote one of the subspaces \(\mathbb {X}, \mathbb {Y}\), or \(\mathbb {Z}\). Let \(A\) be an arbitrary matrix in \(\mathbb {R}^{m\times n}\). Then there exists a unique matrix \(\hat B \in \mathbb {B}\) that solves the problem </p>
<div class="equation" id="2.23">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.23} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(B) = \|  A - B\| _F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  B \in \mathbb {B}. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">2.23</span>
</p>
</div>
<p>Moreover, the matrix \(A - \hat B\) is orthogonal to any matrix \(B \in \mathbb {B}\). That is, </p>
<div class="equation" id="2.24">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.24} \langle A - \hat B, \,  B \rangle = 0 \;  \;  \;  \forall B \in \mathbb {B}.\end{equation}
  </div>
  <span class="equation_label">2.24</span>
</p>
</div>

  </div>
</div> </p>
<p>It is instructive, however, to see why <a href="#2.24" class="eqref">2.24</a> holds on the specific subspaces \(\mathbb {X}, \mathbb {Y}\) and \(\mathbb {Z}\). Consider, for example, the subspace \(\mathbb {X}\). In this case <a href="#2.24" class="eqref">2.24</a> reduces to </p>
<div class="displaymath" id="a0000000018">
  \[  \langle A - A(I-\hat X\hat X^T),\;  B \rangle = 0 , \quad \forall B \in \mathbb {X} \]
</div>
<p> or </p>
<div class="displaymath" id="a0000000019">
  \[  \langle A\hat X\hat X^T, \;  \,  R\tilde X^T \rangle = 0\;  \,  \;  \forall R \in \mathbb {R}^{m\times (n-p)}.  \]
</div>
<p> Recall that a rank-one matrix in \(\mathbb {R}^{m\times n}\) has the form \({\boldsymbol {u}}{\boldsymbol {v}}^T\in \mathbb {R}^{m \times n}\), where \({\boldsymbol {u}}\in \mathbb {R}^m\) and \({\boldsymbol {v}}\in \mathbb {R}^n\). Now it is easy to verify that the inner product with a rank-one matrix satisfies </p>
<div class="equation" id="2.25">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.25} \langle A, {\boldsymbol {u}}{\boldsymbol {v}}^T\rangle = {\boldsymbol {u}}^T A {\boldsymbol {v}}.\end{equation}
  </div>
  <span class="equation_label">2.25</span>
</p>
</div>
<p>Note also that </p>
<div class="displaymath" id="a0000000020">
  \[  R\tilde X^T = \sum \limits ^{n-p}_{j = 1} {\boldsymbol {r}}_j\tilde{\boldsymbol {x}}^T_j \]
</div>
<p> where \({\boldsymbol {r}}_j\) and \(\tilde{\boldsymbol {x}}_j\) denote the \(j\)th columns of \(R\) and \(\tilde X\). Combining these relations gives </p>
<div class="displaymath" id="a0000000021">
  \[ \langle A \hat X\hat X^T, \;  R\tilde X^T \rangle = \sum \limits ^{n-p}_{j = 1} \langle A\hat X \hat X^T, \;  {\boldsymbol {r}}_j\tilde{\boldsymbol {x}}^T_j \rangle = \sum \limits ^{n-p}_{j = 1} {\boldsymbol {r}}^T_j A \hat X\hat X^T \tilde{\boldsymbol {x}}_j = 0, \]
</div>
<p> where the last equality follows from the orthogonality relation \(\hat X^T \tilde X = O\). The verification of <a href="#2.24" class="eqref">2.24</a> for \(\mathbb {Y}\) and \(\mathbb {Z}\) is done in a similar way. The next result is another well known property of orthogonal projections in Hilbert spaces. </p>
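<p>The orthogonality relation <a href="#2.24" class="eqref">2.24</a> on \(\mathbb {X}\) can also be confirmed numerically (a sketch of ours, assuming numpy):</p>

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, p = 5, 6, 2
A = rng.standard_normal((m, n))
X_hat = np.linalg.qr(rng.standard_normal((n, p)))[0]
X_tilde = np.linalg.svd(X_hat, full_matrices=True)[0][:, p:]

B_hat = A @ (np.eye(n) - X_hat @ X_hat.T)   # orthogonal projection of A onto X
residual = A - B_hat                        # equals A X_hat X_hat^T

# <A - B_hat, B> = 0 for every B = R X_tilde^T in X (Frobenius inner product).
for _ in range(50):
    B = rng.standard_normal((m, n - p)) @ X_tilde.T
    assert abs(np.sum(residual * B)) < 1e-10
```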
<p><div class="cor_thmwrapper " id="co.6">
  <div class="cor_thmheading">
    <span class="cor_thmcaption">
    Corollary
    </span>
    <span class="cor_thmlabel">2.6</span>
  </div>
  <div class="cor_thmcontent">
  <p> Let \(\mathbb {B}\), \(A\), and \(\hat B\) be as in <span class="rm"><a href="#lm.5">lemma 2.5</a></span>. Then for any matrix \(H \in \mathbb {B}\) we have the equality </p>
<div class="equation" id="2.26">
<p>
  <div class="equation_content">
    \begin{equation} \label{2.26} \|  A - H\| ^2_F = \|  A - \hat B \| ^2_F + \|  \hat B - H \| ^2_F.\end{equation}
  </div>
  <span class="equation_label">2.26</span>
</p>
</div>

  </div>
</div> </p>
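<p>The Pythagorean identity <a href="#2.26" class="eqref">2.26</a> is likewise easy to verify in floating point (our sketch, assuming numpy, with \(\mathbb {B}= \mathbb {X}\)):</p>

```python
import numpy as np

rng = np.random.default_rng(6)
m, n, p = 5, 6, 2
A = rng.standard_normal((m, n))
X_hat = np.linalg.qr(rng.standard_normal((n, p)))[0]
X_tilde = np.linalg.svd(X_hat, full_matrices=True)[0][:, p:]
B_hat = A @ (np.eye(n) - X_hat @ X_hat.T)   # projection of A onto X

H = rng.standard_normal((m, n - p)) @ X_tilde.T   # an arbitrary member of X
lhs = np.linalg.norm(A - H, 'fro')**2
rhs = np.linalg.norm(A - B_hat, 'fro')**2 + np.linalg.norm(B_hat - H, 'fro')**2
assert np.isclose(lhs, rhs)                 # the identity (2.26)
```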
<p>In the next section we will use this observation to compute the nearest rank-\(k\) matrix on \(\mathbb {B}\). </p>
<h1 id="a0000000022">3 Low-rank approximations over canonical subspaces</h1>
<p>In this section we solve low-rank approximation problems of the form <a href="#1.6" class="eqref">1.6</a>. Using <a href="#2.26" class="eqref">2.26</a> it is possible to rewrite <a href="#1.6" class="eqref">1.6</a> in the form </p>
<div class="equation" id="3.1">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.1} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \|  \hat B - H\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {B}\;  {\hbox{\rm and \  }}{\rm rank}(H) \le k. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">3.1</span>
</p>
</div>
<p> The key for solving this problem lies in the following observations. </p>
<p>Let \(B\) be a given matrix in \(\mathbb {R}^{m \times n}\) and let \(r\) denote the rank of \(B\). Then it is well known that \(B\) has a “compact” SVD of the form </p>
<div class="equation" id="3.2">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.2} B = U\Sigma V^T,\end{equation}
  </div>
  <span class="equation_label">3.2</span>
</p>
</div>
<p>where the matrices \(U\) and \(V\) have \(r\) orthonormal columns and </p>
<div class="equation" id="3.3">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.3} \Sigma = \hbox{\rm diag}\{  \sigma _1, \sigma _2, \dots , \sigma _r\} \end{equation}
  </div>
  <span class="equation_label">3.3</span>
</p>
</div>
<p>is a diagonal \(r\times r\) matrix. The diagonal entries of \(\Sigma \) are the nonzero singular values of \(B\). These entries are assumed to be positive and sorted to satisfy </p>
<div class="equation" id="3.4">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.4}\sigma _1\ge \sigma _2 \ge \dots \ge \sigma _r > 0.\end{equation}
  </div>
  <span class="equation_label">3.4</span>
</p>
</div>
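<p>A compact SVD of the form <a href="#3.2" class="eqref">3.2</a>–<a href="#3.4" class="eqref">3.4</a> can be extracted from a full numerical SVD by discarding the (numerically) zero singular values. A hedged numpy sketch of ours:</p>

```python
import numpy as np

rng = np.random.default_rng(7)
m, n, r = 6, 5, 2
B = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank-2 by construction

U, s, Vt = np.linalg.svd(B, full_matrices=False)
keep = s > 1e-10 * s[0]                      # drop numerically zero singular values
U, s, Vt = U[:, keep], s[keep], Vt[keep]     # compact SVD: exactly r columns

assert s.shape == (r,)
assert np.all(np.diff(s) <= 0) and s[-1] > 0    # sorted and positive, as in (3.4)
assert np.allclose(U @ np.diag(s) @ Vt, B)      # B = U Sigma V^T, as in (3.2)
```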
<p><div class="thm_thmwrapper " id="th.7">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">3.1</span>
  </div>
  <div class="thm_thmcontent">
  <p> Assume that \(B \in \mathbb {B}\). In this case the matrices \(U\) and \(V\) satisfy the following conditions: </p>

<p><span class="rm">a)</span>  If \(B \in \mathbb {X}\)  then  \(V^T\hat X = O\). </p>

<p><span class="rm">b)</span>  If \(B \in \mathbb {Y}\)  then  \(\hat Y^T U = O\). </p>

<p><span class="rm">c)</span>  If \(B \in \mathbb {Z}\)  then  \(\hat Y^T U = O\)  and  \(V^T\hat X = O\). </p>

  </div>
</div> <div class="proof_wrapper" id="a0000000023">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> Assume first that \(B \in \mathbb {X}\), which means that </p>
<div class="equation" id="3.5">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.5} U \Sigma V^T \hat X = O.\end{equation}
  </div>
  <span class="equation_label">3.5</span>
</p>
</div>
<p>Let \({\boldsymbol {w}}_j \in \mathbb {R}^r\) denote the \(j\)th column of the matrix \(V^T\hat X\). Then <a href="#3.5" class="eqref">3.5</a> implies that </p>
<div class="equation" id="3.6">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.6} U\Sigma {\boldsymbol {w}}_j = {\boldsymbol {o}}\;  \;  \;  \;  \;  {\hbox{\rm for}} \;  \;  \;  \;  \;  j = 1,\dots , p.\end{equation}
  </div>
  <span class="equation_label">3.6</span>
</p>
</div>
<p>Therefore, since \(U\Sigma \) is an \(m\times r\) matrix that has full column rank, the linear system <a href="#3.6" class="eqref">3.6</a> has a unique solution \({\boldsymbol {w}}_j = {\boldsymbol {o}}\). This proves the first claim. The other claims are proved in similar ways. <div class="proof_wrapper" id="a0000000024">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> </p>
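<p>The claim of part a) is easy to test numerically. In the sketch below (illustrative NumPy code added here, not from the paper; the sizes are arbitrary), a matrix of the form \(B = C\tilde X^T\) is generated, so that \(B\hat X = O\), and the right singular vectors belonging to nonzero singular values are checked to annihilate \(\hat X\).</p>

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 5, 2

# Orthonormal Xhat (n x p) and its orthogonal complement Xtilde (n x (n-p)).
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Xhat, Xtilde = Q[:, :p], Q[:, p:]

# Any B = C Xtilde^T satisfies B Xhat = O, i.e. B lies in the subspace X.
B = rng.standard_normal((4, n - p)) @ Xtilde.T
assert np.allclose(B @ Xhat, 0)

# Right singular vectors of the nonzero part annihilate Xhat (Theorem 3.1 a).
U, s, Vt = np.linalg.svd(B, full_matrices=False)
V = Vt[s > 1e-10].T                  # columns for nonzero singular values only
assert np.allclose(V.T @ Xhat, 0)    # V^T Xhat = O
```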
<p>An equivalent way to write <a href="#3.2" class="eqref">3.2</a> is</p>
<div class="equation" id="3.7">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.7} B = \sum \limits ^r_{j = 1} \sigma _j{\boldsymbol {u}}_j{\boldsymbol {v}}^T_j\end{equation}
  </div>
  <span class="equation_label">3.7</span>
</p>
</div>
<p>where \({\boldsymbol {u}}_j \) and \({\boldsymbol {v}}_j\) denote the \(j\)th columns of \(U\) and \(V\), respectively. Let \(k\) be a positive integer that is smaller than \(r\), and let the matrices \(U_k \in \mathbb {R}^{m\times k}\) and \(V_k\in \mathbb {R}^{n\times k}\) be composed from the first \(k\) columns of \(U\) and \(V\), respectively. Similarly, \(\Sigma _k = \hbox{\rm diag}\{  \sigma _1, \sigma _2,\dots , \sigma _k\} \) denotes the related \(k\times k\) diagonal matrix. Then the matrix </p>
<div class="equation" id="3.8">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.8} T_k (B) = U_k\Sigma _kV^T_k = \sum \limits ^k_{j = 1} \sigma _j {\boldsymbol {u}}_j{\boldsymbol {v}}^T_j\end{equation}
  </div>
  <span class="equation_label">3.8</span>
</p>
</div>
<p>is called a rank-\(k\) truncated SVD of \(B\). The importance of this matrix lies in the following observations. As before, let \(\|  \cdot \| \) denote a unitarily invariant norm on \(\mathbb {R}^{m \times n}\), let \(A\) be a matrix in \(\mathbb {R}^{m\times n}\), and let \(T_k(A)\) denote a rank-\(k\) truncated SVD of \(A\). Then \(T_k(A)\) solves the least norm problem </p>
<div class="equation" id="3.9">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.9} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \|  A - H\| \\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {R}^{m\times n} \;  {\hbox{\rm and \  }}{\rm rank}(H) \le k. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">3.9</span>
</p>
</div>
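<p>A brief numerical sketch of this claim in the Frobenius norm (an added NumPy illustration, not part of the paper; the matrix sizes are arbitrary): the error attained by \(T_k(A)\) equals the norm of the discarded singular values, and random rank-\(k\) competitors never beat it.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 5))
k = 2

# Rank-k truncated SVD T_k(A), eq. (3.8).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Tk = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

# Eckart-Young-Mirsky: the optimal Frobenius error is the tail of the spectrum.
err = np.linalg.norm(A - Tk, 'fro')
assert np.isclose(err, np.sqrt(np.sum(s[k:] ** 2)))

# Random rank-k competitors can never do better.
for _ in range(100):
    H = rng.standard_normal((6, k)) @ rng.standard_normal((k, 5))
    assert np.linalg.norm(A - H, 'fro') >= err - 1e-12
```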
<p> This is the well known Eckart-Young-Mirsky theorem <span class="cite">
	[
	<a href="#14" >14</a>
	, 
	<a href="#26" >26</a>
	]
</span>. For a recent discussion of this observation see <span class="cite">
	[
	<a href="#12" >12</a>
	]
</span>. The next theorem sharpens this result by forcing \(A\) and \(H\) to stay in \(\mathbb {B}\). </p>
<p><div class="thm_thmwrapper " id="th.8">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">3.2</span>
  </div>
  <div class="thm_thmcontent">
  <p> Assume that \(B \in \mathbb {B}\). In this case \(T_k(B) \in \mathbb {B}\). Consequently \(T_k(B)\) solves the least norm problem </p>
<div class="equation" id="3.10">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.10} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \|  B - H\| \\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {B}\;  {\hbox{\rm and \  }}{\rm rank}(H) \le k. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">3.10</span>
</p>
</div>

  </div>
</div> <div class="proof_wrapper" id="a0000000025">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> It is sufficient to show that \(T_k(B) \in \mathbb {B}\). Assume first that \(B \in \mathbb {X}\). Then <a href="#th.7">theorem 3.1</a> says that \(V^T \hat X = O\), which implies \(V^T_k\hat X = O\) and \(T_k(B)\hat X = O\). That is, \(T_k(B) \in \mathbb {X}\). Similar arguments show that \(B \in \mathbb {Y}\) implies \(T_k(B) \in \mathbb {Y}\), and that \(B\in \mathbb {Z}\) implies \(T_k(B) \in \mathbb {Z}\). <div class="proof_wrapper" id="a0000000026">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> Stronger results are obtained when using the Frobenius matrix norm. In this case \(A\) can be any matrix in \(\mathbb {R}^{m\times n}\). </p>
<p><div class="thm_thmwrapper " id="th.9">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">3.3</span>
  </div>
  <div class="thm_thmcontent">
  <p> Let \(A, \mathbb {B}\), and \(\hat B\) be as in <span class="rm"><a href="#lm.5">lemma 2.5</a></span>. Then the matrix \(T_k(\hat B)\) solves the least norm problem </p>
<div class="equation" id="3.11">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.11} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \|  A - H\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {B}\;  {\hbox{\rm and \  }}{\rm rank}(H) \le k, \end{aligned} \end{equation}
  </div>
  <span class="equation_label">3.11</span>
</p>
</div>
<p> which leads to the following conclusions: </p>

<p><span class="rm">a)</span>  The matrix \(T_k(A\tilde X \tilde X^T)\) solves the problem </p>
<div class="equation" id="3.12">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.12} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \|  A - H\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {X}\;  {\hbox{\rm and \  }}{\rm rank}(H) \le k. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">3.12</span>
</p>
</div>

<p><span class="rm">b)</span>  The matrix \(T_k(\tilde Y \tilde Y^TA)\) solves the problem </p>
<div class="equation" id="3.13">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.13} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \|  A - H\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {Y}\;  {\hbox{\rm and \  }}{\rm rank}(H) \le k. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">3.13</span>
</p>
</div>

<p><span class="rm">c)</span>  The matrix \(T_k(\tilde Y \tilde Y^TA \tilde X \tilde X^T)\) solves the problem </p>
<div class="equation" id="3.14">
<p>
  <div class="equation_content">
    \begin{equation} \label{3.14} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \|  A - H\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {Z}\;  {\hbox{\rm and \  }}{\rm rank}(H) \le k. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">3.14</span>
</p>
</div>

  </div>
</div> <div class="proof_wrapper" id="a0000000027">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> From Corollary <a href="#co.6">2.6</a> we see that problem (<a href="#3.11">3.11</a>) can be replaced with problem (<a href="#3.10">3.10</a>), using \(\hat B \) instead of \(B\).<div class="proof_wrapper" id="a0000000028">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> </p>
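<p>Part a) of the last theorem can be checked numerically: project \(A\) onto the subspace \(\mathbb {X}\), truncate, and compare against rank-\(k\) competitors that also lie in \(\mathbb {X}\). The sketch below is an added NumPy illustration, not part of the paper; sizes and seed are arbitrary.</p>

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, p, k = 6, 5, 2, 2

Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Xhat, Xtilde = Q[:, :p], Q[:, p:]
A = rng.standard_normal((m, n))

# Theorem 3.3 a): project A onto the subspace X, then truncate.
Bhat = A @ Xtilde @ Xtilde.T                    # orthogonal projection onto X
U, s, Vt = np.linalg.svd(Bhat, full_matrices=False)
H_star = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]     # T_k(A Xtilde Xtilde^T)

assert np.allclose(H_star @ Xhat, 0)            # feasibility: H_star lies in X
assert np.linalg.matrix_rank(H_star) <= k

# Rank-k competitors of the form C Xtilde^T (hence in X) do no better.
best = np.linalg.norm(A - H_star, 'fro')
for _ in range(100):
    C = rng.standard_normal((m, k)) @ rng.standard_normal((k, n - p))
    assert np.linalg.norm(A - C @ Xtilde.T, 'fro') >= best - 1e-12
```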
<h1 id="a0000000029">4 Symmetric matrices over a subspace</h1>
<p>In this section we consider matrix approximations over the subspace \(\mathbb {S}\), which is defined in (<a href="#1.4">1.4</a>). As before, there is no loss of generality in replacing \(X\) with an \(n\times p\) orthonormal matrix, \(\hat X\), such that \(\hbox{\rm Range}(\hat X) = \hbox{\rm Range}(X)\). This convention enables us to present \(\mathbb {S}\) in the form: </p>
<div class="equation" id="4.1">
<p>
  <div class="equation_content">
    \begin{equation} \label{4.1} \mathbb {S}= \{  \,  S \,  \big| \,  S \in \mathbb {S}^n\;  \,  \; {\hbox{\rm and \  }}\;  S\hat X = O\} ,\end{equation}
  </div>
  <span class="equation_label">4.1</span>
</p>
</div>
<p>where \(\mathbb {S}^n\) denotes the set of all real symmetric matrices of order \(n\). That is: </p>
<div class="equation" id="4.2">
<p>
  <div class="equation_content">
    \begin{equation} \label{4.2} \mathbb {S}^n = \{  S \,  \big| \,  S \in \mathbb {R}^{n\times n} \;  \,  {\hbox{\rm and \  }}\;  S = S^T\} .\end{equation}
  </div>
  <span class="equation_label">4.2</span>
</p>
</div>
<p>Let the \(n\times (n-p)\) matrix \(\tilde X\) be obtained from \(\hat X\) as in Section 2, satisfying <a href="#2.4" class="eqref">2.4</a>–<a href="#2.7" class="eqref">2.7</a>. Then, as we have seen, \(\mathbb {S}\) has an equivalent presentation of the form </p>
<div class="equation" id="4.3">
<p>
  <div class="equation_content">
    \begin{equation} \label{4.3} \mathbb {S}= \{  \tilde X R \tilde X^T\,  \big|\,  R\in \mathbb {S}^{n-p} \} .\end{equation}
  </div>
  <span class="equation_label">4.3</span>
</p>
</div>
<p>Moreover, following the proofs of <a href="#th.4">theorem 2.4</a> and <a href="#lm.5">lemma 2.5</a> we obtain the following results. <div class="thm_thmwrapper " id="th.10">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">4.1</span>
  </div>
  <div class="thm_thmcontent">
  <p> Let \(S\) be some matrix in \(\mathbb {S}^n\) and consider the problem </p>
<div class="equation" id="4.4">
<p>
  <div class="equation_content">
    \begin{equation} \label{4.4} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \|  S - H\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {S}. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">4.4</span>
</p>
</div>
<p> Then the matrix </p>
<div class="equation" id="4.5">
<p>
  <div class="equation_content">
    \begin{equation} \label{4.5} \hat S = \tilde X\tilde X^TS\tilde X\tilde X^T = (I-\hat X\hat X^T) S (I-\hat X\hat X^T)\end{equation}
  </div>
  <span class="equation_label">4.5</span>
</p>
</div>
<p>solves this problem. In other words, \(\hat S\) is the orthogonal projection <span class="mbox" style="width: ">of \(S\) onto \(\mathbb {S}\).</span> </p>

  </div>
</div> </p>
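<p>The projection formula (4.5) is easy to verify numerically. The following sketch (an added NumPy illustration, not from the paper; all sizes are arbitrary) projects a symmetric matrix onto \(\mathbb {S}\) and confirms that no member of \(\mathbb {S}\) is closer.</p>

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 5, 2

Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Xhat, Xtilde = Q[:, :p], Q[:, p:]

M = rng.standard_normal((n, n))
S = (M + M.T) / 2                        # an arbitrary symmetric matrix

# Eq. (4.5): orthogonal projection of S onto the subspace S.
P = np.eye(n) - Xhat @ Xhat.T            # equals Xtilde Xtilde^T
Shat = P @ S @ P

assert np.allclose(Shat, Shat.T)         # symmetric
assert np.allclose(Shat @ Xhat, 0)       # annihilates Xhat

# Optimality via (4.7): any H = Xtilde R Xtilde^T is farther from S.
best = np.linalg.norm(S - Shat, 'fro')
for _ in range(100):
    R = rng.standard_normal((n - p, n - p))
    H = Xtilde @ ((R + R.T) / 2) @ Xtilde.T
    assert np.linalg.norm(S - H, 'fro') >= best - 1e-12
```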
<p><div class="cor_thmwrapper " id="co.11">
  <div class="cor_thmheading">
    <span class="cor_thmcaption">
    Corollary
    </span>
    <span class="cor_thmlabel">4.2</span>
  </div>
  <div class="cor_thmcontent">
  <p> The matrix \(S - \hat S\) is orthogonal to any matrix \(H\in \mathbb {S}\). That is, </p>
<div class="equation" id="4.6">
<p>
  <div class="equation_content">
    \begin{equation} \label{4.6} \langle S - \hat S, H\rangle = 0 \;  \;  \,  \forall H \in \mathbb {S}.\end{equation}
  </div>
  <span class="equation_label">4.6</span>
</p>
</div>
<p>Consequently the equality </p>
<div class="equation" id="4.7">
<p>
  <div class="equation_content">
    \begin{equation} \label{4.7} \|  S - H \| ^2_F = \|  S - \hat S\| ^2_F + \|  \hat S - H \| ^2_F\end{equation}
  </div>
  <span class="equation_label">4.7</span>
</p>
</div>
<p>holds for any matrix \(H \in \mathbb {S}\). </p>

  </div>
</div> </p>
<p>The last equality enables us to compute low-rank approximations over \(\mathbb {S}\). Following the proofs of the preceding theorems we obtain analogous results for \(\mathbb {S}\). </p>
<p><div class="thm_thmwrapper " id="a0000000030">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">4.3</span>
  </div>
  <div class="thm_thmcontent">
  <p>Let \(S\) and \(\hat S\) be as above, and let \(T_k(\hat S)\) be a rank-\(k\) truncated SVD of \(\hat S\). Then \(T_k(\hat S) \in \mathbb {S}\) and this matrix solves the least norm problem </p>
<div class="equation" id="4.8">
<p>
  <div class="equation_content">
    \begin{equation} \label{4.8} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \|  S - H\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {S}\; \,  {\hbox{\rm and \  }}{\rm rank}(H) \le k. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">4.8</span>
</p>
</div>

  </div>
</div> </p>
<p>The results of this section can be extended by replacing \(S\) with an arbitrary matrix \(A\in \mathbb {R}^{n\times n}\). In this case problem <a href="#4.4" class="eqref">4.4</a> takes the form </p>
<div class="equation" id="4.9">
<p>
  <div class="equation_content">
    \begin{equation} \label{4.9} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \|  A - H\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {S}, \end{aligned} \end{equation}
  </div>
  <span class="equation_label">4.9</span>
</p>
</div>
<p> where \(A\) is a given matrix in \(\mathbb {R}^{n\times n}\). The solution of the last problem is based on the following well known observations. Recall that \(A\) has a unique Cartesian decomposition of the form \(A = S+T\) where \(S = (A+A^T)/2\) is symmetric and \(T = (A - A^T)/2\) is skew-symmetric. It is also easy to verify that the equality </p>
<div class="equation" id="4.10">
<p>
  <div class="equation_content">
    \begin{equation} \label{4.10} \|  S + T\| ^2_F = \|  S \| ^2_F + \|  T \| ^2_F\end{equation}
  </div>
  <span class="equation_label">4.10</span>
</p>
</div>
<p>holds for any sum of a symmetric matrix, \(S\), plus a skew-symmetric matrix, \(T\). Hence for any symmetric matrix, \(H \in \mathbb {S}^n\), we have the equality </p>
<div class="equation" id="4.11">
<p>
  <div class="equation_content">
    \begin{equation} \label{4.11} \|  A - H \| ^2_F = \|  S-H +T\| ^2_F = \|  S - H \| ^2_F + \|  T \| ^2_F.\end{equation}
  </div>
  <span class="equation_label">4.11</span>
</p>
</div>
<p>Therefore a solution for <a href="#4.4" class="eqref">4.4</a> provides a solution for <a href="#4.9" class="eqref">4.9</a>. </p>
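<p>The Cartesian decomposition and the identities (4.10)–(4.11) can be sketched numerically as follows (an added NumPy illustration, not part of the paper; the dimension is arbitrary).</p>

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
A = rng.standard_normal((n, n))

# Cartesian decomposition: S symmetric, T skew-symmetric, A = S + T.
S = (A + A.T) / 2
T = (A - A.T) / 2
assert np.allclose(S + T, A) and np.allclose(T, -T.T)

# Eq. (4.10): the two parts are orthogonal in the Frobenius inner product.
assert np.isclose(np.linalg.norm(A, 'fro')**2,
                  np.linalg.norm(S, 'fro')**2 + np.linalg.norm(T, 'fro')**2)

# Eq. (4.11) for an arbitrary symmetric H.
H = rng.standard_normal((n, n))
H = (H + H.T) / 2
assert np.isclose(np.linalg.norm(A - H, 'fro')**2,
                  np.linalg.norm(S - H, 'fro')**2 + np.linalg.norm(T, 'fro')**2)
```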
<h1 id="a0000000031">5 Low-rank positive approximants</h1>
<p>In this section we derive explicit solutions for problems <a href="#1.9" class="eqref">1.9</a> and <a href="#1.10" class="eqref">1.10</a>. We start by introducing the tools for solving these problems. Let \(\hat S \in \mathbb {S}^n\) be a given symmetric matrix, and let \(r\) denote its rank. Then \(\hat S\) has a “compact” spectral decomposition of the form </p>
<div class="equation" id="5.1">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.1} \hat S = VDV^T,\end{equation}
  </div>
  <span class="equation_label">5.1</span>
</p>
</div>
<p>where \(V \in \mathbb {R}^{n\times r}\) has orthonormal columns, and </p>
<div class="displaymath" id="a0000000032">
  \[  D = \hbox{\rm diag}\{  \lambda _1,\dots ,\lambda _r\}   \]
</div>
<p> is a diagonal matrix. The diagonal entries of \(D\) are the non-zero eigenvalues of \(\hat S\). It is assumed for simplicity that these eigenvalues are sorted in decreasing order. That is, </p>
<div class="equation" id="5.2">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.2}\lambda _1\ge \lambda _2\ge \dots \ge \lambda _r.\end{equation}
  </div>
  <span class="equation_label">5.2</span>
</p>
</div>
<p>In addition to \(r\) we use a non-negative integer, \(\ell \), that counts the number of positive eigenvalues of \(\hat S\). The definition of \(\ell \) implies that \(0 \le \ell \le r\), and if \(1 \le \ell {\lt} r\) then the non-zero eigenvalues satisfy </p>
<div class="equation" id="5.3">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.3} \lambda _j > 0 \;  \;  \hbox{ for\  } \;  j = 1, \dots ,\ell \;  \,  {\hbox{\rm and \  }}\;  \,  \lambda _j < 0 \;  \;  \;  \;  {\hbox{\rm for}} \;  \;  \;  \;  j = \ell + 1, \dots , r.\end{equation}
  </div>
  <span class="equation_label">5.3</span>
</p>
</div>
<p>Moreover, if \(1 {\lt} \ell {\lt} r \) then <a href="#5.2" class="eqref">5.2</a> and <a href="#5.3" class="eqref">5.3</a> imply </p>
<div class="equation" id="5.4">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.4} \lambda _1 \ge \dots \ge \lambda _\ell > 0 > \lambda _{\ell + 1} \ge \dots \ge \lambda _r.\end{equation}
  </div>
  <span class="equation_label">5.4</span>
</p>
</div>
<p>Recall also that <a href="#5.1" class="eqref">5.1</a> can be rewritten in the form </p>
<div class="equation" id="5.5">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.5} \hat S = \sum \limits ^r_{j = 1} \lambda _j {\boldsymbol {v}}_j{\boldsymbol {v}}^T_j,\end{equation}
  </div>
  <span class="equation_label">5.5</span>
</p>
</div>
<p>where \({\boldsymbol {v}}_1,\dots , {\boldsymbol {v}}_r\) are the columns of \(V\). These notations enable us to split \(\hat S\) into its “positive” part and its “negative” part. Let the matrix </p>
<div class="equation" id="5.6">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.6} P (\hat S) = \sum \limits ^\ell _{i = 1} \lambda _i {\boldsymbol {v}}_i {\boldsymbol {v}}^T_i\end{equation}
  </div>
  <span class="equation_label">5.6</span>
</p>
</div>
<p>denote the positive definite part of \(\hat S\), and let the matrix </p>
<div class="equation" id="5.7">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.7} N(\hat S) = \sum \limits ^r_{j = \ell + 1} \lambda _j{\boldsymbol {v}}_j{\boldsymbol {v}}^T_j\end{equation}
  </div>
  <span class="equation_label">5.7</span>
</p>
</div>
<p>denote its negative definite part. Then, clearly, </p>
<div class="equation" id="5.8">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.8} \hat S = P(\hat S) + N(\hat S).\end{equation}
  </div>
  <span class="equation_label">5.8</span>
</p>
</div>
<p>This decomposition is sometimes called the Jordan decomposition of \(\hat S\). (If \(\ell = 0\) then \(P (\hat S) = O\) and \(N(\hat S) = \hat S\). Similarly, if \(\ell = r \) then \(P(\hat S) = \hat S\) and \(N(\hat S) = O\).) Another useful matrix operator is </p>
<div class="equation" id="5.9">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.9} P_k (\hat S) = \sum \limits ^\nu _{i = 1} \lambda _i{\boldsymbol {v}}_i{\boldsymbol {v}}^T_i\end{equation}
  </div>
  <span class="equation_label">5.9</span>
</p>
</div>
<p>where \(k\) is a positive integer and \(\nu = \min \{  k, \ell \} \). (If \(\ell = 0\) then \(P_k (\hat S) = O\).) Note also that </p>
<div class="equation" id="5.10">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.10} P_k (\hat S ) = T_k (P(\hat S))\end{equation}
  </div>
  <span class="equation_label">5.10</span>
</p>
</div>
<p>where, as before, \(T_k(\cdot )\) is the rank-\(k\) truncated SVD operator. The importance of \(P(\hat S)\) and \(P_k(\hat S)\) emerges from the following observations. As before, \(\|  \cdot \| \) denotes a unitarily invariant norm on \(\mathbb {R}^{n\times n}\), and \(H\ge 0 \) means that \(H\) is a symmetric positive semidefinite matrix. </p>
<p><div class="thm_thmwrapper " id="th.13">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">5.1</span>
  </div>
  <div class="thm_thmcontent">
  <p> Let \(\hat S\) be some matrix in \(\mathbb {S}^n\). Then \(P(\hat S)\) solves the problem </p>
<div class="equation" id="5.11">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.11} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  \rho (H) = \|  \hat S - H\| \\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \ge 0. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">5.11</span>
</p>
</div>

  </div>
</div> </p>
<p><div class="thm_thmwrapper " id="th.14">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">5.2</span>
  </div>
  <div class="thm_thmcontent">
  <p> Let \(\hat S\) be some matrix in \(\mathbb {S}^n\). Then \(P_k(\hat S)\) solves the problem </p>
<div class="equation" id="5.12">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.12} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  \rho (H) = \| \hat S - H\| \\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \ge 0 \;  \;  \hbox{and} \;  \;  {\rm rank}(H) \le k. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">5.12</span>
</p>
</div>

  </div>
</div> </p>
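<p>These two theorems can be illustrated numerically via the Jordan decomposition (5.6)–(5.9). The sketch below (an added NumPy illustration, not part of the paper; sizes and seed are arbitrary) uses the Frobenius norm, for which the optimality of \(P(\hat S)\) over the positive semidefinite cone is easily tested.</p>

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 5, 2

M = rng.standard_normal((n, n))
Shat = (M + M.T) / 2                        # a symmetric test matrix

# Spectral decomposition with eigenvalues sorted decreasingly, as in (5.2).
lam, V = np.linalg.eigh(Shat)
lam, V = lam[::-1], V[:, ::-1]

# Jordan decomposition (5.6)-(5.8): positive and negative parts.
P_S = (V[:, lam > 0] * lam[lam > 0]) @ V[:, lam > 0].T
N_S = (V[:, lam < 0] * lam[lam < 0]) @ V[:, lam < 0].T
assert np.allclose(P_S + N_S, Shat)

# P_k(Shat), eq. (5.9): keep at most k of the positive eigenvalues.
nu = min(k, int(np.sum(lam > 0)))
P_k = (V[:, :nu] * lam[:nu]) @ V[:, :nu].T
assert np.linalg.matrix_rank(P_k) <= k
assert np.all(np.linalg.eigvalsh(P_k) >= -1e-10)    # positive semidefinite

# Theorem 5.1 (Frobenius case): PSD competitors are never closer than P(Shat).
best = np.linalg.norm(Shat - P_S, 'fro')
for _ in range(100):
    C = rng.standard_normal((n, n))
    assert np.linalg.norm(Shat - C @ C.T, 'fro') >= best - 1e-12
```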
<p>A matrix that solves <a href="#5.11" class="eqref">5.11</a> is called a “<b class="bf">positive approximant</b>”, <i class="it">e.g.</i>, <span class="cite">
	[
	<a href="#7" >7</a>
	, 
	<a href="#13" >13</a>
	, 
	<a href="#19" >19</a>
	, 
	<a href="#21" >21</a>
	, 
	<a href="#22" >22</a>
	, 
	<a href="#28" >28</a>
	]
</span>. The current interest in this problem was initiated by Halmos’ paper <span class="cite">
	[
	<a href="#19" >19</a>
	]
</span>, which considers the solution of <a href="#5.11" class="eqref">5.11</a> in the spectral norm. Rogers and Ward <span class="cite">
	[
	<a href="#28" >28</a>
	]
</span> consider the Schatten \(p\)-norm, Ando <span class="cite">
	[
	<a href="#2" >2</a>
	]
</span> considers the trace norm, and Higham <span class="cite">
	[
	<a href="#21" >21</a>
	]
</span> solves <a href="#5.11" class="eqref">5.11</a> in the Frobenius norm. Later the solution was extended to any unitarily invariant norm by Bhatia and Kittaneh <span class="cite">
	[
	<a href="#5" >5</a>
	]
</span>. See <span class="cite">
	[
	<a href="#4" >4</a>
	, 
	p.
	
	277
	]
</span> and <span class="cite">
	[
	<a href="#13" >13</a>
	]
</span> for alternative proofs. Several results on this topic were obtained in the context of linear operators on a Hilbert space, <i class="it">e.g.</i>, <span class="cite">
	[
	<a href="#4" >4</a>
	, 
	<a href="#5" >5</a>
	, 
	<a href="#7" >7</a>
	, 
	<a href="#19" >19</a>
	, 
	<a href="#28" >28</a>
	]
</span>. The term “<b class="bf">low-rank positive approximants</b>” refers to matrices that solve <a href="#5.12" class="eqref">5.12</a>. The last problem was first solved by Mathar <span class="cite">
	[
	<a href="#25" >25</a>
	]
</span> for Schatten-\(p\) norms, and recently by Dax <span class="cite">
	[
	<a href="#13" >13</a>
	]
</span> for every unitarily invariant norm. The next assertion enables us to apply these results for solving <a href="#1.9" class="eqref">1.9</a> and <a href="#1.10" class="eqref">1.10</a>. </p>
<p><div class="thm_thmwrapper " id="th.15">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">5.3</span>
  </div>
  <div class="thm_thmcontent">
  <p> Let the matrix \(\hat S \in \mathbb {S}^n\) have a compact spectral decomposition of the form <a href="#5.1" class="eqref">5.1</a>. If \(\hat S \in \mathbb {S}\) then </p>
<div class="equation" id="5.13">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.13} V^T\hat X = 0.\end{equation}
  </div>
  <span class="equation_label">5.13</span>
</p>
</div>
<p>In other words, if \({\boldsymbol {v}}\) is an eigenvector of \(\hat S\) that corresponds to a non-zero eigenvalue, then \({\boldsymbol {v}}^T\hat X = {\boldsymbol {o}}\). </p>

  </div>
</div> <div class="proof_wrapper" id="a0000000033">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div>Since \(\hat S \in \mathbb {S}\) we have the equality </p>
<div class="equation" id="5.14">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.14} \hat S\hat X = VDV^T\hat X = O.\end{equation}
  </div>
  <span class="equation_label">5.14</span>
</p>
</div>
<p>Let \({\boldsymbol {w}}_j, \;  j = 1, \dots , p\), denote the \(j\)th column of the \(r\times p\) matrix \(W = V^T\hat X\). Then an equivalent way to write <a href="#5.14" class="eqref">5.14</a> is </p>
<div class="equation" id="5.15">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.15} VD{\boldsymbol {w}}_j = {\boldsymbol {o}}\;  \,  \;  \;  \;  {\hbox{\rm for}} \;  \;  \;  \;  \,  j = 1,\dots , p.\end{equation}
  </div>
  <span class="equation_label">5.15</span>
</p>
</div>
<p>Therefore, since the matrix \(VD\) has full column rank, </p>
<div class="equation" id="5.16">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.16}{\boldsymbol {w}}_j = {\boldsymbol {o}}\;  \,  \;  \;  \;  {\hbox{\rm for}} \;  \;  \;  \;  \,  j = 1,\dots , p. \end{equation}
  </div>
  <span class="equation_label">5.16</span>
</p>
</div>
<p><a href="#th.15">Theorem 5.3</a> allows the use of the preceding results on \(\mathbb {S}\). This gives the following stronger results. </p>
<p><div class="thm_thmwrapper " id="th.16">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">5.4</span>
  </div>
  <div class="thm_thmcontent">
  <p> If \(\hat S \in \mathbb {S}\) then \(P(\hat S) \in \mathbb {S}_+\), where \(\mathbb {S}_+\) is defined in <a href="#1.7" class="eqref">1.7</a>. Consequently \(P(\hat S)\) solves the least norm problem </p>
<div class="equation" id="5.17">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.17} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  \rho (H) = \| \hat S - H\| \\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {S}_+. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">5.17</span>
</p>
</div>

  </div>
</div> \(\square \) </p>
<p><div class="thm_thmwrapper " id="th.17">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">5.5</span>
  </div>
  <div class="thm_thmcontent">
  <p> If \(\hat S \in \mathbb {S}\) then \(P_k (\hat S) \in \mathbb {S}_+\). Consequently \(P_k(\hat S)\) solves the least norm problem </p>
<div class="equation" id="5.18">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.18} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  \rho (H) = \| \hat S - H\| \\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {S}_+ \;  \,  {\hbox{\rm and \  }}\;  \,  {\rm rank}(H) \le k. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">5.18</span>
</p>
</div>

  </div>
</div> </p>
<p>When using the Frobenius norm it is possible to extend these results to any matrix \(A \in \mathbb {R}^{n\times n}\). </p>
<p><div class="thm_thmwrapper " id="th.18">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">5.6</span>
  </div>
  <div class="thm_thmcontent">
  <p> Let \(A\) be some matrix in \(\mathbb {R}^{n \times n}\). Define \(S = (A + A^T)/2\) and let \(\hat S\) be obtained from \(S\) by the rule <a href="#4.5" class="eqref">4.5</a>. Then \(P (\hat S)\) solves the problem </p>
<div class="equation" id="5.19">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.19} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \|  A - H\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {S}_+, \end{aligned} \end{equation}
  </div>
  <span class="equation_label">5.19</span>
</p>
</div>
<p> while \(P_k(\hat S)\) solves the problem </p>
<div class="equation" id="5.20">
<p>
  <div class="equation_content">
    \begin{equation} \label{5.20} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(H) = \| A - H\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  H \in \mathbb {S}_+ \;  \,  {\hbox{\rm and \  }}\;  \,  {\rm rank}(H) \le k. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">5.20</span>
</p>
</div>

  </div>
</div> <div class="proof_wrapper" id="a0000000034">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> The proof is concluded by combining the preceding theorems with equalities <a href="#4.7" class="eqref">4.7</a> and <a href="#4.11" class="eqref">4.11</a>. <div class="proof_wrapper" id="a0000000035">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
    <span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> </p>
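<p>The construction of Theorem 5.6 (symmetrize, project, then take the positive part) can be sketched numerically as follows. This is an added NumPy illustration, not part of the paper; all sizes are arbitrary.</p>

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 5, 2

Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Xhat, Xtilde = Q[:, :p], Q[:, p:]
A = rng.standard_normal((n, n))

# Symmetric part of A, then projection onto the subspace S via (4.5).
S = (A + A.T) / 2
Pr = np.eye(n) - Xhat @ Xhat.T
Shat = Pr @ S @ Pr

# P(Shat): the claimed solution of (5.19).
lam, V = np.linalg.eigh(Shat)
pos = lam > 1e-12
H_star = (V[:, pos] * lam[pos]) @ V[:, pos].T

assert np.allclose(H_star, H_star.T)
assert np.allclose(H_star @ Xhat, 0)                 # H_star lies in S
assert np.all(np.linalg.eigvalsh(H_star) >= -1e-10)  # H_star >= 0, hence in S_+

# Competitors in S_+ (PSD and annihilating Xhat) do no better.
best = np.linalg.norm(A - H_star, 'fro')
for _ in range(100):
    C = rng.standard_normal((n - p, n - p))
    H = Xtilde @ (C @ C.T) @ Xtilde.T
    assert np.linalg.norm(A - H, 'fro') >= best - 1e-12
```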
<h1 id="a0000000036">6 Polar cones and polar decompositions</h1>
<p>Another feature that distinguishes \(\mathbb {S}\) is that the subsets \(\mathbb {S}_+\) and \(\mathbb {S}_-\) constitute a polar decomposition of \(\mathbb {S}\). To clarify this statement we give a brief overview of the necessary background. </p>
<p>Let \(\mathbb {H}\) be a real Hilbert space with a scalar product, \(\langle {\boldsymbol {u}}, {\boldsymbol {v}}\rangle \), and related norm \(\|  {\boldsymbol {u}}\|  = (\langle {\boldsymbol {u}}, {\boldsymbol {u}}\rangle )^{1/2}\). Recall that a subset \(\mathbb {K}\) of \(\mathbb {H}\) is called a “convex cone” if it has the following property: Let \({\boldsymbol {u}}\) and \({\boldsymbol {v}}\) be two points in \(\mathbb {K}\) and let \(\alpha \) and \(\beta \) be two nonnegative scalars. Then the point \(\alpha {\boldsymbol {u}}+ \beta {\boldsymbol {v}}\) belongs to \(\mathbb {K}\). Given a convex cone, \(\mathbb {K}\), the set </p>
<div class="equation" id="6.1">
<p>
  <div class="equation_content">
    \begin{equation} \label{6.1} \mathbb {K}^* = \{  {\boldsymbol {u}}\,  \big| \,  {\boldsymbol {u}}\in \mathbb {H}\;  \, \;  {\hbox{\rm and \  }}\;  \,  \langle {\boldsymbol {u}}, {\boldsymbol {v}}\rangle \le 0\;  \;  \,  \forall {\boldsymbol {v}}\in \mathbb {K}\} \end{equation}
  </div>
  <span class="equation_label">6.1</span>
</p>
</div>
<p>is called the polar cone of \(\mathbb {K}\). It is well known that \(\mathbb {K}^*\) is a closed convex cone, and that \((\mathbb {K}^*)^*\) is the closure of \(\mathbb {K}\), <i class="it">e.g.</i>, <span class="cite">
	[
	<a href="#3" >3</a>
	]
</span>. Thus, if \(\mathbb {K}\) is a closed convex cone then \((\mathbb {K}^*)^* = \mathbb {K}\). An important feature that characterizes polar cones is <b class="bf">Moreau’s Polar Decomposition</b> <span class="cite">
	[
	<a href="#27" >27</a>
	]
</span>: Let \(\mathbb {K}\) be a closed convex cone in \(\mathbb {H}\) and let \(\mathbb {K}^*\) be its polar cone. Then any vector \({\boldsymbol {w}}\in \mathbb {H}\) has a unique polar decomposition of the form </p>
<div class="equation" id="6.2">
<p>
  <div class="equation_content">
    \begin{equation} \label{6.2} {\boldsymbol {w}}= {\boldsymbol {u}}+ {\boldsymbol {v}},\;  \;  {\boldsymbol {u}}\in \mathbb {K}, \;  \;  {\boldsymbol {v}}\in \mathbb {K}^*, \;  \;  {\hbox{\rm and \  }}\;  \langle {\boldsymbol {u}}, {\boldsymbol {v}}\rangle = 0.\end{equation}
  </div>
  <span class="equation_label">6.2</span>
</p>
</div>
<p>Moreover, \({\boldsymbol {u}}\) is the projection of \({\boldsymbol {w}}\) onto \(\mathbb {K}\), and \({\boldsymbol {v}}\) is the projection of \({\boldsymbol {w}}\) onto \(\mathbb {K}^*\). (The term “projection" means that these vectors solve the related least norm problems.) The above relations are often summarized by saying that \(\mathbb {K}\) and \(\mathbb {K}^*\) constitute a polar decomposition on \(\mathbb {H}\). </p>
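<p>As a concrete illustration (an added sketch, not part of the original text), consider Moreau’s decomposition in \(\mathbb {H}= \mathbb {R}^n\) with \(\mathbb {K}\) the nonnegative orthant; its polar cone is the nonpositive orthant, and the two projections are the componentwise positive and negative parts of \({\boldsymbol {w}}\):</p>

```python
import numpy as np

# Moreau's polar decomposition in H = R^n with K the nonnegative orthant.
# The polar cone K* is then the nonpositive orthant, and the projections
# onto K and K* are the componentwise positive and negative parts of w.
def moreau_decompose(w):
    u = np.maximum(w, 0.0)   # projection of w onto K
    v = np.minimum(w, 0.0)   # projection of w onto K*
    return u, v

w = np.array([3.0, -1.5, 0.0, 2.0, -4.0])
u, v = moreau_decompose(w)
# w = u + v and <u, v> = 0, as Moreau's theorem guarantees
```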
<p>Below we will show that \((\mathbb {S}_+)^* = \mathbb {S}_-\) and \((\mathbb {S}_-)^* = \mathbb {S}_+\). For this purpose we prove the following lemma. </p>
<p><div class="lem_thmwrapper " id="lm.19">
  <div class="lem_thmheading">
    <span class="lem_thmcaption">
    Lemma
    </span>
    <span class="lem_thmlabel">6.1</span>
  </div>
  <div class="lem_thmcontent">
<p> Let \(\mathbb {H}, \mathbb {K}\) and \(\mathbb {K}^*\) be as above, and let the convex cone \(\tilde{\mathbb {K}}\) be contained in \(\mathbb {K}^*\). That is, \(\langle {\boldsymbol {u}}, {\boldsymbol {v}}\rangle \le 0\) whenever \({\boldsymbol {u}}\in \mathbb {K}\) and \({\boldsymbol {v}}\in \tilde{\mathbb {K}}\). If every vector \({\boldsymbol {w}}\in \mathbb {H}\) satisfies </p>
<div class="equation" id="6.3">
<p>
  <div class="equation_content">
    \begin{equation} \label{6.3} {\boldsymbol {w}}= {\boldsymbol {u}}+ {\boldsymbol {v}}, \;  \;  {\boldsymbol {u}}\in \mathbb {K}, \;  \;  {\boldsymbol {v}}\in \tilde{\mathbb {K}}, \;  \;  {\hbox{\rm and \  }}\;  \,  \langle {\boldsymbol {u}}, {\boldsymbol {v}}\rangle = 0,\end{equation}
  </div>
  <span class="equation_label">6.3</span>
</p>
</div>
<p>then \(\tilde{\mathbb {K}}=\mathbb {K}^*\). </p>

  </div>
</div> <div class="proof_wrapper" id="a0000000037">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
<span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> The proof is by contradiction. Assume for a moment the existence of a vector \({\boldsymbol {w}}\in \mathbb {K}^*\) such that \({\boldsymbol {w}}\notin \tilde{\mathbb {K}}\). In this case \({\boldsymbol {w}}\) satisfies <a href="#6.3" class="eqref">6.3</a> with \({\boldsymbol {u}}\neq 0\), since otherwise \({\boldsymbol {w}}= {\boldsymbol {v}}\in \tilde{\mathbb {K}}\). This implies \(\langle {\boldsymbol {u}}, {\boldsymbol {w}}\rangle = \langle {\boldsymbol {u}}, {\boldsymbol {u}}\rangle + \langle {\boldsymbol {u}}, {\boldsymbol {v}}\rangle = \|  {\boldsymbol {u}}\| ^2 {\gt} 0\), which contradicts the assumption that \({\boldsymbol {w}}\in \mathbb {K}^*\). <div class="proof_wrapper" id="a0000000038">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
<span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> </p>
<p>Let us return now to consider the sets \(\mathbb {S}_+ \) and \(\mathbb {S}_-\). It is easy to verify that these sets are convex cones in \(\mathbb {S}\). Moreover, as we now show, these cones constitute a polar decomposition of \(\mathbb {S}\). The tools for proving this observation are the matrix operators \(P(\hat S)\) and \(N(\hat S)\) which were introduced in <a href="#5.6" class="eqref">5.6</a> and <a href="#5.7" class="eqref">5.7</a>. </p>
<p><div class="thm_thmwrapper " id="a0000000039">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">6.2</span>
  </div>
  <div class="thm_thmcontent">
  <p>Every matrix \(\hat S \in \mathbb {S}\) has a unique polar decomposition of the form: </p>
<div class="equation" id="6.4">
<p>
  <div class="equation_content">
    \begin{equation} \label{6.4} \hat S = P (\hat S) + N (\hat S),\;  \;  P(\hat S) \in \mathbb {S}_+, \;  \;  N(\hat S) \in \mathbb {S}_-\end{equation}
  </div>
  <span class="equation_label">6.4</span>
</p>
</div>
<p>and </p>
<div class="equation" id="6.5">
<p>
  <div class="equation_content">
    \begin{equation} \label{6.5} \langle P(\hat S), N(\hat S) \rangle = 0.\end{equation}
  </div>
  <span class="equation_label">6.5</span>
</p>
</div>
<p>Furthermore, \(\mathbb {S}_-\) is the polar cone of \(\mathbb {S}_+\) and \(\mathbb {S}_+\) is the polar cone of \(\mathbb {S}_-\). </p>

  </div>
</div> <div class="proof_wrapper" id="a0000000040">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
<span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> We have already seen that <a href="#6.4" class="eqref">6.4</a> holds. The proof of <a href="#6.5" class="eqref">6.5</a> is based on the following equalities: </p>
<div class="displaymath" id="6.6">
  \begin{align} \label{6.6} \langle P(\hat S), N(\hat S) \rangle & = \Big\langle \sum \limits ^\ell _{i = 1} \lambda _i{\boldsymbol {v}}_i{\boldsymbol {v}}^T_i,\;  \;  \sum \limits ^r_{j = \ell + 1} \lambda _j{\boldsymbol {v}}_j{\boldsymbol {v}}_j^T\Big\rangle \nonumber \\ & = \sum \limits ^\ell _{i = 1} \sum \limits ^r_{j = \ell + 1} \langle \lambda _i{\boldsymbol {v}}_i {\boldsymbol {v}}^T_i,\;  \;  \lambda _j{\boldsymbol {v}}_j{\boldsymbol {v}}^T_j\rangle \nonumber \\ & = \sum \limits ^\ell _{i = 1} \sum \limits ^r_{j = \ell + 1} \lambda _i\lambda _j ({\boldsymbol {v}}^T_i {\boldsymbol {v}}_j)^2 = 0, \end{align}
</div>
<p> where the last equality holds since \({\boldsymbol {v}}^T_i{\boldsymbol {v}}_j = 0\) whenever \(\lambda _i \neq \lambda _j\). </p>
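<p>The decomposition of theorem 6.2 is easy to check numerically. The following sketch (added here for illustration) assumes, per <a href="#5.6" class="eqref">5.6</a>–<a href="#5.7" class="eqref">5.7</a>, that \(P(\hat S)\) and \(N(\hat S)\) are the positive and negative spectral parts of \(\hat S\):</p>

```python
import numpy as np

# P(S) and N(S) as the positive and negative spectral parts of a
# symmetric matrix S (assuming the definitions (5.6)-(5.7) of the paper).
def spectral_parts(S):
    lam, V = np.linalg.eigh(S)               # S = V diag(lam) V^T
    P = (V * np.maximum(lam, 0.0)) @ V.T     # keep positive eigenvalues
    N = (V * np.minimum(lam, 0.0)) @ V.T     # keep negative eigenvalues
    return P, N

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
S = (B + B.T) / 2                            # a generic symmetric matrix
P, N = spectral_parts(S)
# S = P + N with <P, N> = trace(PN) = 0, matching (6.4)-(6.5)
```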
<p>Next we show that \(\mathbb {S}_-\) is contained in \((\mathbb {S}_+)^*\), and that \(\mathbb {S}_+\) is contained in \((\mathbb {S}_-)^*\). Let \(H_+\) be any matrix in \(\mathbb {S}_+\), and let \(H_-\) be any matrix in \(\mathbb {S}_-\). Then it is sufficient to show that </p>
<div class="equation" id="6.7">
<p>
  <div class="equation_content">
    \begin{equation} \label{6.7} \langle H_+, H_- \rangle \le 0.\end{equation}
  </div>
  <span class="equation_label">6.7</span>
</p>
</div>
<p>Let \(\eta _1, \dots , \eta _t\) denote the nonzero eigenvalues of \(H_-\), and let \({\boldsymbol {h}}_1, \dots , {\boldsymbol {h}}_t\) denote the corresponding eigenvectors of \(H_-\). That is: </p>
<div class="equation" id="6.8">
<p>
  <div class="equation_content">
    \begin{equation} \label{6.8} H_- = \sum \limits ^t_{j = 1} \eta _j {\boldsymbol {h}}_j {\boldsymbol {h}}^T_j\end{equation}
  </div>
  <span class="equation_label">6.8</span>
</p>
</div>
<p>where </p>
<div class="equation" id="6.9">
<p>
  <div class="equation_content">
    \begin{equation} \label{6.9} \eta _j < 0 \;  \;  \hbox{for\  } j = 1,\dots , t.\end{equation}
  </div>
  <span class="equation_label">6.9</span>
</p>
</div>
<p>From this representation we see that: </p>
<div class="displaymath" id="6.10">
  \begin{align} \label{6.10} \langle H_+, H_-\rangle = & \langle H_+, \sum \limits ^t_{j = 1} \eta _j{\boldsymbol {h}}_j{\boldsymbol {h}}^T_j \rangle \nonumber \\ = & \sum \limits ^t_{j=1} \eta _j \langle H_+, {\boldsymbol {h}}_j{\boldsymbol {h}}^T_j\rangle = \sum \limits ^t_{j = 1} \eta _j {\boldsymbol {h}}^T_j H_+ {\boldsymbol {h}}_j \le 0, \end{align}
</div>
<p> where the last inequality follows from <a href="#6.9" class="eqref">6.9</a> and the fact that \(H_+\) is a positive semidefinite matrix. Now <a href="#lm.19">lemma 6.1</a> implies that \(\mathbb {S}_-\) is the polar cone of \(\mathbb {S}_+\), and that \(\mathbb {S}_+ \) is the polar cone of \(\mathbb {S}_-\). <div class="proof_wrapper" id="a0000000041">
  <div class="proof_heading">
    <span class="proof_caption">
    Proof
    </span>
<span class="expand-proof">▼</span>
  </div>
  <div class="proof_content">
  
  </div>
</div> </p>
<h1 id="a0000000042">7 Principal Component Analysis (PCA) and the nearest rank-\({\protect \lowercase {k}}\) centered matrix</h1>
<p>Let \(A \in \mathbb {R}^{m\times n}, \;  m \ge n\), be a given data matrix. In this section we derive new observations about the principal component analysis (PCA) of \(A\). The first step in constructing the PCA is to “center" the columns of \(A\). Hence we shall start by explaining this concept and showing how to compute a rank-\(k\) centered matrix that is nearest to \(A\). </p>
<p>A matrix \(B \in \mathbb {R}^{m\times n}\) is called <b class="bf">row-centered</b> if \(B\hat{{\boldsymbol {e}}} = {\boldsymbol {o}}\) where \(\hat{\boldsymbol {e}}= (1,\dots ,1)^T \) \(\in \mathbb {R}^n\). Similarly, \(B\) is called <b class="bf">column-centered</b> if \(\tilde{\boldsymbol {e}}^T B = {\boldsymbol {o}}\) where \(\tilde{\boldsymbol {e}}= (1,1,\dots , 1)^T \in \mathbb {R}^m\). If \(B\) satisfies both \(B \hat{\boldsymbol {e}}= {\boldsymbol {o}}\) and \(\tilde{\boldsymbol {e}}^T B = {\boldsymbol {o}}\) then it is called <b class="bf">doubly-centered</b>. Observe that the corresponding canonical subspaces have the following forms. </p>
<div class="equation" id="7.1">
<p>
  <div class="equation_content">
    \begin{equation} \label{7.1} \mathbb {X}= \{  B\,  | \,  B \in \mathbb {R}^{m\times n} \;  \;  \;  \;  {\hbox{\rm and}} \;  \;  \; B \hat{\boldsymbol {e}}= {\boldsymbol {o}}\} ,\end{equation}
  </div>
  <span class="equation_label">7.1</span>
</p>
</div>
<div class="equation" id="7.2">
<p>
  <div class="equation_content">
    \begin{equation} \label{7.2} \mathbb {Y}= \{  B\,  | \,  B \in \mathbb {R}^{m\times n} \;  \;  \;  \;  {\hbox{\rm and}} \;  \;  \; \tilde{\boldsymbol {e}}^T B = {\boldsymbol {o}}\} ,\end{equation}
  </div>
  <span class="equation_label">7.2</span>
</p>
</div>
<div class="equation" id="7.3">
<p>
  <div class="equation_content">
    \begin{equation} \label{7.3} \mathbb {Z}= \{  B\,  | \,  B \in \mathbb {R}^{m\times n}, \;  B \hat{\boldsymbol {e}}= {\boldsymbol {o}}\;  \;  \;  {\hbox{\rm and}} \;  \;  \; \tilde{\boldsymbol {e}}^T B = {\boldsymbol {o}}\} .\end{equation}
  </div>
  <span class="equation_label">7.3</span>
</p>
</div>
<p>Now from <a href="#th.7">theorem 3.1</a> we obtain the following somewhat surprising results. </p>
<p><div class="thm_thmwrapper " id="th.21">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">7.1</span>
  </div>
  <div class="thm_thmcontent">
  <p> Let the matrix \(B \in \mathbb {R}^{m\times n}\) have the compact SVD <a href="#3.2" class="eqref">3.2</a>–<a href="#3.4" class="eqref">3.4</a>. </p>
<ul class="itemize">
  <li><p>If \(B \in \mathbb {X}\) then \(V^T \hat{\boldsymbol {e}}= {\boldsymbol {o}}\). That is, the right singular vectors are centered. </p>
</li>
  <li><p>If \(B \in \mathbb {Y}\) then \(\tilde{\boldsymbol {e}}^T U = {\boldsymbol {o}}\). That is, the left singular vectors are centered. </p>
</li>
  <li><p>If \(B \in \mathbb {Z}\) then \(\tilde{\boldsymbol {e}}^T U = {\boldsymbol {o}}\) and \(V^T\hat{\boldsymbol {e}}= {\boldsymbol {o}}\). In other words, in this case both the left singular vectors and the right singular vectors are centered. </p>
</li>
</ul>

  </div>
</div> </p>
<p>The centering of a matrix or vector is done by applying the following matrices. </p>
<div class="equation" id="7.4">
<p>
  <div class="equation_content">
    \begin{equation} \label{7.4} \tilde C = I - \tilde{\boldsymbol {e}}\tilde{\boldsymbol {e}}^T/ m \,  \in \mathbb {R}^{m\times m} \;  \;  \;  {\hbox{\rm and}} \;  \;  \; \hat C = I - \hat{\boldsymbol {e}}\hat{\boldsymbol {e}}^T/ n.\end{equation}
  </div>
  <span class="equation_label">7.4</span>
</p>
</div>
<p>With these notations at hand it is easy to verify that the matrix \(\tilde C A\) is column-centered. That is, the mean of each column equals zero. Similarly, the matrix \(A\hat C\) is row-centered and the mean of each row equals zero. Now these observations lead to the following conclusions. </p>
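<p>The action of the centering matrices in <a href="#7.4" class="eqref">7.4</a> can be sketched as follows (an added illustration, not part of the original text):</p>

```python
import numpy as np

# The centering matrices of (7.4): C_tilde removes column means,
# C_hat removes row means.
m, n = 6, 4
rng = np.random.default_rng(1)
A = rng.standard_normal((m, n))

C_tilde = np.eye(m) - np.ones((m, m)) / m   # column-centering projector
C_hat = np.eye(n) - np.ones((n, n)) / n     # row-centering projector

col_centered = C_tilde @ A   # every column mean is zero
row_centered = A @ C_hat     # every row mean is zero
```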
<p><div class="thm_thmwrapper " id="th.22">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">7.2</span>
  </div>
  <div class="thm_thmcontent">
<p> The matrix \(A\hat C\) is the orthogonal projection of \(A\) on \(\mathbb {X}\). The matrix \(\tilde CA\) is the orthogonal projection of \(A\) on \(\mathbb {Y}\). The matrix \(\tilde C A \hat C\) is the orthogonal projection of \(A\) on \(\mathbb {Z}\). Moreover, let \(B\) be some matrix in \(\mathbb {Y}\); then equality <a href="#2.26" class="eqref">2.26</a> gives </p>
<div class="equation" id="7.5">
<p>
  <div class="equation_content">
    \begin{equation} \label{7.5} \|  A - B \| ^2_F = \| A - \tilde C A\| ^2_F + \|  \tilde C A - B \| ^2_F.\end{equation}
  </div>
  <span class="equation_label">7.5</span>
</p>
</div>
<p>(Similar equalities hold in \(\mathbb {X}\) and \(\mathbb {Z}\).) </p>

  </div>
</div> </p>
<p>The norm equalities that we obtained enable us to derive the following conclusions. </p>
<p><div class="thm_thmwrapper " id="th.23">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">7.3</span>
    <span class="thm_thmtitle"><b class="bf">The nearest rank-\(k\) centered matrix</b></span>
  </div>
  <div class="thm_thmcontent">
  <p>  Let \(T_k(B)\) denote the truncated SVD operator <a href="#3.8" class="eqref">3.8</a>. </p>
<ul class="itemize">
  <li><p>The matrix \(T_k(A \hat{C})\) solves <a href="#3.12" class="eqref">3.12</a> and the right singular vectors of this matrix are centered. </p>
</li>
  <li><p>The matrix \(T_k(\tilde CA)\) solves <a href="#3.13" class="eqref">3.13</a> and the left singular vectors of this matrix are centered. </p>
</li>
  <li><p>The matrix \(T_k(\tilde CA\hat C)\) solves <a href="#3.14" class="eqref">3.14</a> and the singular vectors of this matrix are centered. </p>
</li>
</ul>

  </div>
</div> </p>
<p>Let us turn now to inspect the effect of these results on the PCA of \(A\). For a detailed description of the PCA and its properties, see <span class="cite">
	[
	<a href="#1" >1</a>
	]
</span> and the references therein. As noted in <span class="cite">
	[
	<a href="#1" >1</a>
	]
</span>, usually the data matrix, \(A\), is pre-processed before the analysis. This is done by centering the columns of \(A\). That is, \(A\) is replaced by \(\tilde CA\). Then the SVD of \(\tilde C A\) is computed. In other words, the PCA of \(A\) is the SVD of \(\tilde CA\). Now the preceding theorems lead to the following conclusions. </p>
<p><div class="thm_thmwrapper " id="th.24">
  <div class="thm_thmheading">
    <span class="thm_thmcaption">
    Theorem
    </span>
    <span class="thm_thmlabel">7.4</span>
  </div>
  <div class="thm_thmcontent">
  <p> The matrix \(\tilde CA\) is the orthogonal projection of \(A\) on the subspace </p>
<div class="equation" id="7.6">
<p>
  <div class="equation_content">
    \begin{equation} \label{7.6} \mathbb {X}= \{  B \,  | \,  B \in \mathbb {R}^{m\times n} \;  \;  \;  {\hbox{\rm and}} \;  \;  \; \tilde{\boldsymbol {e}}^T B = {\boldsymbol {o}}\} .\end{equation}
  </div>
  <span class="equation_label">7.6</span>
</p>
</div>
<p>Moreover, let \(r\) denote the rank of \(\tilde CA\). Then for each \(k,\;  k = 1, \dots , r\), the matrix \(T_k(\tilde C A)\) solves the problem </p>
<div class="equation" id="7.7">
<p>
  <div class="equation_content">
    \begin{equation} \label{7.7} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  f(B) = \|  A - B\| _F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  B \in \mathbb {X}\;  \;  \;  {\hbox{\rm and}} \;  \;  \; {\rm rank}(B) \le k, \end{aligned} \end{equation}
  </div>
  <span class="equation_label">7.7</span>
</p>
</div>
<p> and the left singular vectors of \(T_k (\tilde CA) \) are centered. </p>

  </div>
</div> </p>
<p>The last theorem brings two innovations about the PCA. It is well known that \(T_k (\tilde C A)\) is a rank-\(k\) matrix that is nearest to \(\tilde CA\) in any unitarily invariant norm. The results of <a href="#th.24">theorem 7.4</a> extend this observation in the following way. Here \(T_k(\tilde C A)\) is nearest to \(A\) when using the Frobenius matrix norm. The second innovation is the fact that the left singular vectors of the matrices \(T_k(\tilde CA)\) are centered. </p>
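<p>Both claims of <a href="#th.24">theorem 7.4</a> are easy to verify numerically. The following sketch (an added illustration) computes \(T_k(\tilde C A)\) via a truncated SVD and checks that its left singular vectors are centered:</p>

```python
import numpy as np

# Numerical check of theorem 7.4: the left singular vectors of the
# column-centered matrix C_tilde A sum to zero, and so does every
# column of the truncated rank-k approximation T_k(C_tilde A).
m, n, k = 8, 5, 2
rng = np.random.default_rng(2)
A = rng.standard_normal((m, n))
C_tilde = np.eye(m) - np.ones((m, m)) / m

U, s, Vt = np.linalg.svd(C_tilde @ A, full_matrices=False)
T_k = (U[:, :k] * s[:k]) @ Vt[:k]        # nearest rank-k centered matrix
# e^T u_j = 0 for every left singular vector with nonzero singular value
```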
<p>Another way to carry out PCA is called <b class="bf">covariance PCA</b>. In this case one computes the SVD of the matrix \(\tilde CA/\sqrt{m}\) (or \(\tilde CA/\sqrt{m-1}\)). The name comes from the fact that the matrix \((\tilde CA/\sqrt{m})^T(\tilde CA/\sqrt{m})\) is a covariance matrix. Note that <a href="#th.24">theorem 7.4</a> is easily modified to include this case. </p>
<p>A third way to compute PCA is called <b class="bf">correlation PCA</b>. In this version the columns of the centered matrix \(\tilde CA\) are normalized to have unit length. Let the columns of \(\tilde CA\) be denoted as \({\hbox{\bf c}}_j,\;  j = 1,\dots , n\), and let the diagonal matrix </p>
<div class="displaymath" id="a0000000043">
  \[  D = \hbox{\rm diag}\{  d_1, \dots , d_n\}  \]
</div>
<p> be defined by the equalities \(d_j = \|  {\hbox{\bf c}}_j\| _2, \;  j = 1,\dots , n\). Then one computes the SVD of the matrix \(\tilde C A D^{-1}\). The name comes from the fact that \((\tilde C A D^{-1})^T (\tilde C A D^{-1})\) is a correlation matrix. As before, since \(\tilde C A D^{-1}\) is a column-centered matrix, the left singular vectors of this matrix are also centered. </p>
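<p>The correlation-PCA pre-processing described above can be sketched as follows (an added illustration, not part of the original text):</p>

```python
import numpy as np

# Correlation PCA pre-processing: center the columns of A, then normalize
# each centered column to unit length before computing the SVD.
m, n = 10, 4
rng = np.random.default_rng(3)
A = rng.standard_normal((m, n))
C_tilde = np.eye(m) - np.ones((m, m)) / m

CA = C_tilde @ A
D_inv = np.diag(1.0 / np.linalg.norm(CA, axis=0))
Z = CA @ D_inv                  # column-centered, unit-length columns
R = Z.T @ Z                     # a correlation matrix (unit diagonal)
```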
<h1 id="a0000000044">8 Euclidean Distance matrices: The nearest rank-\({\protect \lowercase {k}}\) source matrix</h1>
<p>In this section we briefly describe an application that arises in the context of Euclidean distance (ED) matrices. A matrix \(A = (a_{ij}) \in \mathbb {R}^{n\times n}\) is said to be a \(k\)-dimensional ED matrix if there exist \(n\) points in \(\mathbb {R}^k\), say \({\boldsymbol {x}}_1, {\boldsymbol {x}}_2, \dots , {\boldsymbol {x}}_n\), such that </p>
<div class="equation" id="8.1">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.1} a_{ij} = \|  {\boldsymbol {x}}_i - {\boldsymbol {x}}_j\| ^2_2, \;  \;  \,  i = 1, \dots , n, \;  \;  j = 1, \dots , n.\end{equation}
  </div>
  <span class="equation_label">8.1</span>
</p>
</div>
<p>Let \(X = [{\boldsymbol {x}}_1,\dots , {\boldsymbol {x}}_n] \in \mathbb {R}^{k \times n}\) denote the related <b class="bf">positions matrix</b>, and let the column vector \({\boldsymbol {g}}\in \mathbb {R}^n\) be obtained from the diagonal entries of the matrix \(X^T X\). That is, </p>
<div class="equation" id="8.2">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.2} {\boldsymbol {g}}= (\|  {\boldsymbol {x}}_1\| ^2_2,\dots , \|  {\boldsymbol {x}}_n\| ^2_2)^T.\end{equation}
  </div>
  <span class="equation_label">8.2</span>
</p>
</div>
<p>Let the matrix operator \(G(X)\) be defined as </p>
<div class="equation" id="8.3">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.3} G(X) = {\boldsymbol {g}}{\boldsymbol {e}}^T + {\boldsymbol {e}}{\boldsymbol {g}}^T - 2X^T X,\end{equation}
  </div>
  <span class="equation_label">8.3</span>
</p>
</div>
<p>where </p>
<div class="displaymath" id="a0000000045">
  \[  {\boldsymbol {e}}= (1, 1, \dots , 1)^T \in \mathbb {R}^n. \]
</div>
<p> Then <a href="#8.1" class="eqref">8.1</a> can be rewritten as </p>
<div class="equation" id="8.4">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.4} A = G(X). \end{equation}
  </div>
  <span class="equation_label">8.4</span>
</p>
</div>
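<p>The operator \(G(X)\) of <a href="#8.3" class="eqref">8.3</a> is easy to realize directly; the following sketch (added for illustration) verifies that it reproduces the squared pairwise distances of <a href="#8.1" class="eqref">8.1</a>:</p>

```python
import numpy as np

# The operator G(X) of (8.3): it assembles the Euclidean distance matrix
# of the columns of the positions matrix X.
def G(X):
    g = np.sum(X * X, axis=0)                # g_j = ||x_j||^2, as in (8.2)
    e = np.ones(X.shape[1])
    return np.outer(g, e) + np.outer(e, g) - 2.0 * X.T @ X

k, n = 2, 5
rng = np.random.default_rng(4)
X = rng.standard_normal((k, n))
A = G(X)                                     # A[i, j] = ||x_i - x_j||^2
```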
<p>Note that for any orthonormal matrix \(Q \in \mathbb {R}^{k\times k}\), the matrices \(X\) and \(QX\) generate the same ED matrix, and \(\|  X \| _F = \|  Q X\| _F\). The Frobenius norm of the positions matrix can be reduced by considering the least squares problem </p>
<div class="equation" id="8.5">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.5} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  \eta ({\boldsymbol {u}}) = \sum \limits ^n_{j = 1} \|  {\boldsymbol {x}}_j - {\boldsymbol {u}}\| ^2_2\\ & {\hbox{\rm subject to \  }}\;  \;  \;  {\boldsymbol {u}}\in \mathbb {R}^k, \end{aligned} \end{equation}
  </div>
  <span class="equation_label">8.5</span>
</p>
</div>
<p> which has a unique minimizer at the centroid point </p>
<div class="displaymath" id="a0000000046">
  \[ \hat{\boldsymbol {u}}= ({\boldsymbol {x}}_1 + \dots + {\boldsymbol {x}}_n)/n. \]
</div>
<p> The replacement of \({\boldsymbol {x}}_j\) with \({\boldsymbol {x}}_j - \hat{\boldsymbol {u}}\) is carried out by introducing the centering matrix </p>
<div class="equation" id="8.6">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.6} C = I - {\boldsymbol {e}}{\boldsymbol {e}}^T/n,\end{equation}
  </div>
  <span class="equation_label">8.6</span>
</p>
</div>
<p>which shifts the origin of \(\mathbb {R}^k\) to the centroid point. The matrix \(XC\) is row-centered and \(\|  XC \| ^2_F \le \|  X\| ^2_F\). Furthermore, since shifting the origin does not change Euclidean distances, the matrices \(X\) and \(XC\) generate the same ED matrix. Hence there is no loss of generality in assuming that the position matrix is row-centered. That is, </p>
<div class="equation" id="8.7">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.7} X {\boldsymbol {e}}= {\boldsymbol {o}}\; \;  \,  {\hbox{\rm and \  }}\;  XC = X.\end{equation}
  </div>
  <span class="equation_label">8.7</span>
</p>
</div>
<p>Note also that any other matrix, \(\hat X\), that satisfies \(\hat X^T \hat X = X^TX\) generates the same ED matrix as \(X\). </p>
<p>The use of ED matrices occurs in various applications. One example comes from multidimensional scaling problems in psychometrics and statistics. Here the matrix entries represent similarities (or dissimilarities) between objects and we want to produce geometric representation of these objects, <i class="it">e.g.</i>, <span class="cite">
	[
	<a href="#6" >6</a>
	, 
	<a href="#9" >9</a>
	]
</span>. A second example comes from computational chemistry, in which it is desired to determine the structure of a molecule (“molecule conformation") from information about interatomic distances, <i class="it">e.g.</i>, <span class="cite">
	[
	<a href="#8" >8</a>
	, 
	<a href="#10" >10</a>
	, 
	<a href="#11" >11</a>
	, 
	<a href="#17" >17</a>
	, 
	<a href="#18" >18</a>
	]
</span>. Other important applications arise in the fields of sensor network localization and distance geometry, <i class="it">e.g.</i>, <span class="cite">
	[
	<a href="#11" >11</a>
	, 
	<a href="#23" >23</a>
	, 
	<a href="#24" >24</a>
	]
</span>. In these applications the ED matrix, \(A\), is obtained from empirical observations, but the position matrix is not known. Hence it is desired to compute a position matrix, \(X\), that “fits" the observed ED matrix. Note also that in many cases \(k\) is known in advance. </p>
<p>If \(A\) is an “exact" \(k\)-dimensional ED matrix then \(X\) can be obtained in the following way. From <a href="#8.3" class="eqref">8.3</a> we see that: </p>
<div class="equation" id="8.8">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.8} C(-A/2)C = (XC)^T(XC),\end{equation}
  </div>
  <span class="equation_label">8.8</span>
</p>
</div>
<p>which means that the symmetric matrix: </p>
<div class="equation" id="8.9">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.9} S = C(-A/2)C\end{equation}
  </div>
  <span class="equation_label">8.9</span>
</p>
</div>
<p>is positive semidefinite, and \({\rm rank}(S)\le k\). Consequently it is possible to compute a square root of \(S\), say \(X\), such that: </p>
<div class="equation" id="8.10">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.10} S = X^T X,\end{equation}
  </div>
  <span class="equation_label">8.10</span>
</p>
</div>
<p>\(X \in \mathbb {R}^{k\times n}\), and \(X = XC\). That is, \(X\) is a centered position matrix of \(A\). </p>
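<p>The recovery <a href="#8.8" class="eqref">8.8</a>–<a href="#8.10" class="eqref">8.10</a> can be sketched as follows (an added illustration; the square root is taken via the eigendecomposition of \(S\)):</p>

```python
import numpy as np

# Recovery of a centered positions matrix from an exact k-dimensional
# ED matrix A: form S = C(-A/2)C, which is PSD of rank <= k, and take
# a square root X with S = X^T X.
def G(X):
    g = np.sum(X * X, axis=0)
    e = np.ones(X.shape[1])
    return np.outer(g, e) + np.outer(e, g) - 2.0 * X.T @ X

k, n = 2, 6
rng = np.random.default_rng(5)
A = G(rng.standard_normal((k, n)))           # an exact ED matrix
C = np.eye(n) - np.ones((n, n)) / n
S = C @ (-A / 2) @ C

lam, V = np.linalg.eigh(S)                   # eigenvalues in ascending order
lam = np.maximum(lam, 0.0)                   # clip tiny negative round-off
X_rec = (np.sqrt(lam[-k:]) * V[:, -k:]).T    # square root: S = X_rec^T X_rec
# X_rec is row-centered and regenerates the same ED matrix: G(X_rec) = A
```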
<p>The properties of \(S\) can be summarized by saying that it is the source matrix of \(A\). More precisely, a matrix \(S \) that satisfies: </p>
<div class="equation" id="8.11">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.11} S=S^T,\;  \;  S\ge 0, \;  \;  S = CSC, \; \;  \hbox{and} \;  \,  \;  {\rm rank}(S) \le k,\end{equation}
  </div>
  <span class="equation_label">8.11</span>
</p>
</div>
<p>is called a rank-\(k\) <b class="bf">source matrix</b>. As we have seen, any matrix of this kind, \(S\), can be used to generate a \(k\times n\) centered positions matrix, \(X\), and a related ED matrix, \(A = G(X)\), such that: </p>
<div class="equation" id="8.12">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.12} C(-A/2) C = S = X^TX.\end{equation}
  </div>
  <span class="equation_label">8.12</span>
</p>
</div>
<p>Recall, however, that in practice the entries of \(A\) are obtained from empirical observations, such as physical measurements. Consequently these entries may contain some error, and \(A\) may differ from an “exact” ED matrix. In this case the matrix \(C (-A/2)C\) may fail to be a rank-\(k\) source matrix, and the recovery of \(X\) requires some amendments. </p>
<p>The problem that we have to solve is, therefore, the following. Given an erroneous ED matrix, \(A\), compute a positions matrix, \(X\), that “fits” \(A\) in a “reasonable” way. Usually the solution process starts by replacing \(A\) with the nearest predistance matrix, \(\hat A\). Recall that a <b class="bf">predistance matrix</b> is a hollow symmetric matrix with nonnegative entries. The term “hollow” refers to matrices with zero diagonal entries. Converting \(A\) into a predistance matrix starts by setting to zero all the negative entries and all the diagonal entries. Then \(A\) is replaced by \((A^T + A)/2\). Now it is easy to verify that the resulting predistance matrix, \(\hat A\), is the nearest one to \(A\) with respect to the Frobenius norm. Yet it is still possible that \(\hat A\) is not an ED matrix, which means that the matrix \(C(-\hat A/2)C\) is not a rank-\(k\) source matrix. </p>
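<p>The projection onto predistance matrices described above amounts to three elementary steps; a minimal sketch (added for illustration):</p>

```python
import numpy as np

# Nearest predistance matrix, following the steps described in the text:
# zero out the negative and diagonal entries, then symmetrize.
def nearest_predistance(A):
    A_hat = np.maximum(A, 0.0)          # drop negative entries
    np.fill_diagonal(A_hat, 0.0)        # make the matrix hollow
    return (A_hat + A_hat.T) / 2        # symmetrize

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 5))         # a noisy, non-symmetric "distance" matrix
A_hat = nearest_predistance(A)
# A_hat is hollow, symmetric, and entrywise nonnegative
```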
<p>A common way to compute a positions matrix \(X\) that “fits" \(\hat A\) is by solving the problem </p>
<div class="equation" id="8.13">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.13} \begin{aligned} & {\rm minimize}\;  \;  \;  \;  \;  \;  \;  F(X) = \| \hat A - G(X)\| ^2_F\\ & {\hbox{\rm subject to \  }}\;  \;  \;  X \in \mathbb {R}^{k \times n}. \end{aligned} \end{equation}
  </div>
  <span class="equation_label">8.13</span>
</p>
</div>
<p> The last problem has no explicit solution, and usually it is solved by applying some iterative method, <i class="it">e.g.</i>, <span class="cite">
	[
	<a href="#8" >8</a>
	, 
	<a href="#17" >17</a>
	, 
	<a href="#24" >24</a>
	]
</span>. In this section we propose a different approach. Let \(S\) be a rank-\(k\) source matrix that is nearest to \(C(-\hat A/2)C\) with respect to the Frobenius norm. Then \(X\) is defined to be the square root of \(S\). That is, a matrix that satisfies \(X^T X = S\). The derivation of \(S\) and the motivation behind the proposed solution are explained below. </p>
<p>Let \(S \in \mathbb {S}^n\) be a given symmetric matrix. Then it is easy to verify that the following three equalities, \(S = CSC\), \(S{\boldsymbol {e}}= {\boldsymbol {o}}\), and \({\boldsymbol {e}}^T S = {\boldsymbol {o}}^T\), are equivalent in the sense that each one implies the others. These relations show that the subspace </p>
<div class="displaymath" id="a0000000047">
  \[  \mathbb {S}_c = \{  S \,  | \,  S \in \mathbb {S}^n \;  {\hbox{\rm and \  }}\;  S = CSC\}  \]
</div>
<p> equals the canonical subspace </p>
<div class="displaymath" id="a0000000048">
  \[  \mathbb {S}_{\boldsymbol {e}}= \{  S\,  | \,  S \in \mathbb {S}^n\;  \,  {\hbox{\rm and \  }}\;  S{\boldsymbol {e}}= {\boldsymbol {o}}\} . \]
</div>
<p> Note also that the matrix \(C(-\hat A/2)C\) is the orthogonal projection of \(-\hat A/2\) onto \(\mathbb {S}_{\boldsymbol {e}}\). Similarly, using the results of Section 5 with \(\mathbb {S}= \mathbb {S}_{\boldsymbol {e}}\) shows that \(P_k (C(-\hat A/2)C)\) is a rank-\(k\) source matrix which is nearest to \(C(-\hat A/2)C\) with respect to any unitarily invariant norm. Moreover, when using the Frobenius norm we obtain that \(P_k(C(-\hat A/2)C)\) is a rank-\(k\) source matrix that is nearest to \(- \hat A/2\). </p>
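<p>A minimal computational sketch of this closed-form solution (added for illustration, assuming \(P_k(\cdot )\) keeps the \(k\) largest nonnegative spectral components, per its definition earlier in the paper):</p>

```python
import numpy as np

# Sketch of the proposed closed-form solution: P_k projects a symmetric
# matrix onto its k largest nonnegative spectral components (assumed
# definition of the operator P_k from earlier sections of the paper).
def P_k(S, k):
    lam, V = np.linalg.eigh(S)
    lam = np.maximum(lam, 0.0)              # keep only the nonnegative part
    idx = np.argsort(lam)[::-1][:k]         # indices of the k largest eigenvalues
    return (V[:, idx] * lam[idx]) @ V[:, idx].T

n, k = 6, 2
rng = np.random.default_rng(7)
B = rng.standard_normal((n, n))
A_hat = np.abs(B + B.T)                     # an illustrative predistance matrix
np.fill_diagonal(A_hat, 0.0)
C = np.eye(n) - np.ones((n, n)) / n
S_k = P_k(C @ (-A_hat / 2) @ C, k)          # nearest rank-k source matrix
# S_k is symmetric, PSD, of rank <= k, and satisfies S_k e = o
```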
<p>The motivation behind the proposed solution is clarified by considering the following two cases. If \(\hat A\) is an “exact" ED matrix, then the nearest rank-\(k\) source matrix is \(C(-\hat A/2)C\), and the related positions matrix, \(X\in \mathbb {R}^{k\times n}\), is a square root of this matrix. That is: </p>
<div class="equation" id="8.14">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.14} C(-\hat A/2)C = X^TX.\end{equation}
  </div>
  <span class="equation_label">8.14</span>
</p>
</div>
<p>Otherwise, when \(\hat A\) is not an “exact” ED matrix, the nearest rank-\(k\) source matrix is given by \(P_k(C(-\hat A/2)C)\), and the related positions matrix, \(X\), is defined as the square root of this matrix. That is: </p>
<div class="equation" id="8.15">
<p>
  <div class="equation_content">
    \begin{equation} \label{8.15} P_k (C(-\hat A/2)C) = X^TX.\end{equation}
  </div>
  <span class="equation_label">8.15</span>
</p>
</div>
<p>Similar semidefinite programming problems were considered in some earlier works, <i class="it">e.g.</i>, <span class="cite">
	[
	<a href="#17" >17</a>
	, 
	<a href="#20" >20</a>
	, 
	<a href="#25" >25</a>
	]
</span>, but these methods compute the positions matrix, \(X\), in entirely different ways. The majority of the former methods compute \(X\) by solving <a href="#8.13" class="eqref">8.13</a> <i class="itshape">via</i> an iterative method. In contrast, our method computes \(X\) by using a closed form solution. Recall that least squares solutions are sensitive to large errors in the data. Hence the fact that our solution is not aimed at solving <a href="#8.13" class="eqref">8.13</a> does not necessarily mean that we have an “inferior" solution. </p>
<h1 id="a0000000049">9 Concluding remarks</h1>
<p>In this paper we study four types of canonical subspaces. The first one, \(\mathbb {X}\), contains \(m\times n\) matrices whose rows belong to a certain vector space in \(\mathbb {R}^n\). The second type, \(\mathbb {Y}\), contains all the \(m\times n\) matrices whose columns belong to a certain vector space in \(\mathbb {R}^m\). The third type, \(\mathbb {Z}\), contains matrices that satisfy both conditions, while the fourth type, \(\mathbb {S}\), is a symmetric version of \(\mathbb {X}\). Equipping the subspaces with orthonormal bases enables the derivation of orthogonal projections on these subspaces. Let \(A\) be a given matrix and let \(\hat A\) denote the orthogonal projection of \(A\) on one of these subspaces. That is, \(\hat A\) is nearest to \(A\) with regard to the Frobenius norm. The subspaces \(\mathbb {X}\) and \(\mathbb {Y}\) have the unique property that \(\hat A\) is also nearest to \(A\) with respect to every other unitarily invariant norm. Yet, as we have seen, orthogonal projections on \(\mathbb {Z}\) and \(\mathbb {S}\) do not share this feature. </p>
<p>The ability to obtain low-rank approximations of a matrix \(A \in \mathbb {R}^{m\times n}\) on canonical subspaces is based on three key observations. The first is an explicit expression for the orthogonal projection of \(A\) onto the canonical subspace. The second observation is about the singular vectors of a matrix \(B\) that belongs to a certain canonical subspace. Third, the observation that a low-rank approximation of \(B\) belongs to the same canonical subspace as \(B\). It is the combination of these three features that enables us to derive the desired low-rank approximations. </p>
<p>The derivation of rank-\(k\) positive approximants is based on the matrix operator \(P_k(\cdot )\). Let \(S\) be a matrix in \(\mathbb {S}^n\) and let \(\hat S\) denote the orthogonal projection of \(S\) onto \(\mathbb {S}\). Then, as we have seen, \(P_k(\hat S)\) belongs to \(\mathbb {S}\), and \(P_k(\hat S)\) is a rank-\(k\) positive approximant of \(\hat S\) with respect to every unitarily invariant norm. Moreover, when using the Frobenius norm we obtain that \(P_k (\hat S)\) is a rank-\(k\) positive approximant of \(S\) on the subspace \(\mathbb {S}\). </p>
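<p>As a minimal sketch, assuming \(P_k(\cdot )\) retains the \(k\) largest positive eigenvalues of a spectral decomposition and discards the rest (the standard construction of rank-\(k\) positive approximants), the operator can be computed as follows. The helper name <code>p_k</code> is ours, not the paper’s.</p>

```python
import numpy as np

def p_k(S, k):
    """Keep the k largest positive eigenvalues of the symmetric matrix S,
    discard the rest (assumed form of the operator P_k)."""
    w, V = np.linalg.eigh(S)               # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]          # indices of the k largest
    w_pos = np.clip(w[idx], 0.0, None)     # zero out any negative ones
    return (V[:, idx] * w_pos) @ V[:, idx].T

rng = np.random.default_rng(2)
M = rng.standard_normal((5, 5))
S = (M + M.T) / 2                           # a random symmetric matrix
Sk = p_k(S, 2)

assert np.all(np.linalg.eigvalsh(Sk) >= -1e-10)      # positive semidefinite
assert np.linalg.matrix_rank(Sk, tol=1e-10) <= 2     # rank at most k
```
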
<p>The study of symmetric canonical subspaces reveals a rich geometry which resembles that of \(\mathbb {S}^n\). The computation of positive approximants in \(\mathbb {S}^n\) is closely related to orthogonal projections on the cone of positive semidefinite matrices in \(\mathbb {S}^n\). Similarly, the computation of positive approximants in \(\mathbb {S}\) is equivalent to orthogonal projections on the cone of positive semidefinite matrices in \(\mathbb {S}\). The picture is completed by establishing the related polar decomposition on \(\mathbb {S}\). </p>
<p>The applications described in the last sections illustrate the usefulness of the new results. Let \(A\) be a given data matrix. The first example derives an explicit expression for a rank-\(k\) centered matrix that is nearest to \(A\). The properties of this matrix shed new light on the PCA of \(A\). </p>
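<p>That explicit expression amounts, in numerical terms, to centering first and truncating afterwards. The sketch below is an illustration under that assumption: the centering projector \(J = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^T\) is applied to \(A\), and the truncated SVD of \(JA\) is then a rank-\(k\) matrix that is still centered.</p>

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((10, 4))
k = 2

n = A.shape[0]
J = np.eye(n) - np.ones((n, n)) / n   # centering projector: J A has zero column means
A_c = J @ A

# Truncated SVD of the centered data: a rank-k centered matrix nearest to A_c.
U, s, Vt = np.linalg.svd(A_c, full_matrices=False)
A_k = (U[:, :k] * s[:k]) @ Vt[:k]

assert np.allclose(A_k.mean(axis=0), 0, atol=1e-10)  # truncation preserves centering
assert np.linalg.matrix_rank(A_k) <= k
```

<p>The first assertion is the point of the example: the column space of \(A_k\) is contained in that of \(JA\), which is orthogonal to the all-ones vector, so the truncation never leaves the subspace of centered matrices.</p>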
<p>The second example comes from the field of Euclidean distance matrices. The fact that we have an explicit formula for \(P_k(\hat S)\) enables us to derive an explicit formula for the nearest source matrix. This provides an effective way of calculating a positions matrix from an erroneous Euclidean distance matrix. </p>
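<p>A minimal sketch of the recovery pipeline, using the classical multidimensional scaling construction (not necessarily the paper’s exact formulas): form the source matrix \(S = -\tfrac{1}{2} J D J\) from the squared-distance matrix \(D\), keep the \(k\) largest positive eigenvalues, and read off a positions matrix. For an exact Euclidean distance matrix the distances are recovered up to a rigid motion.</p>

```python
import numpy as np

# Ground-truth positions in R^2 and the exact squared-distance matrix.
rng = np.random.default_rng(4)
P = rng.standard_normal((6, 2))
D = ((P[:, None, :] - P[None, :, :]) ** 2).sum(-1)

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
S = -0.5 * J @ D @ J                    # the (centered) source matrix

# Keep the k largest positive eigenvalues, then read off positions.
k = 2
w, V = np.linalg.eigh(S)
idx = np.argsort(w)[::-1][:k]
X = V[:, idx] * np.sqrt(np.clip(w[idx], 0.0, None))

# The recovered positions reproduce the original distances.
D_rec = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
assert np.allclose(D_rec, D, atol=1e-8)
```

<p>With a noisy \(D\) the same pipeline still runs; the eigenvalue clipping inside it plays the role of projecting onto the positive semidefinite cone, which is where the closed form for \(P_k(\hat S)\) enters.</p>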
<div class="bibliography">
<h1>Bibliography</h1>
<dl class="bibliography">
  <dt><a name="1">1</a></dt>
  <dd><p><a href ="https://doi.org/10.1002/wics.101"> <i class="sc">H. Abdi, L.J. Williams</i>, <i class="it">Principal Component Analysis</i>, Wiley Interdisciplinary Reviews: Computational Statistics <b class="bf">2</b> (2010), pp.&#160;433–459. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="2">2</a></dt>
  <dd><p><a href ="https://doi.org/10.1016/0024-3795(85)90230-7"> <i class="sc">T. Ando</i>, <i class="it">Approximation in trace norm by positive semidefinite matrices</i>, Linear Algebra Appl., <b class="bf">71</b> (1985), pp.&#160;15–21. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="3">3</a></dt>
  <dd><p><i class="sc">D.P. Bertsekas</i>, <i class="it">Convex optimization theory</i>, Athena Scientific, 2009. </p>
</dd>
  <dt><a name="4">4</a></dt>
  <dd><p><i class="sc">R. Bhatia</i>, <i class="it">Matrix Analysis</i>, Springer, New York, 1997. </p>
</dd>
  <dt><a name="5">5</a></dt>
  <dd><p><a href ="https://doi.org/10.1016/0024-3795(92)90001-q"> <i class="sc">R. Bhatia, F. Kittaneh</i>, <i class="it">Approximation by positive operators</i>, Linear Algebra Appl., <b class="bf">161</b> (1992), pp.&#160;1–9. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="6">6</a></dt>
  <dd><p><i class="sc">I. Borg, P.J.F. Groenen</i>, <i class="it">Modern multidimensional scaling: Theory and applications</i> (2nd ed.), Springer Series in Statistics, Springer, 2005. </p>
</dd>
  <dt><a name="7">7</a></dt>
  <dd><p><a href ="https://doi.org/10.2307/1996605"> <i class="sc">R. Bouldin</i>, <i class="it">Positive approximants</i>, Trans. Amer. Math. Soc., <b class="bf">177</b> (1973), pp.&#160;391–403. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="8">8</a></dt>
  <dd><p><i class="sc">D.I. Chu, H.C. Brown, M.T. Chu</i>, <i class="it">On least squares Euclidean distance matrix approximation and completion</i>, Dept. of Mathematics, North Carolina State University, Tech. Rep., 2003. </p>
</dd>
  <dt><a name="9">9</a></dt>
  <dd><p><i class="sc">T.F. Cox, M.A.A. Cox</i>, <i class="it">Multidimensional Scaling</i>, 2nd Ed., Chapman and Hall/CRC, 2001. </p>
</dd>
  <dt><a name="10">10</a></dt>
  <dd><p><i class="sc">G. Crippen, T. Havel</i>, <i class="it">Distance Geometry and Molecular Conformation</i>, Wiley, New York, 1988. </p>
</dd>
  <dt><a name="11">11</a></dt>
  <dd><p><i class="sc">J. Dattorro</i>, <i class="it">Convex optimization and Euclidean distance geometry</i>, Meboo Publishing, USA, 2005. </p>
</dd>
  <dt><a name="12">12</a></dt>
  <dd><p><a href ="https://doi.org/10.1016/j.laa.2009.10.034"> <i class="sc">A. Dax</i>, <i class="it">On extremum properties of orthogonal quotient matrices</i>, Linear Algebra Appl. <b class="bf">432</b> (2010), pp.&#160;1234–1257. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="13">13</a></dt>
  <dd><p><a href ="https://doi.org/10.4236/alamt.2014.43015"> <i class="sc">A. Dax</i>, <i class="it">Low-rank positive approximants of symmetric matrices</i>, Adv. Linear Algebra Matrix Theory, <b class="bf">4</b> (2014), pp.&#160;172–185. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="14">14</a></dt>
  <dd><p><a href ="https://doi.org/10.1007/bf02288367"> <i class="sc">C. Eckart, G. Young</i>, <i class="it">The approximation of one matrix by another of lower rank</i>, Psychometrika, <b class="bf">1</b> (1936), pp.&#160;211–218. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="15">15</a></dt>
  <dd><p><a href ="https://doi.org/10.1073/pnas.37.11.760"> <i class="sc">Ky Fan</i>, <i class="it">Maximum properties and inequalities for the eigenvalues of completely continuous operators</i>, Proc. Nat. Acad. Sci, USA <b class="bf">37</b> (1951), pp.&#160;760–766. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="16">16</a></dt>
  <dd><p><a href ="https://doi.org/10.1090/s0002-9939-1955-0067841-7"> <i class="sc">Ky Fan, A.J. Hoffman</i>, <i class="it">Some metric inequalities in the space of matrices</i>, Proc. Amer. Math. Soc., <b class="bf">6</b> (1955), pp.&#160;111–116. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="17">17</a></dt>
  <dd><p><a href ="https://doi.org/10.1137/0611042"> <i class="sc">W. Glunt, T.L. Hayden, S. Hong, J. Wells</i>, <i class="it">An alternating projection algorithm for computing the nearest Euclidean distance matrix</i>, SIAM J. Matrix Anal. Appl., <b class="bf">11</b> (1990), pp.&#160;589–600. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="18">18</a></dt>
  <dd><p><a href ="https://doi.org/10.1002/jcc.540140115"> <i class="sc">W. Glunt, T.L. Hayden, M. Raydan</i>, <i class="it">Molecular conformations from distance matrices</i>, J. Computational Chemistry, <b class="bf">14</b> (1993), pp.&#160;114–120. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="19">19</a></dt>
  <dd><p><a href ="https://doi.org/10.1512/iumj.1972.21.21076"> <i class="sc">P.R. Halmos</i>, <i class="it">Positive approximants of operators</i>, Indiana Univ. Math. J., <b class="bf">21</b> (1972), pp.&#160;951–960. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="20">20</a></dt>
  <dd><p><a href ="https://doi.org/10.1016/0024-3795(88)90202-9"> <i class="sc">T.L. Hayden, J. Wells</i>, <i class="it">Approximation by matrices positive semidefinite on a subspace</i>, Linear Algebra Appl., <b class="bf">109</b> (1988), pp.&#160;115–130. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="21">21</a></dt>
  <dd><p><a href ="https://doi.org/10.1016/0024-3795(88)90223-6"> <i class="sc">N.J. Higham</i>, <i class="it">Computing a nearest symmetric positive semidefinite matrix</i>, Linear Algebra Appl., <b class="bf">103</b> (1988), pp.&#160;103–118. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="22">22</a></dt>
  <dd><p><i class="sc">N.J. Higham</i>, <i class="it">Matrix nearness problems and applications</i>, in <i class="it">M.J.C. Gover and S. Barnett, editors</i>, Applications of Matrix Theory, pp.&#160;1–27. Oxford University Press, 1989. </p>
</dd>
  <dt><a name="23">23</a></dt>
  <dd><p><a href ="https://doi.org/10.1137/090759392"> <i class="sc">N. Krislock, H. Wolkowicz</i>, <i class="it">Explicit sensor network localization using semidefinite representations and facial reductions</i>, SIAM J. Optim., <b class="bf">20</b> (2010), pp.&#160;2679–2708. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="24">24</a></dt>
  <dd><p><i class="sc">N. Krislock, H. Wolkowicz</i>, <i class="it">Euclidean distance matrices and applications</i>, In <em>Handbook of Semidefinite, Cone and Polynomial Optimization</em>, M. Anjos and J. Lasserre (Eds.), 2010. </p>
</dd>
  <dt><a name="25">25</a></dt>
  <dd><p><a href ="https://doi.org/10.1016/0024-3795(85)90181-8"> <i class="sc">R. Mathar</i>, <i class="it">The best Euclidean fit to a given distance matrix in prescribed dimensions</i>, Linear Algebra Appl., <b class="bf">67</b> (1985), pp.&#160;1–6. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="26">26</a></dt>
  <dd><p><a href ="https://doi.org/10.1093/qmath/11.1.50"> <i class="sc">L. Mirsky</i>, <i class="it">Symmetric Gauge Functions and Unitarily Invariant Norms</i>, Quarterly J. Math., <b class="bf">11</b> (1960), pp.&#160;50–59. <img src="img-0001.png" alt="\includegraphics[scale=0.1]{ext-link.png}" style="width:12.0px; height:10.700000000000001px" />
</a> </p>
</dd>
  <dt><a name="27">27</a></dt>
  <dd><p><i class="sc">J.J. Moreau</i>, <i class="it">Décomposition orthogonale d’un espace hilbertien selon deux cônes mutuellement polaires</i>, C.R. Acad. Sci. Paris, <b class="bf">255</b> (1962), pp.&#160;238–240. </p>
</dd>
  <dt><a name="28">28</a></dt>
  <dd><p><i class="sc">D.D. Rogers, J.D. Ward</i>, <i class="it">\(C_p\)-minimal positive approximants</i>, Acta Sci. Math., <b class="bf">43</b> (1981), pp.&#160;109–115. </p>
</dd>
</dl>


</div>
</div> <!--main-text -->
</div> <!-- content-wrapper -->
</div> <!-- content -->
</div> <!-- wrapper -->

<nav class="prev_up_next">
</nav>

<script type="text/javascript" src="/var/www/clients/client1/web1/web/files/jnaat-files/journals/1/articles/js/jquery.min.js"></script>
<script type="text/javascript" src="/var/www/clients/client1/web1/web/files/jnaat-files/journals/1/articles/js/plastex.js"></script>
<script type="text/javascript" src="/var/www/clients/client1/web1/web/files/jnaat-files/journals/1/articles/js/svgxuse.js"></script>
</body>
</html>