
Viewing matrices through SVD

SVD recap:

Singular value decomposition (SVD)

Let \( A \) be a real \( m \times n \) matrix. SVD decomposes \( A \) as:

\[ A = U \Sigma V^T \]

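where \( U \) (\( m \times m \)) and \( V \) (\( n \times n \)) are orthogonal matrices and \( \Sigma \) is an \( m \times n \) diagonal matrix with non-negative real numbers on the diagonal. The diagonal entries of \( \Sigma \) are called the singular values of \( A \). When \( m \neq n \), \( \Sigma \) is rectangular, so a product written as \( \Sigma^{2} \) should be read as \( \Sigma^{T}\Sigma \) or \( \Sigma\Sigma^{T} \), whichever has the matching shape.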
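The decomposition can be checked numerically. The following is a minimal sketch using NumPy's `np.linalg.svd`; the \( 3 \times 2 \) matrix `A` and the vector `x` are arbitrary values chosen for illustration, not taken from the text.

```python
import numpy as np

# Arbitrary 3x2 example matrix (an assumption for illustration).
A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

# full_matrices=True returns U (3x3), the singular values s (length 2),
# and V^T (2x2).
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Place the singular values on the diagonal of a 3x2 matrix Sigma.
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

# Apply the three factors to a vector one at a time.
x = np.array([1.0, -2.0])
coords = Vt @ x          # coordinates of x with respect to the columns of V
scaled = Sigma @ coords  # scale by the singular values, pad to length 3
y = U @ scaled           # use the scaled values as weights on the columns of U

reconstruction_ok = np.allclose(U @ Sigma @ Vt, A)
steps_ok = np.allclose(y, A @ x)
print(reconstruction_ok, steps_ok)
```

Both checks compare against `A` directly, so they hold up to floating-point tolerance regardless of the sign conventions NumPy picks for the singular vectors.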

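SVD allows any \( m \times n \) matrix \( A \) to be viewed as three steps: expressing an input vector in the coordinates of the \( V \) basis (multiplying by \( V^{T} \)), scaling those coordinates by the singular values (multiplying by \( \Sigma \)), and then using the scaled coordinates as weights on the basis vectors in \( U \) (multiplying by \( U \)). To demonstrate, consider the following three cases: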

\( A^{T} \) as an unscaled inverse

\[ \begin{align*} A^{-1} &= (U \Sigma V^{T})^{-1} \\ &= (V^{T})^{-1} \Sigma^{-1} U^{-1} \\ &= V \Sigma^{-1} U^{T} \\ \end{align*} \]

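This derivation assumes \( A \) is square and invertible, so that every singular value is non-zero. Since \( A^{T} = V \Sigma U^{T} \), \( A^{-1} \) has the same structure as \( A^{T} \), but with the singular values inverted.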
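A small numerical check of this relationship, as a sketch: the \( 2 \times 2 \) matrix below is an arbitrary invertible example, and the variable names are illustrative.

```python
import numpy as np

# Arbitrary invertible 2x2 matrix (determinant 5), chosen for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

U, s, Vt = np.linalg.svd(A)
V = Vt.T

# Same outer factors V and U^T; only the middle scaling differs.
A_T_from_svd = V @ np.diag(s) @ U.T          # transpose: scale by s
A_inv_from_svd = V @ np.diag(1.0 / s) @ U.T  # inverse: scale by 1/s

transpose_ok = np.allclose(A_T_from_svd, A.T)
inverse_ok = np.allclose(A_inv_from_svd, np.linalg.inv(A))
print(transpose_ok, inverse_ok)
```

The division `1.0 / s` is only safe because every singular value of this example is non-zero.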

\( A^{T}A \) as an input-space→input-space scaling along vectors in \( V \).

\( A^{T}A \) corresponds to converting an input space vector to the coordinate system of the \( V \) basis vectors, scaling these \( V \) coordinates, then converting back to the input space.

\[ \begin{align*} A^{T}A &= (U \Sigma V^{T})^{T} (U \Sigma V^{T}) \\ &= V \Sigma U^{T} U \Sigma V^{T} \\ &= V \Sigma^{2} V^{T} \\  \end{align*} \]

Interestingly, the output space and \( U \) are not involved.
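The identity can be verified with a short sketch. The example matrix is an arbitrary assumption; the thin SVD (`full_matrices=False`) is used so that the squared singular values form a \( 2 \times 2 \) diagonal matrix.

```python
import numpy as np

# Arbitrary 3x2 example matrix (an assumption for illustration).
A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

gram = A.T @ A                           # 2x2, maps input space to input space
gram_from_svd = V @ np.diag(s**2) @ V.T  # built without using U at all

gram_ok = np.allclose(gram, gram_from_svd)

# Each column v_i of V is only scaled: (A^T A) v_i = s_i^2 v_i.
# V * s**2 multiplies column i of V by s[i]**2 via broadcasting.
scaling_ok = np.allclose(gram @ V, V * s**2)
print(gram_ok, scaling_ok)
```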

\( AA^{T} \) as an output-space→output-space scaling along vectors in \( U \).

\( AA^{T} \) corresponds to converting an output space vector to the coordinate system of the \( U \) basis vectors, scaling these \( U \) coordinates, then converting back to the output space.

\[ \begin{align*} AA^{T} &= (U \Sigma V^{T}) (U \Sigma V^{T})^{T} \\ &= U \Sigma V^{T} V \Sigma U^{T} \\ &= U \Sigma^{2} U^{T} \\ \end{align*} \]

Again, the input space and \( V \) are not involved.
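A matching sketch for \( AA^{T} \), again with an arbitrary \( 3 \times 2 \) example matrix. Because the output space here has one more dimension than the input space, the full \( U \) is used and the scaling along its third column is assumed to be zero, which holds for this rank-2 example.

```python
import numpy as np

# Arbitrary 3x2 example matrix of rank 2 (an assumption for illustration).
A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=True)  # U is 3x3

# Squared singular values, padded with a zero for the extra output dimension.
scales = np.zeros(U.shape[0])
scales[:len(s)] = s**2

outer = A @ A.T                              # 3x3, maps output space to output space
outer_from_svd = U @ np.diag(scales) @ U.T   # built without using V at all

outer_ok = np.allclose(outer, outer_from_svd)

# The third column of U is scaled by zero.
null_ok = np.allclose(outer @ U[:, 2], 0.0)
print(outer_ok, null_ok)
```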