Singular Value Decomposition

The Singular Value Decomposition, or SVD for short, is a matrix decomposition method that reduces a matrix to its constituent parts in order to make certain subsequent matrix calculations simpler. SVD decomposes a matrix into the product of three other matrices (a short NumPy sketch follows the definitions below):

$A=USV^{T}$

Where:

  • A is an m$\times$n matrix

  • U is an m$\times$n matrix with orthonormal columns

  • S is an n$\times$n diagonal matrix containing the singular values of A

  • V is an n$\times$n orthogonal matrix
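
As a minimal sketch, the factorization can be computed with NumPy's `np.linalg.svd` (the matrix below is the one used in the worked example later in this answer; `full_matrices=False` gives exactly the compact shapes listed above):

```python
import numpy as np

# Minimal sketch: factor a small matrix and rebuild it from its parts.
A = np.array([[2, 4],
              [1, 3],
              [0, 1],
              [-1, 0.5]])

# Compact form: U is m x n, s holds the n singular values, Vt is n x n.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

S = np.diag(s)                    # put the singular values on the diagonal
A_rebuilt = U @ S @ Vt            # A = U S V^T, up to floating-point error
print(np.allclose(A, A_rebuilt))  # True
```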

SVD of a real symmetric matrix:

  • A is real symmetric of size n$\times$n

  • $A=A^{T}$

  • U = V, since $A^{T}A=AA^{T}=A^{2}$ (this holds exactly when A is also positive semi-definite; otherwise U and V differ only in the signs of some columns)

$A=Q\Sigma Q^{T}$

  • Q is of size $n \times n$ and contains the eigenvectors of $A^{2}$

  • This is called the spectral decomposition of A

  • $\Sigma$ contains the n singular values

  • Eigenvectors of A = eigenvectors of $A^{2}$

  • Eigenvalues of $A^{2}$ = squares of the eigenvalues of A, so the eigenvalues of A are, up to sign, the square roots of the eigenvalues of $A^{2}$

  • Singular values of A = absolute values of the eigenvalues of A (they coincide when A is positive semi-definite; a numerical check follows this list)
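
A quick numerical check of these relationships (a sketch; the matrix B below is an arbitrary choice, used only to construct a symmetric positive semi-definite A):

```python
import numpy as np

# Build a symmetric positive semi-definite matrix A = B^T B.
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
A = B.T @ B

# Spectral decomposition A = Q diag(eigenvalues) Q^T.
eigvals, Q = np.linalg.eigh(A)

# SVD of the same matrix.
U, s, Vt = np.linalg.svd(A)

# Singular values equal the eigenvalues (A is positive semi-definite) ...
print(np.allclose(np.sort(s), np.sort(eigvals)))                       # True
# ... and their squares are the eigenvalues of A^2.
print(np.allclose(np.sort(s**2), np.sort(np.linalg.eigvalsh(A @ A))))  # True
```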

Transformation using SVD

  • Transformed data

$T=AV=U\Sigma$

  • V is called the SVD transform matrix

  • Essentially, T is just a rotation of A

  • The dimensionality of T is n

  • T uses n basis vectors that are different from those of the original space

  • The columns of V give the basis vectors of the rotated space

  • V shows how each dimension (column of A) can be represented as a linear combination of the other dimensions

  • Its columns are the input basis vectors

  • U shows how each object (row of A) can be represented as a linear combination of the other objects

  • Its columns are the output basis vectors

  • Lengths of the row vectors are preserved (a one-line derivation follows the equation):

$\|\vec{a}_{i}\|_{2} =\|\vec{t}_{i}\|_{2}$
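
This follows from the orthogonality of V: $VV^{T}=I$, and with $\vec{a}_{i}$ the i-th row of A and $\vec{t}_{i}=\vec{a}_{i}V$ the i-th row of T:

$\|\vec{t}_{i}\|_{2}^{2}=\vec{t}_{i}\vec{t}_{i}^{T}=(\vec{a}_{i}V)(\vec{a}_{i}V)^{T}=\vec{a}_{i}VV^{T}\vec{a}_{i}^{T}=\vec{a}_{i}\vec{a}_{i}^{T}=\|\vec{a}_{i}\|_{2}^{2}$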

Example

$A=\begin{bmatrix} 2 & 4 \\ 1 & 3 \\ 0 & 1 \\ -1 & 0.5 \end{bmatrix} =U\begin{bmatrix} -0.80 & 0.22 & 0.05 & 0.54 \\ -0.56 & -0.20 & -0.34 & -0.71 \\ -0.16 & -0.31 & 0.90 & -0.21 \\ -0.01 & -0.89 & -0.22 & 0.37 \end{bmatrix}\times \Sigma\begin{bmatrix} 5.54 & 0 \\ 0 & 1.24 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \times V^{T}\begin{bmatrix} -0.39 & -0.92 \\ 0.92 & -0.39 \end{bmatrix}$

$T=AV=U\Sigma=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix} \times\begin{bmatrix} -0.39 & 0.92\\ -0.92 & -0.39\\ \end{bmatrix} =\begin{bmatrix} -4.46 & 0.27\\ -3.15 & -0.25\\ -0.92 & -0.39\\ -0.06 & -1.15 \end{bmatrix}$
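
The numbers above can be reproduced with NumPy (a sketch; the solver may flip the sign of individual singular vector pairs, which leaves the product unchanged):

```python
import numpy as np

A = np.array([[2, 4],
              [1, 3],
              [0, 1],
              [-1, 0.5]])

U, s, Vt = np.linalg.svd(A)   # full SVD: U is 4x4, s has 2 values, Vt is 2x2
print(np.round(s, 2))         # [5.54 1.24]

V = Vt.T
T = A @ V                     # transformed data, T = A V = U Sigma
print(np.round(T, 2))         # matches the T above, up to column signs

# The rotation preserves the length of every row.
print(np.allclose(np.linalg.norm(A, axis=1), np.linalg.norm(T, axis=1)))  # True
```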

Example compact form

$A=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix}=U\begin{bmatrix} -0.80 & 0.22\\ -0.56 & -0.20\\ -0.16 & -0.31\\ -0.01 & -0.89 \end{bmatrix}\times \Sigma\begin{bmatrix} 5.54 & 0\\ 0 & 1.24\\ \end{bmatrix}\times V^{T}\begin{bmatrix} -0.39 & -0.92\\ 0.92 & -0.39\\ \end{bmatrix}$

  • If A is of size m$\times$n with m $\geq$ n, then U is m$\times$n, V is n$\times$n, and $\Sigma$ is an n$\times$n matrix

  • This works because there are at most n non-zero singular values in $\Sigma$ (see the sketch below)
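
A sketch of the shape difference between the full and compact forms, on the same example matrix:

```python
import numpy as np

A = np.array([[2, 4],
              [1, 3],
              [0, 1],
              [-1, 0.5]])

# Full SVD: U is m x m, Sigma is m x n, V^T is n x n.
U_full, s, Vt = np.linalg.svd(A, full_matrices=True)
print(U_full.shape, s.shape, Vt.shape)    # (4, 4) (2,) (2, 2)

# Compact SVD: U is m x n, Sigma is n x n, V^T is n x n.
U_thin, s, Vt = np.linalg.svd(A, full_matrices=False)
print(U_thin.shape, s.shape, Vt.shape)    # (4, 2) (2,) (2, 2)

# Both rebuild A exactly, because only n singular values are non-zero.
Sigma_full = np.zeros(A.shape)
Sigma_full[:len(s), :len(s)] = np.diag(s)
print(np.allclose(A, U_full @ Sigma_full @ Vt))   # True
print(np.allclose(A, U_thin @ np.diag(s) @ Vt))   # True
```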

Dimensionality reduction using SVD

$A=U\Sigma V^{T}=\sum_{i=1}^{n} u_{i}\sigma_{ii}v_{i}^{T}$

  • Use only k dimensions

  • Retain the first k columns of U and V and the first k singular values of $\Sigma$

  • The first k columns of V give the basis vectors of the reduced space

  • This gives the best rank-k approximation of A in terms of the sum of squared errors (a code sketch follows the equations below)

$A\approx \sum_{i=1}^{k} u_{i}\sigma_{ii}v_{i}^{T}=U_{1\ldots k}\Sigma_{1\ldots k}V^{T}_{1\ldots k}$

$T\approx A V_{1\ldots k}$
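
A minimal NumPy sketch of this truncation (`rank_k_approx` is an illustrative helper name, not a library routine):

```python
import numpy as np

def rank_k_approx(A, k):
    """Keep the first k singular triplets: A_k = U_{1..k} Sigma_{1..k} V^T_{1..k}."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # best rank-k approximation of A
    T_k = A @ Vt[:k, :].T                         # reduced representation, T ~ A V_{1..k}
    return A_k, T_k
```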

Example of dimensionality reduction (k=1)

$A\approx A_{k}= U_{k}\begin{bmatrix} -0.80 & 0 & 0 & 0\\ -0.56 & 0 & 0 & 0\\ -0.16 & 0 & 0 & 0\\ -0.01 & 0 & 0 & 0 \end{bmatrix} \times \Sigma_{k}\begin{bmatrix} 5.54 & 0\\ 0 & 0\\ 0 & 0\\ 0 & 0 \end{bmatrix}\times V_{k}^{T}\begin{bmatrix} -0.39 & -0.92\\ 0 & 0\\ \end{bmatrix}=\begin{bmatrix} 1.74 & 4.10\\ 1.23 & 2.90\\ 0.35 & 0.84\\ 0.02 & 0.06 \end{bmatrix}$

$T\approx T_{k}=AV_{k}=U_{k}\Sigma_{k}=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix}\times\begin{bmatrix} -0.39 & 0\\ -0.92 & 0\\ \end{bmatrix}=\begin{bmatrix} -4.46 & 0\\ -3.15 & 0\\ -0.92 & 0\\ -0.06 & 0 \end{bmatrix}$

Compact way of dimensionality reduction (k=1)

$A\approx A_{k}=U_{k}\begin{bmatrix} -0.80\\ -0.56\\ -0.16\\ -0.01 \end{bmatrix}\times \Sigma_{k}\begin{bmatrix} 5.54 \end{bmatrix}\times V_{k}^{T}\begin{bmatrix} -0.39 & -0.92 \end{bmatrix}=\begin{bmatrix} 1.74 & 4.10\\ 1.23 & 2.90\\ 0.35 & 0.84\\ 0.02 & 0.06 \end{bmatrix}$

$T\approx T_{k}=AV_{k}=U_{k}\Sigma_{k}=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix}\times\begin{bmatrix} -0.39 \\ -0.92 \end{bmatrix}=\begin{bmatrix} -4.46\\ -3.15\\ -0.92\\ -0.06 \end{bmatrix}$
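
These k=1 numbers can be checked numerically (a sketch; small differences come from the two-decimal rounding used above, and the sign of T_1 depends on the solver):

```python
import numpy as np

A = np.array([[2, 4],
              [1, 3],
              [0, 1],
              [-1, 0.5]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 1
A_1 = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # rank-1 reconstruction of A
T_1 = A @ Vt[:k, :].T                         # 1-D representation of each row

print(np.round(A_1, 2))   # close to the A_k shown above
print(np.round(T_1, 2))   # close to T_k, up to an overall sign
```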

Dimensions to retain:

  • Concept of energy of a dataset
  • The total energy is the sum of the squares of the singular values:

$E=\sum_{i=1}^{n}\sigma_{ii}^{2}$

  • Retain k dimensions such that at least p% of the energy is retained:

$E_{k}=\sum_{i=1}^{k}\sigma_{ii}^{2}$

$E_{k}/E\geq p$

  • Generally, p is chosen between 80% and 95%
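
A sketch of this rule in NumPy (`choose_k` is an illustrative helper name, and p = 0.90 is just an example threshold):

```python
import numpy as np

def choose_k(singular_values, p=0.90):
    """Smallest k such that the first k singular values retain at least a fraction p of the energy."""
    energy = singular_values ** 2
    retained = np.cumsum(energy) / np.sum(energy)
    return int(np.searchsorted(retained, p) + 1)

A = np.array([[2, 4],
              [1, 3],
              [0, 1],
              [-1, 0.5]])
s = np.linalg.svd(A, compute_uv=False)   # singular values only: ~[5.54, 1.24]
print(choose_k(s, p=0.90))               # 1, since 5.54^2 / (5.54^2 + 1.24^2) ~ 0.95
```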

Advantages of Dimensionality Reduction

  • It helps in data compression, and hence reduces storage space.
  • It reduces computation time.
  • It also helps remove redundant features, if any.

Disadvantages of Dimensionality Reduction

  • It may lead to some amount of data loss.
  • PCA tends to find linear correlations between variables, which is sometimes undesirable.
  • PCA fails in cases where mean and covariance are not enough to define datasets.
  • We may not know how many principal components to keep; in practice, some rules of thumb are applied.


  • SVD-based dimensionality reduction is also helpful for noise removal, and as a result it can improve the performance of models.