The Singular Value Decomposition, or SVD for short, is a matrix decomposition method that reduces a matrix to its constituent parts in order to make certain subsequent matrix calculations simpler. It decomposes a matrix into the product of three other matrices:
$A=USV^{T}$
Where:
A is an m$\times$n matrix
U is an m$\times$n matrix with orthonormal columns (the left singular vectors)
S (also written $\Sigma$) is an n$\times$n diagonal matrix containing the singular values
V is an n$\times$n orthogonal matrix (the right singular vectors)
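A minimal NumPy sketch of this compact form (an illustration, not part of the original notes): `np.linalg.svd` returns the singular values as a 1-D array, so they are wrapped in `np.diag` to form S. The example matrix is the one used in the worked example further below.

```python
import numpy as np

# Compact (economy) SVD: with full_matrices=False the shapes match the list
# above -- U is m x n, S is n x n diagonal, V is n x n orthogonal.
A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])   # m = 4, n = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
S = np.diag(s)                      # singular values come back as a 1-D array

print(U.shape, S.shape, Vt.shape)   # (4, 2) (2, 2) (2, 2)
print(np.allclose(A, U @ S @ Vt))   # True: A = U S V^T
```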
SVD of real symmetric matrix:
A is real symmetric of size n$\times$n
$A=A^{T}$
U=V (when A is also positive semidefinite) since $A^{T}A=AA^{T}=A^{2}$
$A=Q\Sigma Q^{T}$
Q is of size $n \times n$ and contains the eigenvectors of $A^{2}$
This is called the spectral decomposition of A
$\Sigma$ contains the n singular values
Eigenvectors of A= eigenvectors of $A^{2}$
Eigenvalues of A = square roots of the eigenvalues of $A^{2}$ (up to sign)
Singular values of A = absolute values of the eigenvalues of A (they coincide when A is positive semidefinite)
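A short NumPy sketch of the symmetric case (an added illustration): it checks the spectral decomposition and that the singular values are the absolute values of the eigenvalues.

```python
import numpy as np

# For a real symmetric matrix A, eigh gives the spectral decomposition
# A = Q diag(lam) Q^T; the singular values of A are |lam|.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])                            # symmetric n x n

lam, Q = np.linalg.eigh(A)                            # eigenvalues and eigenvectors
s = np.linalg.svd(A, compute_uv=False)                # singular values only

print(np.allclose(A, Q @ np.diag(lam) @ Q.T))         # True: A = Q diag(lam) Q^T
print(np.allclose(np.sort(np.abs(lam)), np.sort(s)))  # True: sigma_i = |lambda_i|
```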
Transformation using SVD
- Transformed data
$T=AV=U\Sigma$
V is called the SVD transform matrix
Essentially, T is just a rotation of A
The dimensionality of T is n
It uses n basis vectors that are different from those of the original space
Columns of V give the basis vectors of the rotated space
V shows how each dimension can be represented as a linear combination of other dimensions
Columns are input basis vectors
U shows how each object can be represented as a linear combination of other objects
Columns are output basis vectors
Lengths of the vectors are preserved (see the sketch below):
$\|\vec a_{i}\|_{2} =\|\vec t_{i}\|_{2}$
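A sketch of the transform in NumPy (added for illustration): it checks $T = AV = U\Sigma$ and that the row (object) lengths of A are preserved in T.

```python
import numpy as np

# SVD transform: T = A V = U Sigma is a rotation of the rows of A,
# so the 2-norm of every row is unchanged.
A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)

T = A @ Vt.T                                   # T = A V
print(np.allclose(T, U @ np.diag(s)))          # True: A V = U Sigma
print(np.allclose(np.linalg.norm(A, axis=1),   # ||a_i||_2 == ||t_i||_2
                  np.linalg.norm(T, axis=1)))  # True
```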
Example
$A=\begin{bmatrix} 2 & 4 \\ 1 & 3 \\ 0 & 1 \\ -1 & 0.5 \end{bmatrix}=U\Sigma V^{T}=\begin{bmatrix} -0.80 & 0.22 & 0.05 & 0.54 \\ -0.56 & -0.20 & -0.34 & -0.71 \\ -0.16 & -0.31 & 0.90 & -0.21 \\ -0.01 & -0.89 & -0.22 & 0.37 \end{bmatrix}\times\begin{bmatrix} 5.54 & 0 \\ 0 & 1.24 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}\times\begin{bmatrix} -0.39 & -0.92 \\ 0.92 & -0.39 \end{bmatrix}$
$T=AV=U\Sigma=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix}\times\begin{bmatrix} -0.39 & 0.92\\ -0.92 & -0.39 \end{bmatrix}=\begin{bmatrix} -4.46 & 0.27\\ -3.15 & -0.25\\ -0.92 & -0.39\\ -0.06 & -1.15 \end{bmatrix}$
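The same example can be reproduced in NumPy (a sketch; NumPy may return the singular vectors with flipped signs, which is an equally valid SVD, so compare magnitudes):

```python
import numpy as np

# Reproduce the worked example; values are rounded to two decimals.
A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)

print(np.round(s, 2))          # [5.54 1.24]
print(np.round(Vt, 2))         # compare with V^T above (signs may differ)
print(np.round(A @ Vt.T, 2))   # T = A V, compare with the matrix above
```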
Example compact form
$A=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix}=U\Sigma V^{T}=\begin{bmatrix} -0.80 & 0.22\\ -0.56 & -0.20\\ -0.16 & -0.31\\ -0.01 & -0.89 \end{bmatrix}\times\begin{bmatrix} 5.54 & 0\\ 0 & 1.24 \end{bmatrix}\times\begin{bmatrix} -0.39 & -0.92\\ 0.92 & -0.39 \end{bmatrix}$
If A is of size m$\times$n, then in this compact form U is m$\times$n, V is n$\times$n, and $\Sigma$ is an n$\times$n matrix
This works because there are at most n non-zero singular values in $\Sigma$
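A quick sketch (added for illustration) contrasting the full and compact shapes in NumPy:

```python
import numpy as np

# Full vs. compact SVD shapes for a 4 x 2 matrix: only n = 2 singular values
# are non-zero, so the extra columns of the full U are never needed.
A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])

U_full, s, Vt = np.linalg.svd(A)                        # full: U is 4 x 4
U_comp, s, Vt = np.linalg.svd(A, full_matrices=False)   # compact: U is 4 x 2

print(U_full.shape, U_comp.shape, len(s))               # (4, 4) (4, 2) 2
```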
Dimensionality reduction using SVD
$A=U\Sigma V^{T}=\sum_{i=1}^{n}u_{i}\sigma_{ii}v_{i}^{T}$
Use only k dimensions
Retain the first k columns of U and V and the first k singular values in $\Sigma$
The first k columns of V give the basis vectors of the reduced space
This gives the best rank-k approximation of A in terms of sum of squared error
$A\approx \sum_{i=1}^{k}u_{i}\sigma_{ii}v_{i}^{T}=U_{1...k}\Sigma_{1...k}V^{T}_{1...k}$
$T\approx A V_{1...k}$
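A sketch of rank-k truncation in NumPy; the helper name `rank_k_approx` is illustrative, not from the notes.

```python
import numpy as np

def rank_k_approx(A, k):
    """Keep the first k singular triplets: A ~ U_k Sigma_k V_k^T, T ~ A V_k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # rank-k approximation of A
    T_k = A @ Vt[:k, :].T                         # k-dimensional representation
    return A_k, T_k

A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])
A_1, T_1 = rank_k_approx(A, 1)
print(np.round(A_1, 2))   # compare with the k = 1 example below
print(np.round(T_1, 2))   # signs may be flipped relative to the example
```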
Example of dimensionality reduction (k=1)
$A\approx A_{k}=U_{k}\Sigma_{k}V_{k}^{T}=\begin{bmatrix} -0.80 & 0 & 0 & 0\\ -0.56 & 0 & 0 & 0\\ -0.16 & 0 & 0 & 0\\ -0.01 & 0 & 0 & 0 \end{bmatrix}\times\begin{bmatrix} 5.54 & 0\\ 0 & 0\\ 0 & 0\\ 0 & 0 \end{bmatrix}\times\begin{bmatrix} -0.39 & -0.92\\ 0 & 0 \end{bmatrix}=\begin{bmatrix} 1.74 & 4.10\\ 1.23 & 2.90\\ 0.35 & 0.84\\ 0.02 & 0.06 \end{bmatrix}$
$T\approx T_{k}=AV_{k}=U_{k}\Sigma_{k}=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix}\times\begin{bmatrix} -0.39 & 0\\ -0.92 & 0 \end{bmatrix}=\begin{bmatrix} -4.46 & 0\\ -3.15 & 0\\ -0.92 & 0\\ -0.06 & 0 \end{bmatrix}$
Compact way of dimensionality reduction (k=1)
$A\approx A_{k}=U_{k}\Sigma_{k}V_{k}^{T}=\begin{bmatrix} -0.80\\ -0.56\\ -0.16\\ -0.01 \end{bmatrix}\times\begin{bmatrix}5.54\end{bmatrix}\times\begin{bmatrix} -0.39 & -0.92 \end{bmatrix}=\begin{bmatrix} 1.74 & 4.10\\ 1.23 & 2.90\\ 0.35 & 0.84\\ 0.02 & 0.06 \end{bmatrix}$
$T\approx T_{k}=AV_{k}=U_{k}\Sigma_{k}=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix}\times\begin{bmatrix} -0.39\\ -0.92 \end{bmatrix}=\begin{bmatrix} -4.46\\ -3.15\\ -0.92\\ -0.06 \end{bmatrix}$
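A small NumPy check (added for illustration) of the "best rank-k approximation" claim for the k = 1 example: the Frobenius norm of the error $A-A_{1}$ equals the discarded singular value.

```python
import numpy as np

# Error of the rank-1 approximation: ||A - A_1||_F equals the dropped
# singular value sigma_2 ~ 1.24.
A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)

A_1 = s[0] * np.outer(U[:, 0], Vt[0, :])     # u_1 sigma_1 v_1^T
print(np.round(np.linalg.norm(A - A_1), 2))  # 1.24
print(np.round(s[1], 2))                     # 1.24
```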
Dimensions to retain:
- Concept of energy of a dataset
- Total energy is the sum of the squares of the singular values
$E=\sum_{i=1}^{n}\sigma_{ii}^{2}$
- Retain k dimensions such that p% of the energy is retained
$E_{k}=\sum_{i=1}^{k}\sigma_{ii}^{2}$
$E_{k}/E\geq p$
- Generally, p is between 80% and 95% (a sketch of this selection rule follows)
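A sketch of the energy criterion in NumPy; `choose_k` and the threshold `p` are illustrative names, not from the notes.

```python
import numpy as np

def choose_k(s, p=0.90):
    """Smallest k whose retained energy E_k / E meets the threshold p."""
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)   # E_k / E for k = 1, 2, ...
    return int(np.searchsorted(ratio, p) + 1)

A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])
s = np.linalg.svd(A, compute_uv=False)
print(choose_k(s, p=0.90))   # 1, since 5.54^2 / (5.54^2 + 1.24^2) ~ 0.95
```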
Advantages of Dimensionality Reduction
- It helps with data compression and hence reduces storage space.
- It reduces computation time.
- It also helps remove redundant features, if any.
- It can also help with noise removal, and as a result improve the performance of models.
Disadvantages of Dimensionality Reduction
- It may lead to some amount of data loss.
- PCA tends to find linear correlations between variables, which is sometimes undesirable.
- PCA fails in cases where mean and covariance are not enough to define datasets.
- We may not know how many principal components to keep; in practice, some rules of thumb are applied.