PCA(II)

Linear Algebra Involved

Given the data set X (an m×n matrix), where m is the number of measurement types and n is the number of samples, the mathematical goal is to find some orthonormal matrix P in Y=PX such that ${C}_{Y}=\frac {1}{n}(Y{Y}^{T})$ is a diagonal matrix. The rows of P are the principal components of X.

${C}_{Y}=\frac{1}{n}(Y{Y}^{T})$

$=\frac {1}{n}((PX){(PX)}^{T})$

$=\frac {1}{n}(PX{X}^{T}{P}^{T})$

$=P(\frac {1}{n}X{X}^{T}){P}^{T}$

${C}_{Y}=P{C}_{X}{P}^{T}$

Let’s state some elementary linear algebra theorems that will help us rewrite the above formula in a more meaningful way.

• The inverse of an orthogonal matrix is its transpose.
• For any matrix $A$, both ${A}^{T}A$ and $A{A}^{T}$ are symmetric.
• A matrix is symmetric if and only if it is orthogonally diagonalizable.
• A symmetric matrix is diagonalized by a matrix of its orthonormal eigenvectors, i.e. $A=ED{E}^{T}$, where D is a diagonal matrix and E is the matrix with the eigenvectors of A arranged as columns.
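These statements are easy to check numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `A`, its shape, and the random seed are arbitrary choices made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
A = rng.normal(size=(3, 5))      # any real matrix will do

# A A^T is symmetric.
S = A @ A.T
is_symmetric = np.allclose(S, S.T)

# np.linalg.eigh is the eigen-solver for symmetric matrices: it returns
# the eigenvalues in ascending order and orthonormal eigenvectors as the
# columns of E.
eigvals, E = np.linalg.eigh(S)
D = np.diag(eigvals)

# E is orthogonal, so its inverse equals its transpose.
inverse_is_transpose = np.allclose(np.linalg.inv(E), E.T)

# The symmetric matrix is recovered as E D E^T.
reconstructs = np.allclose(S, E @ D @ E.T)

print(is_symmetric, inverse_is_transpose, reconstructs)
```

All three checks should hold up to floating-point tolerance.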

Now we select the matrix P such that each row ${p}_{i}$ is an eigenvector of ${C}_{X}$, i.e. $P={E}^{T}$, so that ${C}_{X}=ED{E}^{T}={P}^{T}DP$.

${C}_{Y}=P{C}_{X}{P}^{T}$

$=P(ED{E}^{T}){P}^{T}$

$=P({P}^{T}DP){P}^{T}$

$=(P{P}^{T})D(P{P}^{T})$

$=(P{P}^{-1})D(P{P}^{-1})$

${C}_{Y}=D$

Therefore, the ${i}^{th}$ diagonal value of ${C}_{Y}$ is the variance of X along ${p}_{i}$.
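The result can be verified on synthetic data. This is a sketch under stated assumptions: the data are randomly generated, the sizes `m` and `n` are arbitrary, and NumPy's `eigh` stands in for whatever eigen-solver one prefers.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
m, n = 3, 500                    # 3 measurement types, 500 samples

# Correlated data: rows are measurement types, columns are samples.
X = rng.normal(size=(m, m)) @ rng.normal(size=(m, n))
X = X - X.mean(axis=1, keepdims=True)   # each row now has zero mean

C_X = (X @ X.T) / n
eigvals, E = np.linalg.eigh(C_X)        # columns of E are eigenvectors
P = E.T                                 # rows of P are eigenvectors of C_X

Y = P @ X
C_Y = (Y @ Y.T) / n

# C_Y should be diagonal, with the eigenvalues of C_X on the diagonal.
off_diagonal = C_Y - np.diag(np.diag(C_Y))

# Row i of Y is the projection of X onto p_i; since Y has zero mean,
# the mean of its squares is the variance of X along p_i.
variances_along_p = (Y ** 2).mean(axis=1)

print(np.abs(off_diagonal).max())
print(np.diag(C_Y), variances_along_p)
```

The off-diagonal entries should vanish up to rounding error, and the diagonal of `C_Y` should match both the eigenvalues of `C_X` and the per-row variances of `Y`.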

In practice, computing the PCA of a data set entails:

1. Subtracting off the mean of each measurement type. (PCA doesn’t necessarily give a unique answer. For example, if the data matrix contains temperature, it might give a different P for Celsius and Fahrenheit measurements. To reduce this dependence on the choice of offset and units, the mean is subtracted from each measurement type. In time series analysis we normally z-score the data to achieve the same goal, i.e. subtract the mean and divide by the standard deviation.)
2. Computing the eigenvectors of ${C}_{X}$.
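The two steps can be sketched as a short function. This is an illustrative sketch rather than a reference implementation: the function name `pca`, the `standardize` flag, and the synthetic temperature and humidity data are all assumptions introduced here.

```python
import numpy as np

def pca(X, standardize=False):
    """Return (P, variances) for an m x n data matrix X.

    Rows of X are measurement types, columns are samples.
    Rows of P are the principal components, sorted by decreasing variance.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)          # step 1: subtract the mean
    if standardize:
        Xc = Xc / Xc.std(axis=1, keepdims=True)     # optional z-scoring
    C_X = (Xc @ Xc.T) / n
    eigvals, E = np.linalg.eigh(C_X)                # step 2: eigenvectors of C_X
    order = np.argsort(eigvals)[::-1]               # largest variance first
    return E[:, order].T, eigvals[order]

# Hypothetical data: the same temperatures in Celsius and in Fahrenheit,
# paired with a second, correlated measurement.
rng = np.random.default_rng(2)
temp_c = rng.normal(20.0, 5.0, size=1000)
humidity = 0.5 * temp_c + rng.normal(0.0, 2.0, size=1000)
X_c = np.vstack([temp_c, humidity])
X_f = np.vstack([temp_c * 9.0 / 5.0 + 32.0, humidity])

P_c_raw, _ = pca(X_c)
P_f_raw, _ = pca(X_f)
P_c_std, vals_c_std = pca(X_c, standardize=True)
P_f_std, vals_f_std = pca(X_f, standardize=True)

print(P_c_raw, P_f_raw, sep="\n")   # differ: mean subtraction alone leaves the scale
print(P_c_std, P_f_std, sep="\n")   # agree up to sign after z-scoring
```

With mean subtraction alone the two unit choices give different principal components, because the Fahrenheit row has a larger variance. After z-scoring, the two standardized data sets coincide, so the components agree up to sign.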