Linear Algebra Involved

Given the data set X (an m×n matrix), where m is the number of measurement types and n is the number of samples, and assuming each row of X has zero mean, the goal is mathematically to find some orthonormal matrix P in Y=PX such that the covariance matrix {C}_{Y}=\frac {1}{n}(Y{Y}^{T}) is a diagonal matrix. The rows of P are the principal components of X.


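{C}_{Y}=\frac {1}{n}(Y{Y}^{T})

=\frac {1}{n}((PX){(PX)}^{T})

=\frac {1}{n}(PX{X}^{T}{P}^{T})

=P(\frac {1}{n}X{X}^{T}){P}^{T}

=P{C}_{X}{P}^{T}

where {C}_{X}=\frac {1}{n}X{X}^{T} is the covariance matrix of X.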


Let’s state some elementary linear algebra theorems that will help put the above formula in a more meaningful form.

  • The inverse of an orthogonal matrix is its transpose.
  • For any matrix A, both {A}^{T}A and A{A}^{T} are symmetric.
  • A real matrix is symmetric if and only if it is orthogonally diagonalizable.
  • A symmetric matrix is diagonalized by a matrix of its orthonormal eigenvectors, i.e., A=ED{E}^{T}, where D is a diagonal matrix holding the eigenvalues of A and E is the matrix with the eigenvectors of A arranged as columns.
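
The theorems above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the random matrix A and the names S, E and D are illustrative choices, not part of the original derivation.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed; the data is purely illustrative
A = rng.normal(size=(3, 5))         # any rectangular matrix will do

# For any matrix A, both A A^T and A^T A are symmetric.
S = A @ A.T
G = A.T @ A
print(np.allclose(S, S.T), np.allclose(G, G.T))

# np.linalg.eigh is intended for symmetric matrices. It returns the
# eigenvalues and the orthonormal eigenvectors as the columns of E.
eigvals, E = np.linalg.eigh(S)
D = np.diag(eigvals)

# E is orthogonal, so its inverse is its transpose.
print(np.allclose(np.linalg.inv(E), E.T))

# A symmetric matrix is diagonalized by its eigenvectors: S = E D E^T.
print(np.allclose(E @ D @ E.T, S))
```

Each check should print True up to floating-point tolerance; `np.allclose` is used instead of exact equality because the decomposition is computed numerically.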

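Now, we select the matrix P such that each row {p}_{i} is an eigenvector of {C}_{X}=\frac {1}{n}X{X}^{T}. Since {C}_{X} is symmetric, {C}_{X}=ED{E}^{T}, so this choice is P={E}^{T} and {C}_{X}={P}^{T}DP. Because P is orthogonal, {P}^{-1}={P}^{T}, and therefore

{C}_{Y}=P{C}_{X}{P}^{T}

=P({P}^{T}DP){P}^{T}

=(P{P}^{T})D(P{P}^{T})

=(P{P}^{-1})D(P{P}^{-1})

=D

Therefore, {C}_{Y} is diagonal, and the {i}^{th} diagonal value of {C}_{Y} is the variance of X along {p}_{i}.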
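
To see the result above on actual numbers, here is a small sketch that takes the rows of P to be the eigenvectors of {C}_{X} and checks that {C}_{Y} comes out diagonal. It assumes NumPy; the synthetic data and the variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 500                              # m measurement types, n samples

# Synthetic correlated data: mix independent noise with a random matrix.
X = rng.normal(size=(m, m)) @ rng.normal(size=(m, n))
X = X - X.mean(axis=1, keepdims=True)      # make every row zero-mean

C_X = (X @ X.T) / n                        # covariance matrix of X
eigvals, E = np.linalg.eigh(C_X)           # columns of E are eigenvectors of C_X
P = E.T                                    # rows of P are the principal components

Y = P @ X
C_Y = (Y @ Y.T) / n                        # covariance matrix of Y

# Off-diagonal entries of C_Y vanish up to rounding error.
off_diag = C_Y - np.diag(np.diag(C_Y))
print(np.abs(off_diag).max())

# The i-th diagonal entry is the variance of X projected onto p_i,
# which is also the i-th eigenvalue of C_X.
proj_var = np.var(Y, axis=1)
print(np.allclose(np.diag(C_Y), proj_var), np.allclose(np.diag(C_Y), eigvals))
```

Note that `np.linalg.eigh` returns eigenvalues in ascending order, so the last row of P here corresponds to the direction of largest variance.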

In practice, computing PCA of a data set entails:

  1. Subtracting off the mean of each measurement type. (PCA doesn’t necessarily give a unique answer. For example, if the data matrix holds temperatures, it might give a different P for Celsius and Fahrenheit measurements. To reduce this effect, the mean is subtracted from each measurement type. In time series analysis we normally z-score the data to achieve the same goal, i.e., subtract the mean and divide by the standard deviation.)
  2. Computing the eigenvectors of {C}_{X}.
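
The two steps above can be sketched as a short function. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes NumPy, the function name `pca` and its `standardize` flag are hypothetical, and the temperature and humidity data are made up to illustrate the Celsius versus Fahrenheit remark.

```python
import numpy as np

def pca(X, standardize=False):
    """Rows of X are measurement types, columns are samples."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=1, keepdims=True)         # step 1: subtract row means
    if standardize:
        Xc = Xc / Xc.std(axis=1, keepdims=True)    # optional z-scoring
    C_X = (Xc @ Xc.T) / Xc.shape[1]
    eigvals, E = np.linalg.eigh(C_X)               # step 2: eigenvectors of C_X
    order = np.argsort(eigvals)[::-1]              # largest variance first
    return E[:, order].T, eigvals[order]           # rows of P, variances

# Hypothetical data: one temperature series and one correlated second series.
rng = np.random.default_rng(2)
temp_c = 20 + 5 * rng.normal(size=200)
humidity = 50 + 0.8 * (temp_c - 20) + 2 * rng.normal(size=200)

X_c = np.vstack([temp_c, humidity])                # temperature in Celsius
X_f = np.vstack([temp_c * 9 / 5 + 32, humidity])   # same data in Fahrenheit

P_c, _ = pca(X_c)
P_f, _ = pca(X_f)
P_cz, _ = pca(X_c, standardize=True)
P_fz, _ = pca(X_f, standardize=True)

# Eigenvectors are only defined up to sign, so compare absolute values.
print(np.allclose(np.abs(P_c), np.abs(P_f)))       # mean subtraction only
print(np.allclose(np.abs(P_cz), np.abs(P_fz)))     # z-scored data
```

In this sketch, mean subtraction removes the offset between the two temperature scales but not the factor of 9/5, so the two P matrices still differ; after z-scoring, both unit choices lead to the same principal components up to sign.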
