I want to use the "princomp" function of Matlab, but this function gives the eigenvalues in a sorted array. This way I can't find out which column corresponds to which eigenvalue.
With PCA, each principal component returned is a linear combination of the original columns/dimensions. Perhaps an example will clear up any misunderstanding you have.
Let's consider the Fisher-Iris dataset, comprising 150 instances and 4 dimensions, and apply PCA to the data. To make things easier to understand, I am first zero-centering the data before calling the PCA function:
load fisheriris
X = bsxfun(@minus, meas, mean(meas)); %# so that mean(X) is the zero vector
[PC score latent] = princomp(X);
Let's look at the first returned principal component (1st column of the PC matrix):
>> PC(:,1)
0.36139
-0.084523
0.85667
0.35829
This is expressed as a linear combination of the original dimensions, i.e.:
PC1 = 0.36139*dim1 + -0.084523*dim2 + 0.85667*dim3 + 0.35829*dim4
Therefore, to express the same data in the new coordinate system formed by the principal components, the new first dimension should be a linear combination of the original ones according to the above formula.
We can compute this for all components at once simply as X*PC, which is exactly what is returned in the second output of PRINCOMP (score). To confirm this, try:
>> all(all( abs(X*PC - score) < 1e-10 ))
1
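To tie the PC1 formula above back to the matrix product, here is a small sketch that builds the first new coordinate term by term and compares it with the first column of score. The names w and s1 are only illustrative and not part of PRINCOMP; the tolerance is the same one used in the check above.

```matlab
load fisheriris
X = bsxfun(@minus, meas, mean(meas));   %# zero-center, as before
[PC, score] = princomp(X);

w  = PC(:,1);                           %# weights of the first principal component
%# weighted sum of the four original dimensions, one value per instance
s1 = w(1)*X(:,1) + w(2)*X(:,2) + w(3)*X(:,3) + w(4)*X(:,4);

%# s1 should match both X*PC(:,1) and score(:,1) up to round-off
maxdiff = max(abs(s1 - score(:,1)))
```

The same holds for the other columns: score(:,k) is X*PC(:,k).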
Finally, the importance of each principal component can be determined by how much of the variance of the data it explains. This is returned by the third output of PRINCOMP (latent), in the same order as the columns of PC, so latent(k) belongs to PC(:,k).
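As a rough illustration of how latent can be read, the sketch below turns it into percentages of total variance and checks that each entry is the variance of the matching column of score. The variable names explained and cumulative are my own, not outputs of PRINCOMP.

```matlab
load fisheriris
X = bsxfun(@minus, meas, mean(meas));   %# zero-center, as before
[PC, score, latent] = princomp(X);

explained  = 100 * latent / sum(latent);  %# percent of total variance per component
cumulative = cumsum(explained);           %# running total, ends at 100

%# latent(k) is the variance of the data along PC(:,k), i.e. var(score(:,k))
vardiff = max(abs(var(score)' - latent))
```

Since latent is sorted in descending order, explained is too, and cumulative tells you how many components you need to retain a given share of the variance.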
We can compute the PCA of the data ourselves without using PRINCOMP:
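[V, E] = eig( cov(X) );                 %# columns of V: eigenvectors, diag(E): eigenvalues
[E, order] = sort(diag(E), 'descend');  %# sort eigenvalues from largest to smallest
V = V(:,order);                         %# reorder the eigenvectors to match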
The columns of V, the eigenvectors of the covariance matrix, are the principal components (same as PC above, although the signs can be inverted), and the corresponding eigenvalues E represent the amount of variance explained (same as latent). Note that it is customary to sort the principal components by their eigenvalues. And as before, to express the data in the new coordinates, we simply compute X*V (which should be the same as score above, if you make sure to match the signs).
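One possible way to match the signs is sketched below: each eigenvector is flipped when its dot product with the corresponding column of PC is negative. The helper variable sgn is hypothetical, and this assumes the eigenvalues are distinct so that the columns pair up one to one (which is the case for this dataset).

```matlab
load fisheriris
X = bsxfun(@minus, meas, mean(meas));   %# zero-center, as before
[PC, score, latent] = princomp(X);

[V, D] = eig( cov(X) );
[E, order] = sort(diag(D), 'descend');
V = V(:,order);

%# dot product of each column of V with the same column of PC is +1 or -1
sgn = sign(sum(V .* PC, 1));
V = bsxfun(@times, V, sgn);             %# flip the columns that point the other way

%# after sign matching, all three differences should be at round-off level
diffPC     = max(max(abs(V - PC)))
diffScore  = max(max(abs(X*V - score)))
diffLatent = max(abs(E - latent))
```

The sign of an eigenvector is arbitrary, so neither version is more correct; flipping a column of V simply mirrors the corresponding column of X*V.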