How can I get the eigenvalues and eigenvectors of the PCA application?
from sklearn.decomposition import PCA
clf = PCA(0.98, whiten=True)  # conserve 98% of the variance
When you say "eigenvalues", do you mean the "singular values" for PCA? Eigenvalues are only defined when the matrix PCA is applied to is a square matrix.
If you are trying to use "eigenvalues" to determine the proper dimension needed for PCA, you should actually use the singular values. You can just use pca.singular_values_ to get them.
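For reference, the singular values and the eigenvalues of the covariance matrix are directly related: lambda_i = s_i**2 / (n_samples - 1). Here is a minimal sketch of that relationship (the random X below is just placeholder data):

import numpy as np
from sklearn.decomposition import PCA

X = np.random.RandomState(0).rand(100, 5)   # placeholder (n_samples, n_features) data

pca = PCA().fit(X)

# Singular values of the centered data matrix
s = pca.singular_values_

# Eigenvalues of the covariance matrix: lambda_i = s_i**2 / (n_samples - 1)
eigvals = s ** 2 / (X.shape[0] - 1)

print(np.allclose(eigvals, pca.explained_variance_))   # True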
You are computing the eigenvectors of the correlation matrix, that is, the covariance matrix of the normalized variables. The line
data/=np.std(data, axis=0)
is not part of classic PCA; we only center the variables.
So sklearn's PCA does not feature-scale the data beforehand.
Apart from that, you are on the right track, if we set aside the fact that the code you provided did not run ;).
You only got confused with the row/column layouts. Honestly I think it's much easier to start with X = data.T
and work only with X from there on. I added your code 'fixed' at the end of the post.
You already noted that you can get the eigenvectors using clf.components_.
So you have the principal components. They are eigenvectors of the covariance matrix.
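To make that concrete, here is a small sketch (not the 'fixed' code referred to above, and using random placeholder data) that checks components_ and explained_variance_ against an explicit eigendecomposition of the covariance matrix of the centered, unscaled data:

import numpy as np
from sklearn.decomposition import PCA

data = np.random.RandomState(0).rand(50, 3)   # placeholder (n_samples, n_features) data

clf = PCA()
clf.fit(data)

# sklearn only centers the data; it does not divide by the standard deviation
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)          # sample covariance matrix (ddof=1)

# eigh returns eigenvalues in ascending order; flip to match sklearn's ordering
eigvals, eigvecs = np.linalg.eigh(cov)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

print(np.allclose(eigvals, clf.explained_variance_))             # True
# Rows of clf.components_ are the eigenvectors, up to a sign flip
print(np.allclose(np.abs(eigvecs.T), np.abs(clf.components_)))   # True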
I used the sklearn PCA function. The attribute components_ contains the eigenvectors and explained_variance_ contains the eigenvalues. Below is my test code.
from sklearn.decomposition import PCA
import numpy as np
def main():
    # Classic 2-D example data set (10 samples, 2 features)
    data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
                     [2.3, 2.7], [2, 1.6], [1, 1.1], [1.5, 1.6], [1.1, 0.9]])
    print(data)

    pca = PCA()
    pca.fit(data)

    # Eigenvectors of the covariance matrix, one per row
    print(pca.components_)
    # Corresponding eigenvalues, in decreasing order
    print(pca.explained_variance_)

if __name__ == "__main__":
    main()
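As a follow-up, the PCA(0.98) call from the question keeps the smallest number of leading components whose cumulative explained-variance ratio exceeds 0.98; you can inspect that choice with explained_variance_ratio_. A quick sketch on the same data:

import numpy as np
from sklearn.decomposition import PCA

data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
                 [2.3, 2.7], [2, 1.6], [1, 1.1], [1.5, 1.6], [1.1, 0.9]])

pca = PCA().fit(data)

# Fraction of the total variance carried by each component
print(pca.explained_variance_ratio_)

# PCA(0.98) keeps the smallest number of leading components whose
# cumulative explained-variance ratio exceeds 0.98
print(np.cumsum(pca.explained_variance_ratio_))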