PCA with sklearn. Unable to figure out feature selection with PCA

Submitted by 天大地大妈咪最大 on 2019-12-08 07:17:02

Question


I have been trying to do some dimensionality reduction using PCA. I currently have an image of size (100, 100) and I am using a filter bank of 140 Gabor filters, where each filter gives me a response that is again a (100, 100) image. Now, I want to do feature selection where I keep only non-redundant features, and I read that PCA might be a good way to do this.

So I proceeded to create a data matrix with 10000 rows and 140 columns, so each row contains the 140 Gabor filter responses for the corresponding pixel. Now, as I understand it, I can decompose this matrix using PCA as

from sklearn.decomposition import PCA

pca = PCA(n_components=3)  # keep the 3 directions of highest variance
pca.fit(Q)                 # Q is my 10000 x 140 matrix
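(For reference, a minimal sketch of what the fitted PCA object exposes in scikit-learn; Q below is a random stand-in for the real 10000 x 140 response matrix:)

import numpy as np
from sklearn.decomposition import PCA

Q = np.random.rand(10000, 140)        # stand-in for the Gabor response matrix

pca = PCA(n_components=3)
Z = pca.fit_transform(Q)              # shape (10000, 3): each pixel projected onto the 3 components

print(pca.explained_variance_ratio_)  # fraction of total variance captured by each component
print(pca.components_.shape)          # (3, 140): each component is a weight vector over the 140 filters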

However, now I am confused as to how I can figure out which of these 140 feature vectors to keep. I am guessing it should give me 3 of these 140 vectors (corresponding to the Gabor filters that contain the most information about the image), but I have no idea how to proceed from here.


Answer 1:


PCA will give you a linear combination of features, not a selection of features. It will give you the linear combination that is the best for reconstruction in the L2 sense, aka the one that captures the most variance.
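To make that distinction concrete, here is a small sketch (with random stand-in data) showing that each principal component is a weight vector over all 140 columns rather than a pick of 3 of them; ranking filters by loading magnitude, as at the end, is only a rough heuristic and not true feature selection:

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(10000, 140)        # stand-in for the Gabor response matrix

pca = PCA(n_components=3).fit(X)

# Each row of components_ mixes all 140 original features.
print(pca.components_.shape)                      # (3, 140)

# A rough "which filters matter" ranking from the loading magnitudes:
importance = np.abs(pca.components_).sum(axis=0)  # shape (140,)
print(np.argsort(importance)[::-1][:10])          # indices of the 10 most heavily weighted filters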

What is your goal? If you do this on one image, any kind of selection will give you features that best discriminate some parts of the image against other parts of the same image.

Also: Gabor filters are a sparse basis for natural images. I would not expect anything interesting to happen unless you have very specific images.



Source: https://stackoverflow.com/questions/26757412/pca-with-sklearn-unable-to-figure-out-feature-selection-with-pca
