How to use scikit-learn PCA for feature reduction and know which features are discarded

清歌不尽 asked 2021-01-30 17:37

I am trying to run a PCA on a matrix of dimensions m x n where m is the number of features and n the number of samples.

Suppose I want to preserve the nf fe

3 Answers
  •  孤独总比滥情好
    answered 2021-01-30 18:12

    Projecting the features onto the principal components retains the important information (the axes with maximum variance) and drops the axes with small variance. This behaves like compression, not like discarding features.

    X_proj would be a better name than X_new, since it is the projection of X onto the principal components.

    You can reconstruct an approximation X_rec of the original data as

    X_rec = pca.inverse_transform(X_proj)  # X_proj is X_new renamed
    

    Here, X_rec is close to X, but the less important information was dropped by PCA, so X_rec can be regarded as a denoised version of X.

    In my opinion, you could say that what is discarded is the noise (the low-variance directions), not individual features.
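    The idea above can be sketched end to end. This is a minimal, hypothetical example (the toy data and the choice of nf=2 components are made up for illustration; only the names X, X_proj, and X_rec follow the answer). Note that scikit-learn expects samples as rows, so a "features x samples" matrix like the one in the question must be transposed first:

    ```python
    import numpy as np
    from sklearn.decomposition import PCA

    # Toy data: 100 samples, 5 features (hypothetical values).
    rng = np.random.RandomState(0)
    X = rng.rand(100, 5)

    nf = 2  # hypothetical number of components to keep
    pca = PCA(n_components=nf)
    X_proj = pca.fit_transform(X)          # project X onto the top nf components
    X_rec = pca.inverse_transform(X_proj)  # map back to the original feature space

    print(X_proj.shape)  # (100, 2) -- compressed representation
    print(X_rec.shape)   # (100, 5) -- same shape as X, but denoised
    # X_rec only approximates X: the variance along the discarded axes is gone.
    print(np.allclose(X, X_rec))  # False, since nf is less than the full rank
    ```

    Inspecting pca.components_ (shape nf x n_features) then shows how much each original feature contributes to each retained component, which is as close as PCA gets to saying which features were "discarded".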
