scikit-learn TruncatedSVD explained_variance_ratio_ not in descending order? [duplicate]

Question

This question is actually a duplicate of this one, which however remains unanswered at the time of writing.

Why is the explained_variance_ratio_ from TruncatedSVD not in descending order, as it is for PCA? In my experience the first element of the array is always the lowest; the value jumps up at the second element and then decreases from there. In other words, why is explained_variance_ratio_[0] < explained_variance_ratio_[1], while explained_variance_ratio_[1] > explained_variance_ratio_[2] > explained_variance_ratio_[3] > ...? Does this mean the second "component" actually explains the most variance, not the first?

Code to reproduce behavior:

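import numpy as np
from sklearn.decomposition import TruncatedSVD

n_components = 50
# 50 samples x 100 features, uniform in [0, 1), so the data is not centered
X_test = np.random.rand(50, 100)

model = TruncatedSVD(n_components=n_components, algorithm='randomized')
model.fit_transform(X_test)
# The first ratio is smaller than the second one
print(model.explained_variance_ratio_)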

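Answer 1:

If you scale the data first, the explained variance ratios should come out in descending order. I think the reason is that TruncatedSVD, unlike PCA, does not center the data before decomposing it. On uncentered data the first component mostly follows the mean of the features, so the projections onto it vary little from sample to sample and its explained variance is small, even though its singular value is the largest. StandardScaler subtracts the column means (and also scales to unit variance), after which the components are ordered by variance as in PCA: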

import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.preprocessing import StandardScaler

n_components = 50
X_test = np.random.rand(50, 100)

# Center each feature (and scale it to unit variance) before the SVD
scaler = StandardScaler()
X_test = scaler.fit_transform(X_test)

model = TruncatedSVD(n_components=n_components, algorithm='randomized')
model.fit_transform(X_test)
print(model.explained_variance_ratio_)
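
To check whether centering alone, rather than scaling to unit variance, is what changes the ordering, here is a minimal sketch that fits TruncatedSVD on the raw data and on the same data with only the column means subtracted. The helper names fit_ratios and is_descending, the fixed seed, and the choice of 10 components are assumptions made for this illustration, not part of the original question.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

rng = np.random.RandomState(0)
X = rng.rand(50, 100)  # uniform in [0, 1): every column mean is far from zero


def fit_ratios(data, n_components=10):
    """Fit TruncatedSVD and return (explained_variance_ratio_, singular_values_)."""
    svd = TruncatedSVD(n_components=n_components, algorithm="randomized",
                       random_state=0)
    svd.fit(data)
    return svd.explained_variance_ratio_, svd.singular_values_


def is_descending(values):
    """True if each value is <= the one before it."""
    return bool(np.all(np.diff(values) <= 0))


# Raw data: TruncatedSVD does not subtract the column means itself.
raw_ratio, raw_sv = fit_ratios(X)

# Same data with only the column means removed (no scaling to unit variance).
centered_ratio, centered_sv = fit_ratios(X - X.mean(axis=0))

print("raw ratios descending:         ", is_descending(raw_ratio))
print("raw singular values descending:", is_descending(raw_sv))
print("centered ratios descending:    ", is_descending(centered_ratio))
```

On this data the raw ratios should start with a small first entry, while the singular values stay sorted in both cases. That suggests the components are ordered by singular value, and that this ordering only coincides with the explained-variance ordering once the data has been centered.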

Source: https://stackoverflow.com/questions/54411576/sci-kit-learn-truncatedsvd-explained-variance-ratio-not-in-descending-order

Tags

python

scikit-learn

svd

variance