Question
In Python's sklearn KMeans, I was wondering what happens internally when an ndarray of shape (n, n_features) is passed to the init parameter with n < n_clusters:
- Does it drop the given centroids and just start a k-means++ initialization, which is the default choice for the init parameter?
- Does it keep the given centroids and fill in the remaining ones using k-means++?
- Does it keep the given centroids and fill in the remaining ones with random values?
I did not expect this method to run without a warning in this case, which is why I want to know how it handles it.
Answer 1:
If you give it a mismatching init, it adjusts the number of clusters to match it, as you can see from the source. This is not documented, and I would consider it a bug.
I'll propose a fix.
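Until the behaviour changes, one way to avoid relying on it is to check the init array yourself before calling KMeans. This is only a minimal sketch: check_init_centers is a hypothetical helper name, not part of sklearn, and it assumes you want a hard error rather than a silent adjustment.

```python
def check_init_centers(init, n_clusters, n_features):
    """Raise ValueError unless init has shape (n_clusters, n_features).

    Hypothetical helper, not part of sklearn. Works on any 2-D
    sequence that supports len(), including a NumPy ndarray.
    """
    n_rows = len(init)
    if n_rows != n_clusters:
        raise ValueError(
            f"init has {n_rows} centroids, expected n_clusters={n_clusters}"
        )
    for row in init:
        if len(row) != n_features:
            raise ValueError(
                f"centroid has {len(row)} features, expected {n_features}"
            )
    return init


# Two centroids supplied for three requested clusters: rejected up front.
centers = [[0.0, 0.0], [5.0, 5.0]]
try:
    check_init_centers(centers, n_clusters=3, n_features=2)
except ValueError as exc:
    print(exc)
```

Once the check passes, the same array can be handed to KMeans through its n_clusters and init parameters, so the number of clusters you asked for is the number you get.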
Source: https://stackoverflow.com/questions/30169378/how-does-sklearn-cluster-kmeans-handle-an-init-ndarray-parameter-with-missing-ce