Clustering results change on every run in Python scikit-learn


借酒劲吻你 2020-12-17 15:53

I have a bunch of sentences and I want to cluster them using scikit-learn spectral clustering. I've run the code and get the results with no problem. But, every time I run

4 answers

囚心锁ツ (OP)

2020-12-17 16:31
When using k-means, set the random_state parameter in KMeans (see the documentation) to either an int or a RandomState instance.
```
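from sklearn.cluster import KMeans

# number_of_k and X_data come from your own code; a fixed random_state makes runs repeatable
km = KMeans(n_clusters=number_of_k, init='k-means++',
            max_iter=100, n_init=1, verbose=0, random_state=3425)
km.fit(X_data)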
```
This is important because k-means is not a deterministic algorithm. It usually starts with some randomized initialization procedure, and this randomness means that different runs will start at different points. Seeding the pseudo-random number generator ensures that this randomness will always be the same for identical seeds.
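To see the effect of seeding, here is a minimal sketch that fits k-means twice with the same seed and compares the labels. The synthetic data from make_blobs and the helper name cluster_labels are assumptions for illustration, not part of the original question.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data standing in for the real feature matrix (assumption for illustration).
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)


def cluster_labels(seed):
    """Fit k-means with a fixed seed and return the cluster label of each sample."""
    km = KMeans(n_clusters=3, init="k-means++", n_init=1,
                max_iter=100, random_state=seed)
    return km.fit_predict(X)


first = cluster_labels(3425)
second = cluster_labels(3425)

# Identical seeds produce the same initial centroids, so the labels should match.
same_seed_match = np.array_equal(first, second)
print(same_seed_match)
```

Leaving random_state unset (None) draws a fresh initialization on each run, which is where run-to-run differences in the labels come from.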

I'm not sure about the spectral clustering example though. From the documentation on the random_state parameter: "A pseudo random number generator used for the initialization of the lobpcg eigen vectors decomposition when eigen_solver == 'amg' and by the K-Means initialization." OP's code doesn't seem to fall under either of those cases, though setting the parameter might still be worth a shot.
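Trying that suggestion could look like the sketch below, which passes random_state to SpectralClustering and checks whether two runs agree. The synthetic data, the rbf affinity, and the helper name spectral_labels are assumptions for illustration. OP's actual affinity matrix and settings are not shown, so treat this as an experiment to adapt rather than a confirmed fix.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs

# Synthetic data standing in for OP's sentence features (assumption for illustration).
X, _ = make_blobs(n_samples=120, centers=3, random_state=0)


def spectral_labels(seed):
    """Run spectral clustering with a fixed seed and return the label of each sample."""
    sc = SpectralClustering(n_clusters=3, affinity="rbf",
                            assign_labels="kmeans", random_state=seed)
    return sc.fit_predict(X)


run_a = spectral_labels(3425)
run_b = spectral_labels(3425)

# Compare the two runs; equal arrays mean the seed made the result repeatable here.
labels_repeatable = np.array_equal(run_a, run_b)
print(labels_repeatable)
```

With assign_labels="kmeans", the final step is a k-means fit on the spectral embedding, so the seed reaches the same kind of randomized initialization described above.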