Question
Possible Duplicate:
How do I determine k when using k-means clustering?
How can I choose K initially if I do not know anything about the data?
Can someone help me choose K?
Thanks, Navin
Answer 1:
The basic idea is to evaluate a clustering score on sample data. Usually the score combines the distances inside each cluster with the distances between clusters. The higher this measure (tight clusters that lie far apart), the better the clustering, so you can use it to select the best clustering parameters, including K. One such metric can be found here: http://alias-i.com/lingpipe/docs/api/com/aliasi/cluster/ClusterScore.html
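As a rough illustration of that idea, here is a minimal Python sketch that runs k-means for several values of K and scores each result with the mean silhouette, one common measure that compares within-cluster distance to the distance to the nearest other cluster. Everything in it is an assumption made for illustration and is not taken from the linked LingPipe class: the toy points, the farthest-point initialization, the helper names (init_centroids, kmeans, silhouette), and the choice of silhouette as the score.

```python
import math


def init_centroids(points, k):
    """Farthest-point initialization (an illustrative, deterministic choice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iters=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points, where a is the mean
    distance to the point's own cluster and b is the mean distance to the
    nearest other cluster. Higher is better; the range is [-1, 1]."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters score 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != l
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Toy data (made up for this example): three small, well-separated groups.
offsets = [(0, 0), (0.5, 0), (-0.5, 0), (0, 0.5), (0, -0.5)]
centers = [(0, 0), (10, 0), (5, 9)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

# Score every candidate K and keep the one with the highest silhouette.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On real data the scores are rarely this clear-cut, so it is worth looking at the whole curve of scores over K and not only at the maximum.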
Answer 2:
Seriously, what do you want to know? Do you want us to tell you some number? Or a strategy for finding the optimal k? You have to read a book or other resources about k-means; I'm pretty sure it is covered there.
There is something on Wikipedia about it:
http://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set
Before you use an algorithm, read about it.
Source: https://stackoverflow.com/questions/6212690/how-to-optimal-k-in-k-means-algorithm