Getting Xmeans clusterer output programmatically in Weka

死守一世寂寞 2021-02-06 14:18

When using KMeans in Weka, one can call getAssignments() on the trained model to get the cluster assignment for each given instance. Here's a (truncated) Jython

1 answer
  • 2021-02-06 15:11

    Here's a reply to my question from the Weka listserv:

     "Not as such. But all clusterers have a clusterInstance() method. You can 
     pass each training instance through the trained clustering model to 
     obtain the cluster index for each."
    

    Here's my Jython implementation of this suggestion:

     >>> from java.io import BufferedReader, FileReader
     >>> from weka.core import Instances
     >>> from weka.clusterers import XMeans
     >>> reader = BufferedReader(FileReader("some arff file"))
     >>> data = Instances(reader)  # load the ARFF data once
     >>> reader.close()
     >>> xmeans = XMeans()
     >>> xmeans.setMaxNumClusters(100)
     >>> xmeans.setMinNumClusters(2)
     >>> xmeans.buildClusterer(data)  # here's our model
     >>> # enumerateInstances() yields the instances in order; enumerate() supplies the index
     >>> for index, instance in enumerate(data.enumerateInstances()):
     ...     cluster_num = xmeans.clusterInstance(instance)  # pass each instance through the model
     ...     print "instance #", index, "is in cluster ", cluster_num  # pretty-print results
     ...
    
     instance # 0 is in cluster  1
     instance # 1 is in cluster  1
     instance # 2 is in cluster  0
     instance # 3 is in cluster  0
    

    I'm leaving all of this up as a reference, since the same approach can be used to get cluster assignments from any of Weka's clusterers.
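
To make the "any clusterer" point concrete, here is a minimal sketch of a reusable helper. It relies only on clusterInstance(), numInstances() and instance(i), which Weka's clusterers and Instances expose. The name cluster_assignments and the two Toy classes are hypothetical: the Toy classes are stand-ins I made up so the snippet runs without Weka on the classpath, and they are not part of Weka.

```python
def cluster_assignments(clusterer, data):
    """Return a list whose i-th entry is the cluster index of instance i.

    Uses only clusterInstance(), numInstances() and instance(i), so it
    should work with any trained Weka clusterer, not just XMeans.
    """
    return [clusterer.clusterInstance(data.instance(i))
            for i in range(data.numInstances())]


# --- Hypothetical stand-ins (NOT Weka classes) so the sketch runs anywhere ---
class ToyInstances(object):
    """Mimics the two weka.core.Instances methods the helper calls."""

    def __init__(self, rows):
        self._rows = rows

    def numInstances(self):
        return len(self._rows)

    def instance(self, i):
        return self._rows[i]


class ToyClusterer(object):
    """Mimics clusterInstance() by picking the nearest of some fixed centers."""

    def __init__(self, centers):
        self._centers = centers

    def clusterInstance(self, value):
        distances = [abs(value - center) for center in self._centers]
        return distances.index(min(distances))


toy_data = ToyInstances([9.5, 10.2, 0.3, 1.1])
toy_model = ToyClusterer([1.0, 10.0])
assignments = cluster_assignments(toy_model, toy_data)
print(assignments)
```

In a Jython session with Weka loaded, the same helper would presumably be called as cluster_assignments(xmeans, data) once buildClusterer(data) has run, in place of the toy objects.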
