Question
I am trying to use ELKI's SLINK implementation of hierarchical clustering in my program.
I have a set of objects (of my own type) that need to be clustered. For that, I convert them to feature vectors before clustering.
This is how I currently get it to run and produce a result (the code is in Scala):
val clusterer = new SLINK(CosineDistanceFunction.STATIC, 3)
val connection = new ArrayAdapterDatabaseConnection(featureVectors)
val database = new StaticArrayDatabase(connection, null)
database.initialize()
val result = clusterer.run(database).asInstanceOf[Clustering[_ <: Model]]
Now, the result is a Clustering that contains elements of type Model. I can output them, but I don't know how to make sense of this result, especially since SLINK returns models of type DendrogramModel, which does not seem to be parametrizable.
Specifically, how can I link the results back to my original elements (the ones from which I created the variable featureVectors earlier)?
I assume I need to create some kind of custom model, or somehow maintain a link to the original elements through initialization and execution of the algorithm, so that I can retrieve them from the result. I cannot find where to get started on this, though.
I am aware that embedding ELKI into one's own programs is discouraged. However, it seems that calling ELKI in some other way would not be any different: I need to cluster and map the results back to my objects at runtime.
Answer 1:
The DendrogramModel does not include the objects in the cluster. Models are additional metadata on the clusters.
Use the getIDs() method to access the members of a Cluster instance.
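To link cluster members back to the original elements, one option is to keep the original objects in a sequence whose order matches the rows of featureVectors, then look each member up by its row offset. The following is only a minimal sketch of that bookkeeping in plain Scala, without any ELKI calls. Document, ClusterMapping, toFeatureVectors and resolveMembers are hypothetical names made up for illustration. It assumes the IDs returned by getIDs() can be turned into zero-based row offsets; how to do that conversion depends on the ELKI version and is not shown here.

```scala
// Hypothetical domain type; stands in for "objects of my own type".
final case class Document(title: String, features: Array[Double])

object ClusterMapping {
  // Build the matrix handed to the clustering library. Row order is kept
  // identical to the order of `items`, so row i describes items(i).
  def toFeatureVectors(items: IndexedSeq[Document]): Array[Array[Double]] =
    items.map(_.features).toArray

  // Resolve one cluster's member offsets back to the original objects.
  // Assumption: `memberOffsets` are zero-based row indices derived from
  // the IDs that getIDs() returns for a single Cluster.
  def resolveMembers[A](items: IndexedSeq[A], memberOffsets: Seq[Int]): Seq[A] =
    memberOffsets.map(i => items(i))
}
```

For example, with three items in a Vector, resolveMembers(items, Seq(0, 1)) returns the first two original objects. The important point is that the sequence of originals is never reordered between building featureVectors and reading the result.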
Source: https://stackoverflow.com/questions/17687533/using-elki-on-custom-objects-and-making-sense-of-results