I am using SciPy's hierarchical agglomerative clustering methods to cluster an m x n matrix of features, but after the clustering is complete, I can't seem to figure out how to
A possible solution is a function that returns a codebook with the centroids, like kmeans in scipy.cluster.vq does. All you need is the partition as a vector of flat cluster labels, part, and the original observations X.
import numpy as np

def to_codebook(X, part):
    """
    Calculates centroids according to flat cluster assignment

    Parameters
    ----------
    X : array, (n, d)
        The n original observations with d features
    part : array, (n,)
        Partition vector. part[i] = c is the cluster assigned to observation i

    Returns
    -------
    codebook : array, (k, d)
        Returns a k x d codebook with k centroids
    """
    codebook = []
    # Assumes the labels are contiguous integers; a label in this range
    # with no members would give a row of NaNs.
    for i in range(part.min(), part.max() + 1):
        # Mean over all observations assigned to cluster i
        codebook.append(X[part == i].mean(0))
    return np.vstack(codebook)
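
For completeness, here is a rough end-to-end sketch of how the partition vector might be obtained and fed into a centroid calculation. The toy data, the choice of Ward linkage and the cut into two clusters are all assumptions made up for illustration. The helper centroids_from_labels is a hypothetical variant of to_codebook that loops over np.unique(part) rather than a range, so gaps in the label numbering do not produce NaN rows.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def centroids_from_labels(X, part):
    """Hypothetical variant of to_codebook: one centroid per label present."""
    labels = np.unique(part)  # sorted labels that actually occur in part
    codebook = np.vstack([X[part == c].mean(axis=0) for c in labels])
    return labels, codebook


# Toy observations (made up): two well-separated groups of three points
X = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
])

# Build the hierarchy; Ward linkage is only an example choice
Z = linkage(X, method="ward")

# Cut the tree into at most 2 flat clusters; part[i] is the label of X[i]
part = fcluster(Z, t=2, criterion="maxclust")

labels, codebook = centroids_from_labels(X, part)
print(labels)
print(codebook)
```

Row j of codebook is the centroid of the cluster labelled labels[j]. Which of the two groups receives which label depends on fcluster, so use labels to look up a centroid rather than relying on row order.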