Question
Hi all, I have a correlation matrix of 21 industry sectors. I want to split these 21 sectors into 4 or 5 groups, so that sectors with similar behavior end up in the same group.
Could someone shed some light on how to do this in Python? Thanks in advance!
Answer 1:
You might explore pandas' DataFrame.corr together with SciPy's hierarchical clustering module, scipy.cluster.hierarchy:
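import pandas as pd
import scipy.cluster.hierarchy as spc
from scipy.spatial.distance import pdist
# my_data: observations in rows, one column per sector
df = pd.DataFrame(my_data)
corr = df.corr().values
# Euclidean distance between the sectors' correlation profiles (rows of corr)
dists = pdist(corr)
linkage = spc.linkage(dists, method='complete')
# Cut the tree at half the largest pairwise distance; idx[i] is the cluster label of sector i
idx = spc.fcluster(linkage, 0.5 * dists.max(), criterion='distance')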
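Since the question asks for a fixed number of groups (4 or 5), a variation on the snippet above may be useful. The following is only a sketch, not part of the original answer: it assumes that 1 - correlation is an acceptable dissimilarity between sectors, and it uses fcluster with criterion='maxclust' to request at most n_groups clusters instead of a distance threshold. The helper name cluster_by_correlation, the choice of average linkage, and the synthetic demo data are all hypothetical and chosen for illustration.

```python
import numpy as np
import pandas as pd
import scipy.cluster.hierarchy as spc
from scipy.spatial.distance import squareform


def cluster_by_correlation(corr, n_groups=4, method="average"):
    """Hypothetical helper: assign each sector a cluster label (at most n_groups labels)."""
    corr = pd.DataFrame(corr)
    # Assumption: use 1 - correlation as the dissimilarity
    # (0 = perfectly correlated, 2 = perfectly anti-correlated).
    dist = 1.0 - corr.to_numpy()
    # Remove floating-point asymmetry and force an exact zero diagonal.
    dist = np.clip((dist + dist.T) / 2.0, 0.0, 2.0)
    np.fill_diagonal(dist, 0.0)
    # linkage expects the condensed (upper-triangle) form of the distance matrix.
    condensed = squareform(dist, checks=False)
    tree = spc.linkage(condensed, method=method)
    # 'maxclust' cuts the tree so that no more than n_groups clusters remain.
    labels = spc.fcluster(tree, t=n_groups, criterion="maxclust")
    return pd.Series(labels, index=corr.index, name="cluster")


# Synthetic demo data (made up): 21 sectors driven by 4 hidden factors.
rng = np.random.default_rng(0)
true_groups = np.arange(21) % 4
factors = rng.standard_normal((500, 4))
data = factors[:, true_groups] + 0.3 * rng.standard_normal((500, 21))
df = pd.DataFrame(data, columns=[f"sector_{i}" for i in range(21)])

labels = cluster_by_correlation(df.corr(), n_groups=4)
print(labels.sort_values())
```

If you already have the correlation matrix, pass it in directly instead of df.corr(). Note that 'maxclust' guarantees at most n_groups clusters, not exactly that many, and that the cluster numbering itself carries no meaning.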
Source: https://stackoverflow.com/questions/52787431/create-clusters-using-correlation-matrix-in-python