I wish to generate a dendrogram based on correlation using pandas and scipy. I use a dataset (as a DataFrame) consisting of returns, whic
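Found the solution. If you have already calculated a square distance matrix (correlation-based or anything else), you simply have to condense it with distance.squareform (that is, scipy.spatial.distance.squareform) before passing it to linkage. Otherwise linkage interprets a 2-D array as a set of observation vectors rather than as pairwise distances. That is,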
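import matplotlib.pyplot as plt
import pandas as pd
import scipy.cluster.hierarchy as hc
from scipy.spatial.distance import squareform

# random_returns and dates are assumed to be defined as in the question
dataframe = pd.DataFrame(data=random_returns, index=dates)
dist = 1 - dataframe.corr()  # correlation distance, not the correlation itself
dist_condensed = squareform(dist, checks=False)  # convert to condensed form
z = hc.linkage(dist_condensed, method='average')
dendrogram = hc.dendrogram(z, labels=dist.columns.tolist())
plt.show()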
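
For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same steps. The synthetic returns, the tickers A to D, and the date range are made up for illustration and are not from the question. It passes checks=False to squareform on the assumption that floating-point rounding could otherwise trip the zero-diagonal validation, and no_plot=True so it runs without a display.

```python
import numpy as np
import pandas as pd
from scipy.cluster import hierarchy as hc
from scipy.spatial.distance import squareform

# Synthetic stand-in for the returns data (hypothetical tickers and dates).
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=250, freq="B")
returns = pd.DataFrame(
    rng.normal(size=(250, 4)), index=dates, columns=["A", "B", "C", "D"]
)

# Correlation distance: 0 for perfectly correlated columns, 2 for anti-correlated.
dist = 1 - returns.corr()

# Condense the square matrix to its n*(n-1)/2 upper-triangle entries.
# checks=False skips the symmetry and zero-diagonal validation.
condensed = squareform(dist.to_numpy(), checks=False)

z = hc.linkage(condensed, method="average")

# no_plot=True computes the layout without drawing; remove it (and call
# matplotlib's plt.show()) to display the dendrogram.
tree = hc.dendrogram(z, labels=dist.columns.tolist(), no_plot=True)
print(condensed.shape, z.shape, tree["ivl"])
```

With 4 columns the condensed vector has 6 entries and the linkage matrix has 3 rows, one per merge; tree["ivl"] lists the column labels in leaf order.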