I have a text corpus that contains 1000+ articles, each on a separate line. I am trying to use hierarchical clustering with SciPy in Python to produce clusters of related articles.
You can do the following:

1. Align your results (the clustering variable) with your input (the 1000+ articles).
2. Use a groupby function with the cluster # as its key.
3. For each group (retrieved with the get_group function), fill up a defaultdict of integers for every word you encounter.

Good luck with what you're doing and please do accept my answer if it's what you're looking for.
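The steps in the answer above can be sketched roughly as follows. This is only a minimal illustration, assuming that groupby and get_group refer to the pandas functions of those names and that clustering is a flat list with one cluster label per article. The sample articles, the labels, and the names df, word_counts and top_words are made up for the example, and the final sorting step is an assumption about what the counts are used for.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical stand-ins: `articles` is the input corpus (one article per line)
# and `clustering` holds one cluster label per article, in the same order.
articles = [
    "stocks fall as markets react to rate hike",
    "markets rally after rate decision",
    "team wins the cup final",
    "cup final ends in penalty shootout",
]
clustering = [1, 1, 2, 2]

# Step 1: align the cluster labels with the input articles.
df = pd.DataFrame({"article": articles, "cluster": clustering})

# Step 2: group the articles using the cluster number as the key.
grouped = df.groupby("cluster")

# Step 3: per group, count every word encountered in a defaultdict of integers.
word_counts = {}
for cluster_id in grouped.groups:
    counts = defaultdict(int)
    for text in grouped.get_group(cluster_id)["article"]:
        for word in text.lower().split():
            counts[word] += 1
    word_counts[cluster_id] = counts

# Assumption (not part of the steps above): sort the counts in descending
# order to inspect the three most frequent words in each cluster.
top_words = {
    cluster_id: sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:3]
    for cluster_id, counts in word_counts.items()
}
print(top_words)
```

Splitting on whitespace is the simplest possible tokenizer; on real articles you would probably also want to strip punctuation and drop stop words before counting.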