Is it possible to run a clustering algorithm with chunked distance matrices?
Question

I have a distance/dissimilarity matrix (30K rows × 30K columns) that is calculated in a loop and stored on disk. I would like to run clustering on this matrix. I import and cluster it as follows:

    Mydata <- read.csv("Mydata.csv")
    Mydata <- as.dist(Mydata)
    Results <- hclust(Mydata)

But when I convert the matrix to a dist object, I get a RAM limitation error. How can I handle this? Can I run the hclust algorithm in a loop or on chunks? That is, can I divide the distance matrix into chunks and process them in a loop?

Answer 1: You may