ELKI Kmeans clustering Task failed error for high dimensional data

傲寒 2021-01-25 22:35

I have 60000 documents, which I processed in gensim to get a 60000*300 matrix. I exported this as a CSV file. When I import this in ELKI

2 Answers
  •  被撕碎了的回忆
    2021-01-25 23:29

    The error (which took me a bit to understand, when I saw it the first time) says that your data has the "shape"

    variable,mindim=266,maxdim=300
    

    I.e. some lines have only 266 columns, some have 300. This may be a file format issue, for example due to NaN, missing values, or similar bad characters.

    You get that error when you run an algorithm like k-means that assumes the data comes from an R^d vector space (that is the NumberVector,field requirement), because the input data does not meet this requirement.
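
One way to find the offending lines before importing into ELKI is to count the fields in each row of the exported CSV. The following is a minimal sketch using only the Python standard library; the function name `column_report` and the inline sample data are made up for illustration, and it assumes a comma-separated file with purely numeric cells. It flags NaN and non-numeric cells as suspicious, but whether such a token is what actually shortens a row in ELKI's parser is not verified here.

```python
import csv
import io
import math
from collections import Counter


def column_report(text, delimiter=","):
    """Return (mindim, maxdim, bad_rows) for CSV text.

    bad_rows is a list of (line_number, reasons) for rows whose field
    count differs from the most common one, or that contain cells which
    are not finite numbers. This is a hypothetical helper, not part of
    ELKI or gensim.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        raise ValueError("no rows found")
    counts = [len(row) for row in rows]
    # Assume the most frequent field count is the intended dimensionality.
    expected = Counter(counts).most_common(1)[0][0]
    bad_rows = []
    for line_no, row in enumerate(rows, start=1):
        reasons = []
        if len(row) != expected:
            reasons.append(f"{len(row)} columns instead of {expected}")
        for cell in row:
            try:
                value = float(cell)
            except ValueError:
                reasons.append(f"non-numeric cell {cell!r}")
                continue
            if not math.isfinite(value):
                reasons.append(f"non-finite value {cell!r}")
        if reasons:
            bad_rows.append((line_no, reasons))
    return min(counts), max(counts), bad_rows


# Tiny made-up sample: row 2 contains NaN, row 3 is one column short.
sample = "0.1,0.2,0.3\n0.4,nan,0.6\n0.7,0.8\n"
mindim, maxdim, bad_rows = column_report(sample)
print(f"mindim={mindim}, maxdim={maxdim}")
for line_no, reasons in bad_rows:
    print(line_no, reasons)
```

To check a real export, read the file contents (for example with `open(path, newline="").read()`, where `path` points to your CSV) and pass them to `column_report`. If `mindim` and `maxdim` differ, the listed line numbers are the rows to inspect or re-export.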
