R: Clustering results are different every time I run

陌清茗 2020-12-17 04:51
library(amap)
set.seed(5)
Kmeans(mydata, 5, iter.max=500, nstart=1, method="euclidean")

in the 'amap' package and run several times, but even thoug

3 Answers
  • 2020-12-17 04:57

    Just a reminder that K-means results are sensitive to the order of the data points in the data set. If you rerun the same code with the data points in a randomized order, you may get a different result, even with the same seed.
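    To illustrate the point about row order, here is a minimal sketch. It assumes base R's `kmeans()` from the stats package instead of `amap::Kmeans()` so that it runs without extra packages, and it uses `iris` as a stand-in data set. Whether the two partitions actually differ depends on the data and the seed, so the sketch only prints the comparison.

    ```r
    # Same data, same seed, different row order.
    x <- as.matrix(iris[, -5])

    set.seed(1)
    fit_original <- kmeans(x, centers = 3, iter.max = 500, nstart = 1)

    # Shuffle the rows with a fixed permutation.
    set.seed(42)
    perm <- sample(nrow(x))
    x_shuffled <- x[perm, ]

    # Reuse the seed from the first fit on the shuffled data.
    set.seed(1)
    fit_shuffled <- kmeans(x_shuffled, centers = 3, iter.max = 500, nstart = 1)

    # Map the shuffled labels back to the original row order.
    labels_shuffled <- integer(nrow(x))
    labels_shuffled[perm] <- fit_shuffled$cluster

    # Cluster numbers are arbitrary, so compare with a cross-tabulation:
    # exactly one non-zero cell per row and column means the same partition.
    print(table(original = fit_original$cluster, shuffled = labels_shuffled))
    print(c(original = fit_original$tot.withinss,
            shuffled = fit_shuffled$tot.withinss))
    ```

    The same seed picks the same row indices as starting centers, but after shuffling those indices point at different observations, which is why the row order can matter.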

  • 2020-12-17 05:13

    You must be doing something wrong. I get reproducible results each time I run the following code, as long as I set the seed before each call to Kmeans():

    library(amap)
    
    out <- vector(mode = "list", length = 10)
    for(i in seq_along(out)) {
        set.seed(1)
        out[[i]] <- Kmeans(iris[, -5], 3, iter.max=500, nstart=1, method="euclidean")
    }
    
    for(i in seq_along(out[-1])) {
        print(all.equal(out[[i]], out[[i+1]]))
    }
    

    The last for loop prints:

    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE
    [1] TRUE
    

    This indicates that the results are exactly the same each time.

  • 2020-12-17 05:17

    Have you set the seed? set.seed(1)

    Every time K-means initializes the centroids, they are chosen randomly, so a seed is needed to make those random values reproducible.
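    A minimal sketch of why the seed has to be set before each call and not just once per session. It assumes base R's `kmeans()` as a stand-in for `amap::Kmeans()` and uses `iris` as example data.

    ```r
    x <- as.matrix(iris[, -5])

    # Seed set once: the second call continues the random number stream,
    # so it may start from different centers than the first call.
    set.seed(1)
    fit_a <- kmeans(x, centers = 3, nstart = 1)
    fit_b <- kmeans(x, centers = 3, nstart = 1)

    # Seed reset immediately before the call: same starting state as fit_a.
    set.seed(1)
    fit_c <- kmeans(x, centers = 3, nstart = 1)

    print(identical(fit_a$cluster, fit_c$cluster))  # TRUE
    print(identical(fit_a$cluster, fit_b$cluster))  # depends on the random draws
    ```

    If only the clustering line is rerun interactively while `set.seed()` stays further up in the script, the situation is the same as for `fit_b`.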
