Is it possible to get the same kmeans clusters on every execution for a particular data set? Just as we can use a fixed seed for a random value, is it possible to stop the randomness?
Yes, calling set.seed(foo) immediately prior to running kmeans(....) will give the same random start and hence the same clustering each time. foo is a seed, like 42 or some other numeric value.
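To make that concrete, here is a minimal sketch that bundles the two calls so the seed is never forgotten. The helper name seeded_kmeans and the use of the built-in iris data are my own assumptions for illustration, not something from the question.

```r
# Hypothetical helper (not part of base R): fix the seed, then cluster.
seeded_kmeans <- function(x, centers, seed = 42, ...) {
  set.seed(seed)              # resets the global RNG state
  kmeans(x, centers, ...)     # random starting centers are now reproducible
}

dat <- iris[, 1:4]            # built-in data set, numeric columns only
fit_a <- seeded_kmeans(dat, centers = 3)
fit_b <- seeded_kmeans(dat, centers = 3)

# Same seed, same data, same k: the cluster assignments should match.
identical(fit_a$cluster, fit_b$cluster)
```

Note that set.seed changes the global random number generator state, so any random draws made after the call are affected as well.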
Yes. Use set.seed to seed the random number generator before doing the clustering.
Using the example from the kmeans help page:
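# Generate two groups of 2-d points (seeded so the data are reproducible)
set.seed(1)
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(x) <- c("x", "y")

# Reset the seed to the same value before each call to kmeans
set.seed(2)
XX <- kmeans(x, 2)
set.seed(2)
YY <- kmeans(x, 2)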
Test for equality:
identical(XX, YY)
[1] TRUE
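As a side note, if the goal is to remove the random start altogether rather than to fix it, one option is to pass a matrix of starting centers instead of a number of clusters. This is only a sketch: the iris data and the choice of rows 1, 51 and 101 as starting centers are arbitrary assumptions for illustration.

```r
dat <- iris[, 1:4]                        # built-in data set, numeric columns only

# Assumption for illustration: use three fixed rows as the starting centers.
init <- as.matrix(dat[c(1, 51, 101), ])

fit_a <- kmeans(dat, centers = init)
set.seed(123)                             # deliberately change the RNG state
fit_b <- kmeans(dat, centers = init)

# No random start is drawn when centers is a matrix, so the runs should agree.
identical(fit_a$cluster, fit_b$cluster)
```

The trade-off is that the result now depends on the starting centers you picked, so a poor choice can lead to a poor local optimum.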