K-means: finding the elbow when the elbow plot is a smooth curve

Submitted by て烟熏妆下的殇ゞ on 2019-12-05 05:35:33

I think it is better to use only your "within class distortion" as the quantity to optimize:

%% Compute within class distortion
muB = repmat(mu(nn,:),length(I),1);
distort = distort+sum(sum((CSDmat(I,:)-muB).^2));
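
For readers without the surrounding loop, here is a minimal self-contained sketch of the same computation. The names X (one data point per row) and labels (one cluster index per point) are hypothetical stand-ins for the question's CSDmat and cluster assignments, and the four points are made up for illustration only.

```matlab
% Hypothetical toy data: two tight clusters of two points each
X = [0 0; 0 2; 10 10; 10 12];
labels = [1; 1; 2; 2];          % assumed cluster index of each row of X

distort = 0;
for nn = 1:max(labels)
    I = find(labels == nn);     % rows belonging to cluster nn
    mu_nn = mean(X(I,:), 1);    % centroid of cluster nn
    % squared distances of the cluster's points to its centroid
    d = bsxfun(@minus, X(I,:), mu_nn);
    distort = distort + sum(sum(d.^2));
end
disp(distort)
```

Each cluster contributes the sum of squared distances of its points to its own centroid, so the total shrinks as more clusters are added.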

Use this distortion value directly, without dividing it by "distort_across". If you calculate the "derivative" of it:

unexplained_error = within_class_distortion;
derivative = diff(unexplained_error);
plot(derivative)

derivative(k) is the change in unexplained error from adding one more cluster. It is negative while the error is falling, so its magnitude tells you how much the error has decreased. I suggest that you stop adding clusters when this decrease is less than one tenth of the first decrease you obtained.

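for i = 1:length(derivative)
    % diff() yields negative values while the error is falling, so compare magnitudes
    if abs(derivative(i)) < abs(derivative(1))/10
        break
    end
end
k_opt = i+1;  % assumes unexplained_error(k) is the distortion for k clusters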
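
The same stopping rule can also be sketched without an explicit loop. The distortion values below are hypothetical and chosen only to show the arithmetic; the sketch assumes that element k of the vector is the distortion obtained with k clusters, and it keeps the "one tenth of the first decrease" threshold suggested above.

```matlab
% Hypothetical within-class distortion for k = 1..7 clusters
within_class_distortion = [1000 400 200 120 100 95 92];

% Size of each drop in error; diff() is negative for a falling curve
decrease = -diff(within_class_distortion);

% First drop smaller than one tenth of the first drop
idx = find(decrease < decrease(1)/10, 1);
if isempty(idx)
    idx = numel(decrease);      % threshold never reached: keep the last k
end
k_opt = idx + 1;
disp(k_opt)
```

Here the drops are 600, 200, 80, 20, 5 and 3, the threshold is 60, and the first drop below it is the fourth one, so k_opt is 5. The factor of ten is a tunable choice, not a fixed rule.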

In practice, the best way to choose the number of clusters depends on the application, but I think this suggestion will give you a good value of k.
