Choosing bandwidth & linspace for kernel density estimation (why doesn't my bandwidth work?)

久未见 submitted on 2020-02-25 05:03:13

Question


I have followed this link to apply kernel density estimation. My aim is to split each array in a group of arrays into two or more clusters. The code below works for every array in the group except this one:

X = np.array([[77788], [77793],[77798], [77803], [92886], [92891], [92896], [92901]])

So I expect to see two different clusters, such as:

first_group = ([[77788], [77793],[77798], [77803]])

second_group = ([[92886], [92891], [92896], [92901]])

My list is dynamic, so I cannot hard-code the values passed to linspace: an array may range from 0 to 10, or from 100000 to 2000000. That is why I pass the array's min and max to linspace.

In the end, I could not obtain separate clusters even though I tried various bandwidths. My code is below:

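import numpy as np
from matplotlib.pyplot import plot
from scipy.signal import argrelextrema
from sklearn.neighbors import KernelDensity

a = X.reshape(-1, 1)
kde = KernelDensity(kernel='gaussian', bandwidth=8).fit(a)
s = np.linspace(a.min(), a.max())  # 50 evenly spaced points across the data range
e = kde.score_samples(s.reshape(-1, 1))  # log-density at each grid point
plot(s, e)

# Indices of strict local minima and maxima of the log-density
mi, ma = argrelextrema(e, np.less)[0], argrelextrema(e, np.greater)[0]
print("Minima:", s[mi])  # output: []
print("Maxima:", s[ma])  # output: []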

s[mi] and s[ma] are empty, which means no two distinct clusters were found for this array. The plot shows at least one minimum, so why does it not appear in the s[mi] output?

I ran the same code with the different bandwidths shown below, but there are still no minimum or maximum values for this array. Any idea what I am doing wrong?

[Plot: bandwidth = 0.008]

[Plot: bandwidth = 0.00002]


Answer 1:


Try a bandwidth of 10000, or try relying on heuristics for choosing the bandwidth.
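The answer does not say which heuristic it has in mind. One common choice is Silverman's rule of thumb, which scales the bandwidth with the spread of the data instead of using a fixed number. The sketch below assumes that rule; the helper name `silverman_bandwidth` is made up for illustration and is not part of any library.

```python
import numpy as np

def silverman_bandwidth(values):
    """Silverman's rule of thumb: 0.9 * min(std, IQR / 1.34) * n ** (-1/5).

    Hypothetical helper for illustration; assumes at least two distinct values.
    """
    x = np.asarray(values, dtype=float).ravel()
    n = x.size
    std = x.std(ddof=1)                      # sample standard deviation
    q75, q25 = np.percentile(x, [75, 25])
    iqr = q75 - q25                          # interquartile range
    # Fall back to the standard deviation when the IQR is zero
    spread = min(std, iqr / 1.34) if iqr > 0 else std
    return 0.9 * spread * n ** (-0.2)

X = np.array([[77788], [77793], [77798], [77803],
              [92886], [92891], [92896], [92901]])
bw = silverman_bandwidth(X)
print(bw)  # on the order of a few thousand for this array
```

The result can be passed as `KernelDensity(kernel='gaussian', bandwidth=bw)`. Because it is derived from the data, it adapts whether the array spans 0 to 10 or 100000 to 2000000. As a sanity check, two equal-sized groups generally show up as two separate peaks of a Gaussian KDE only when the bandwidth is below roughly half the distance between the group centers (about 7,500 for this array), so it is worth comparing any chosen value against that.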

To make your code more robust, also split clusters at consecutive minima, because the problem here is that there is no unique minimum, but a whole interval of them.
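Here is a minimal sketch of what splitting at an interval of minima could look like. It is only one way to do it: the helper `plateau_minima` is a hypothetical name, and the density is computed by hand with NumPy (an unnormalized Gaussian kernel sum with bandwidth 8) rather than with `KernelDensity`, so that the flat stretch between the groups is easy to see. A strict comparison such as `argrelextrema(e, np.less)` skips such a flat stretch, because no point in it is strictly lower than both neighbors.

```python
import numpy as np

def plateau_minima(values):
    """Return one index per local minimum, taking the middle of flat runs.

    Hypothetical helper for illustration; assumes a non-empty 1-D input.
    """
    v = np.asarray(values, dtype=float)
    changed = np.flatnonzero(v[1:] != v[:-1])   # positions where the value changes
    starts = np.concatenate(([0], changed + 1))  # first index of each flat run
    ends = np.concatenate((changed, [v.size - 1]))  # last index of each flat run
    levels = v[starts]
    mids = []
    for i in range(1, len(levels) - 1):
        # A run is a minimum if it lies below the runs on both sides
        if levels[i] < levels[i - 1] and levels[i] < levels[i + 1]:
            mids.append((starts[i] + ends[i]) // 2)
    return np.array(mids, dtype=int)

X = np.array([[77788], [77793], [77798], [77803],
              [92886], [92891], [92896], [92901]])
x = np.sort(X.ravel()).astype(float)
s = np.linspace(x.min(), x.max(), 50)
h = 8.0  # bandwidth from the question, far smaller than the gap between groups

# Unnormalized Gaussian kernel sum; it underflows to exactly 0 across the gap
density = np.exp(-0.5 * ((s[:, None] - x[None, :]) / h) ** 2).sum(axis=1)

cuts = s[plateau_minima(density)]                  # one cut per flat minimum
groups = np.split(x, np.searchsorted(x, cuts))     # split sorted data at the cuts
print(groups)
```

For this array the density is zero on most of the grid between the two groups, the helper returns a single cut in the middle of that stretch, and the data splits into the two expected groups.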



Source: https://stackoverflow.com/questions/60355497/choosing-bandwidthlinspace-for-kernel-density-estimation-why-my-bandwidth-doe
