Question
How do I find suitable values for the dbscan() parameters (eps, MinPts)?
dbscan() from the fpc package implements the DBSCAN (density-based clustering) method.
Answer 1:
You can find strategies for choosing minPts and epsilon discussed in the original DBSCAN paper:
Ester, M., Kriegel, H. P., Sander, J., & Xu, X. (1996, August). A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD (Vol. 96, No. 34, pp. 226-231).
Also read up on some newer developments:
Schubert, E., Sander, J., Ester, M., Kriegel, H. P., & Xu, X. (2017). DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN. ACM Transactions on Database Systems (TODS), 42(3), 19.
This newer article also discusses how to set, and how not to set, the parameters. It provides some interesting insight into what can go wrong.
I didn't find an open access version of this article, but you can use Sci-Hub (Wikipedia).
And, of course, if choosing epsilon is difficult, you may want to use OPTICS or HDBSCAN* instead.
Answer 2:
This is discussed in ?dbscan in package dbscan:
"Setting parameters for DBSCAN: minPts is often set to be dimensionality of the data plus one or higher. The knee in kNNdistplot can be used to find suitable values for eps."
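To make the heuristic concrete, here is a minimal, language-agnostic sketch in plain Python (not the R API). It computes the sorted k-nearest-neighbor distances, which is the curve a k-NN distance plot shows, and then guesses the knee. The helper names k_distances and knee_index, the toy data, and the "farthest below the chord" knee rule are all assumptions made for illustration; kNNdistplot itself only draws the curve and leaves reading the knee to you. Using k = min_pts - 1 assumes that minPts counts the point itself.

```python
import math


def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbor, sorted ascending."""
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(d[k - 1])
    return sorted(result)


def knee_index(values):
    """Illustrative knee rule: index of the value farthest below the straight
    line (chord) joining the first and last values of the sorted curve."""
    n = len(values)
    if n < 3:
        return n - 1
    first, last = values[0], values[-1]
    best_i, best_gap = 0, 0.0
    for i, v in enumerate(values):
        chord = first + (last - first) * i / (n - 1)
        gap = chord - v
        if gap > best_gap:
            best_i, best_gap = i, gap
    return best_i


# Toy data (hypothetical): two dense 5x5 grids plus three isolated noise points.
cluster_a = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(5)]
cluster_b = [(5 + 0.1 * i, 5 + 0.1 * j) for i in range(5) for j in range(5)]
noise = [(2.5, 0.0), (0.0, 3.0), (8.0, 1.0)]
points = cluster_a + cluster_b + noise

dim = len(points[0])
min_pts = dim + 1               # "dimensionality of the data plus one or higher"
kd = k_distances(points, k=min_pts - 1)
eps_guess = kd[knee_index(kd)]  # k-distance at the knee, a starting point for eps

print("minPts:", min_pts)
print("eps guess:", eps_guess)
```

On this toy data the curve is flat across the grid points and jumps at the three noise points, so the guess lands at the grid spacing. On real data the knee is usually less sharp, so treat the number as a starting point and check it against the plot.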
Source: https://stackoverflow.com/questions/47110357/how-to-find-the-optimal-point-for-dbscan-parameters-in-r