What is the difference between these two algorithms?
In one use case (5D nearest-neighbor lookups in a tree of approximately 100K points), cKDTree is around 12x faster than KDTree.
Currently, both have almost the same API, and cKDTree is faster than KDTree.
So, in the near future, SciPy developers are planning to remove KDTree, and cKDTree will be renamed to KDTree in a backwards-compatible way.
Ref: Detailed SciPy Roadmap — SciPy v1.6.0.dev Reference Guide http://scipy.github.io/devdocs/roadmap-detailed.html#spatial
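As a minimal sketch of the "almost the same API" point, the snippet below builds both trees on the same data and runs the same nearest-neighbor query. The random data, sizes, and variable names are assumptions made for illustration; it also assumes a SciPy version in which both classes are still importable from scipy.spatial.

```python
import numpy as np
from scipy.spatial import KDTree, cKDTree

# Illustrative data only: 1000 random points in 5 dimensions.
rng = np.random.default_rng(0)
points = rng.random((1000, 5))
queries = rng.random((10, 5))

# Both classes are constructed and queried the same way.
dist_py, idx_py = KDTree(points).query(queries, k=1)
dist_c, idx_c = cKDTree(points).query(queries, k=1)

# The nearest neighbors found should agree between the two classes.
print(np.array_equal(idx_py, idx_c))
```

To compare speed on your own data, wrapping the two query calls in timeit is probably the simplest check; the ratio you see will depend on the SciPy version, dimensionality, and number of points.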
cKDTree implements a subset of KDTree's functionality in C++ wrapped in Cython, and is therefore faster.
Each of them is a binary trie, each of whose nodes represents an axis-aligned hyperrectangle. Each node specifies an axis and splits the set of points based on whether their coordinate along that axis is greater than or less than a particular value.
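To make that description concrete, here is a small pure-Python sketch of such a tree. It is not SciPy's implementation: the Node class, the build and nearest functions, the median split rule, and the leafsize default are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]

@dataclass
class Node:
    axis: Optional[int] = None            # splitting axis (None on a leaf)
    split: Optional[float] = None         # splitting value along that axis
    less: Optional["Node"] = None         # points with coordinate < split
    greater: Optional["Node"] = None      # points with coordinate >= split
    points: Optional[List[Point]] = None  # stored points (leaves only)

def build(points: List[Point], depth: int = 0, leafsize: int = 2) -> Node:
    """Recursively split the points, cycling through the axes."""
    if len(points) <= leafsize:
        return Node(points=list(points))
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    split = pts[len(pts) // 2][axis]
    less = [p for p in pts if p[axis] < split]
    greater = [p for p in pts if p[axis] >= split]
    if not less:  # no point lies strictly below the split; keep a leaf
        return Node(points=pts)
    return Node(axis, split,
                build(less, depth + 1, leafsize),
                build(greater, depth + 1, leafsize))

def nearest(node: Node, target: Point, best=None):
    """Return (distance, point) of the closest stored point to target."""
    if node.points is not None:
        for p in node.points:
            d = math.dist(p, target)
            if best is None or d < best[0]:
                best = (d, p)
        return best
    on_less_side = target[node.axis] < node.split
    near, far = (node.less, node.greater) if on_less_side else (node.greater, node.less)
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if best is None or abs(target[node.axis] - node.split) < best[0]:
        best = nearest(far, target, best)
    return best

points = [(2.0, 3.0), (5.0, 4.0), (9.0, 6.0), (4.0, 7.0), (8.0, 1.0), (7.0, 2.0)]
tree = build(points)
result = nearest(tree, (9.0, 2.0))
print(result[1])  # (8.0, 1.0)
```

The pruning step in nearest is what makes the structure useful: whole hyperrectangles are skipped when the splitting plane is already farther away than the best candidate found so far.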
However, KDTree also supports all-neighbors queries, both with arrays of points and with other kd-trees. These do use a reasonably efficient algorithm, but the kd-tree is not necessarily the best data structure for this sort of calculation.
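An all-neighbors query against another kd-tree can be sketched with KDTree.query_ball_tree, as below. The random data, the radius r, and the variable names are assumptions chosen for illustration.

```python
import numpy as np
from scipy.spatial import KDTree

# Illustrative data only: two independent random point sets in 3D.
rng = np.random.default_rng(1)
a = rng.random((200, 3))
b = rng.random((150, 3))

tree_a = KDTree(a)
tree_b = KDTree(b)

# For each point in a, collect the indices of all points in b within radius r.
r = 0.1
neighbors = tree_a.query_ball_tree(tree_b, r=r)

# neighbors is a list with one list of indices (into b) per point of a.
print(len(neighbors))
```

For the array-of-points variant, tree_b.query_ball_point(a, r) takes the query points directly instead of a second tree.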