Is kd-tree always balanced?

扶醉桌前 提交于 2019-12-03 12:35:05

This is a fairly broad topic and the questions themselves are kind of general. Hopefully this will give you some useful insights and material to work with:

  • Kd tree is not always balanced.
  • AVL and Red-Black will not work with K-D Trees, you will have either construct some balanced variant such as K-D-B-tree or use other balancing techniques.
  • K-d Tree are commonly used to store GeoSpatial data because they let you search over more then one key, contrary to 'traditional' tree which lets you do single dimensional search. GeoSpatial data certainly cannot be represented in single dimension.

Note that there are also specialized databases working with GeoSpatial data so it might be worth checking if the overhead could be shifted to them instead of making your own solution: Although i don't have much experience with this, maybe it is worth checking the postgis.

postgis

Here are some useful links showing how to build balanced K-D tree variant and usage of K-D trees with Spatial data:

balancing K-D-Tree

K-D-B-tree

spatial data k-d-trees

It depends on how you build the tree.

If built as originally published, the tree will be balanced, i.e. only at the leaf level it will have at most a height difference of 1. If your data set has 2^n-1 elements, the tree will be perfectly balanced.

When constructed with the median, then half of the objects must be on either branch of the tree, thus it has minimal height and is balanced.

However, this tree cannot be changed then. I am not aware of an insert or remove algorithm that would preserve this property, but YMMV. I bet there are two dozens of kd-tree extensions that aim at rebalancing and making insertions/deletions more effective.

The k-d-tree is not designed for changes, and will quickly lose efficiency. It relies on the median, and thus any change to the tree would worst-case propagate through all of the tree. Therefore, you need to allow some tolerance in the tree quality to support changes. It appears to be a common approach to just keep track of insertions/deletions and rebuild the tree eventually. You cannot combine it with red-black-trees or AVL-trees, because data with more than 1 dimension is not ordered; these trees only work for ordered data. Upon rotation of the tree the splitting axis changes; and there may be elements in either half that suddenly would need to move to the other branch. This does not happen in AVL or red-black trees.

But as you can imagine, people have published several indexes that remain balanced. Such as k-d-b-trees, and R-trees. These also work better for large data that needs to be stored on disk.

In order to make your kd-tree balanced use median value.
(14,31), (15,32), (17,42), (16,44), (18,52), (16,62)
In the root choose median of x-cordinates [14,15,16,16,17,18] which is 16,
So all the elements less than 16 goes to left part of the tree and
greater than or equal to goes to right side of tree.
as of now,
left part tree consists of [14,31],[15,32] ,now for y-axis find the median for [31,32] so that the tree is balanced

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!