Why is lookup in a Binary Search Tree O(log(n))?

失恋的感觉 2021-01-31 18:20

I can see how, when looking up a value in a BST, we discard half of the tree every time we compare a node with the value we are looking for.

However I fail to

5 answers
  •  深忆病人
    2021-01-31 19:00

    This can be shown mathematically very easily.

    Before I present that, let me clarify something. The complexity of lookup or find in a balanced binary search tree is O(log(n)). For a binary search tree in general, it is O(n). I'll show both below.

    In a balanced binary search tree, in the worst case, the value I am looking for is in a leaf of the tree. I'll traverse from the root to that leaf, visiting one node per layer, because the ordered structure of a BST tells me at every node which subtree to descend into. Therefore, the number of comparisons I need is at most the number of layers of the tree. Hence the problem boils down to finding a closed-form expression for the number of layers of a tree with n nodes.
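The root-to-leaf walk described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Node` class and the `find` function are names made up here, not something from the original answer. `find` returns how many nodes it visited, so the "one node per layer" claim can be checked directly.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """Hypothetical minimal BST node, introduced only for this sketch."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def find(root: Optional[Node], target: int) -> Tuple[bool, int]:
    """Return (found, number of nodes visited) for a standard BST lookup."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if target == node.key:
            return True, visited
        # The ordering invariant lets us discard one whole subtree per step.
        node = node.left if target < node.key else node.right
    return False, visited


# A completely full 3-layer tree holding the keys 1..7.
root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))

print(find(root, 7))  # key in a leaf: 3 nodes visited, one per layer
print(find(root, 8))  # missing key: the walk still stops after 3 nodes
```

In this 7-node tree no lookup ever touches more than 3 nodes, which is the number of layers.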

    This is where we'll do a simple induction on completely full trees. A tree with only 1 layer has only 1 node. A tree with 2 layers has 1+2 nodes, one with 3 layers has 1+2+4 nodes, and so on. The pattern is clear: a full tree with k layers has exactly

    n=2^0+2^1+...+2^{k-1}

    nodes. This is a geometric series, which implies

    n=2^k-1,

    equivalently:

    k = log_2(n+1)

    Big-O notation describes growth for large values of n, so the +1 and the base of the logarithm, which only contribute constant terms and factors, are irrelevant. Hence the O(log(n)) complexity.
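As a quick numerical sanity check of the derivation above, the short script below sums the geometric series for a few values of k and compares the result with the closed forms. The function name `nodes_in_full_tree` is invented for this sketch.

```python
import math


def nodes_in_full_tree(k: int) -> int:
    """Sum 2^0 + 2^1 + ... + 2^(k-1): nodes in a completely full tree with k layers."""
    return sum(2 ** i for i in range(k))


for k in range(1, 21):
    n = nodes_in_full_tree(k)
    assert n == 2 ** k - 1          # closed form of the geometric series
    assert math.log2(n + 1) == k    # solving for k gives k = log_2(n + 1)

print(nodes_in_full_tree(20))  # 1048575 nodes fit into only 20 layers
```

So a full tree with about a million nodes is only 20 layers deep, which is why a lookup needs so few comparisons.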

    I'll give another, much shorter, way to show the same result. Each comparison splits the remaining tree into two halves and keeps only one of them, and we have to do this k times, where k is the number of layers, so the following is true:

    (n+1)/2^k = 1,

    which implies the exact same result, k = log_2(n+1). You should convince yourself where the +1 in n+1 comes from, but it is okay to ignore it, since we are talking about large values of n.
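The halving argument can also be checked by literally counting the halvings. This is a minimal sketch and `halvings` is a hypothetical helper name. It uses integer division, so for an n that is not of the form 2^k - 1 it returns the rounded-down value of log_2(n+1).

```python
def halvings(n: int) -> int:
    """Count how many times n + 1 can be halved before it reaches 1."""
    remaining = n + 1
    steps = 0
    while remaining > 1:
        remaining //= 2  # keep one half, discard the other
        steps += 1
    return steps


print(halvings(7))     # 8 -> 4 -> 2 -> 1, i.e. 3 halvings
print(halvings(1023))  # 10 halvings
```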

    Now let's discuss the general binary search tree. In the worst case, it is perfectly unbalanced, meaning every node except the single leaf has only one child (and the tree becomes a linked list). See e.g. https://www.cs.auckland.ac.nz/~jmor159/PLDS210/niemann/s_fig33.gif

    In this case, to find the value in the leaf, I need to visit all n nodes, hence O(n).
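One common way to end up with such a degenerate tree is to insert already-sorted keys into a BST that never rebalances. The sketch below illustrates this under that assumption. `Node`, `insert`, `build` and `nodes_visited` are hypothetical names chosen for the example, not part of the original answer.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Node:
    """Hypothetical minimal BST node, introduced only for this sketch."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Plain BST insert with no rebalancing; returns the (unchanged) root."""
    new = Node(key)
    if root is None:
        return new
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = new
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = new
                return root
            node = node.right


def build(keys: Iterable[int]) -> Optional[Node]:
    """Insert the keys one by one, in the given order."""
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def nodes_visited(root: Optional[Node], target: int) -> int:
    """Number of nodes a lookup touches before it finds target or falls off the tree."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if target == node.key:
            break
        node = node.left if target < node.key else node.right
    return visited


chain = build(range(1, 101))            # sorted input: every node has only a right child
bushy = build([4, 2, 6, 1, 3, 5, 7])    # this insertion order yields a full 3-layer tree

print(nodes_visited(chain, 100))  # 100: the lookup walks the whole "linked list"
print(nodes_visited(bushy, 7))    # 3: one node per layer
```

The same keys can therefore cost n steps or about log_2(n) steps per lookup, depending only on the shape of the tree.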

    A final note is that these complexities hold not only for find, but also for the insert and delete operations.

    (I'll edit my equations with better-looking Latex math styling when I reach 10 rep points. SO won't let me right now.)
