I'm trying to find the definition of a binary search tree and I keep finding different definitions everywhere.
Some say that for any given subtree the left child k
Those three things you said are all true.
I suppose you could reverse your tree and put the smaller keys on the right, but really the "left" and "right" concept is just that: a visual concept to help us think about a data structure which doesn't really have a left or right, so it doesn't really matter.
The ordering relation <= on the elements is a total order, so it must be reflexive, but commonly a binary search tree (aka BST) is a tree without duplicates.
Otherwise, if there are duplicates, you need to run the same deletion function two or more times to remove every copy of a key!
1.) left <= root < right
2.) left < root <= right
3.) left < root < right, such that no duplicate keys exist.
I might have to go and dig out my algorithm books, but off the top of my head (3) is the canonical form.
(1) or (2) only come about when you start to allow duplicate nodes and you put the duplicates in the tree itself (rather than having each node contain a list).
Duplicate Keys
• What happens if there's more than one data item with the same key?
  – This presents a slight problem in red-black trees.
  – It's important that nodes with the same key are distributed on both sides of other nodes with the same key.
  – That is, if keys arrive in the order 50, 50, 50,
    • you want the second 50 to go to the right of the first one, and the third 50 to go to the left of the first one.
    • Otherwise, the tree becomes unbalanced.
• This could be handled by some kind of randomizing process in the insertion algorithm.
  – However, the search process then becomes more complicated if all items with the same key must be found.
• It's simpler to outlaw items with the same key.
  – In this discussion we'll assume duplicates aren't allowed
One can create a linked list for each node of the tree that contains duplicate keys and store data in the list.
All three definitions are acceptable and correct. They define different variations of a BST.
Your college data structures book failed to clarify that its definition was not the only possible one.
Certainly, allowing duplicates adds complexity. If you use the definition "left <= root < right" and you have a tree like:
  3
 / \
2   4
then adding a "3" duplicate key to this tree will result in:
  3
 / \
2   4
 \
  3
Note that the duplicates are not in contiguous levels.
This is a big issue when allowing duplicates in a BST representation like the one above: duplicates may be separated by any number of levels, so checking for a duplicate's existence is not as simple as checking the immediate children of a node.
An option to avoid this issue is to not represent duplicates structurally (as separate nodes) but instead use a counter that counts the number of occurrences of the key. The previous example would then have a tree like:
    3(1)
   /    \
2(1)    4(1)
and after insertion of the duplicate "3" key it will become:
    3(2)
   /    \
2(1)    4(1)
This simplifies lookup, removal and insertion operations, at the expense of some extra bytes and counter operations.
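To make the counter idea concrete, here is a minimal Python sketch. The names `CountNode`, `insert` and `occurrences` are made up for illustration, and it assumes a plain unbalanced BST with integer keys, so treat it as one possible shape rather than a reference implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CountNode:
    key: int
    count: int = 1                       # number of occurrences of `key`
    left: Optional["CountNode"] = None
    right: Optional["CountNode"] = None


def insert(root, key):
    """Insert `key`; on a duplicate, bump the counter instead of adding a node."""
    if root is None:
        return CountNode(key)
    if key == root.key:
        root.count += 1
    elif key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def occurrences(root, key):
    """Return how many times `key` was inserted (0 if it is absent)."""
    while root is not None:
        if key == root.key:
            return root.count
        root = root.left if key < root.key else root.right
    return 0


# Same sequence as the example above: 3, 2, 4, then the duplicate 3.
root = None
for k in (3, 2, 4, 3):
    root = insert(root, k)
```

After these inserts the tree still has only three nodes, and the lookup can stop at the first match because no other node can hold the same key.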
I just want to add some more information to what @Robert Paulson answered.
Let's assume that each node contains a key and data, so nodes with the same key might contain different data.
(So a search must find all nodes with the same key.)
1. left <= cur < right
2. left < cur <= right
3. left <= cur <= right
4. left < cur < right && cur contains sibling nodes with the same key.
5. left < cur < right, such that no duplicate keys exist.
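Forms 1 and 2 (left <= cur < right and left < cur <= right) work fine if the tree does not have any rotation-related functions to prevent skewness.
But these forms don't work with an AVL tree or a Red-Black tree, because a rotation can move an equal key to the other side and break the principle.
And even if search() finds a node with the key, it must keep traversing down to a leaf to collect the other nodes with the same key.
That makes the time complexity of every search Θ(log N), assuming the tree happens to be reasonably balanced, because the search can never stop early.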
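As a rough illustration of that "keep going down to the leaf" search, here is a small Python sketch for the left <= cur < right form. `Node`, `insert` and `find_all` are hypothetical names, and the tree is deliberately a plain BST with no rotations, since that is the only case where this form is safe.

```python
class Node:
    def __init__(self, key, data=None):
        self.key, self.data = key, data
        self.left = self.right = None


def insert(root, key, data=None):
    """Plain (unbalanced) insert under left <= cur < right: equal keys go left."""
    if root is None:
        return Node(key, data)
    if key <= root.key:
        root.left = insert(root.left, key, data)
    else:
        root.right = insert(root.right, key, data)
    return root


def find_all(root, key):
    """Collect the data of every node holding `key` along one root-to-leaf path."""
    found = []
    cur = root
    while cur is not None:
        if key == cur.key:
            found.append(cur.data)
            cur = cur.left          # further duplicates can only be on the left
        elif key < cur.key:
            cur = cur.left
        else:
            cur = cur.right
    return found


root = None
for key, data in [(3, "a"), (2, "b"), (4, "c"), (3, "d")]:
    root = insert(root, key, data)
```

The search never branches, but it also never stops at the first match, so its cost is always proportional to the length of the path it follows.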
Form 3 (left <= cur <= right) will work well with any form of BST that has rotation-related functions.
But a search that must find every duplicate will take O(n) in the worst case, ruining the purpose of using a BST.
Say we have the tree below, built with principle 3.
        12
       /  \
     10    20
    /  \   /
   9   11 12
      /  \
    10    12
If we do search(12) on this tree, then even though we found 12 at the root, we must keep searching both the left and the right child for duplicate keys.
In the worst case this takes O(n) time, as I said.
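Here is a small Python sketch of why the left <= cur <= right form forces the search to branch. `Node` and `count_key` are illustrative names, and the tree is rebuilt by hand from the example above, assuming the lower 10 and 12 are the children of 11 (the only placement that satisfies the ordering).

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def count_key(node, key):
    """Count nodes holding `key` when equal keys may sit on either side."""
    if node is None:
        return 0
    total = 1 if node.key == key else 0
    if key <= node.key:             # equal keys may be in the left subtree
        total += count_key(node.left, key)
    if key >= node.key:             # ...and also in the right subtree
        total += count_key(node.right, key)
    return total


# Hand-built copy of the example tree (shape is an assumption, see above).
tree = Node(12,
            Node(10, Node(9), Node(11, Node(10), Node(12))),
            Node(20, Node(12)))
```

Whenever `key == node.key` both recursive calls run, so a tree full of equal keys gets visited in its entirety, which is where the O(n) worst case comes from.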
Form 4 is my personal favorite. Let's say a sibling means a node with the same key.
We can change the tree above into the one below.
          12 - 12 - 12
         /  \
  10 - 10    20
 /  \
9    11
Now any search will take O(log N), because we don't have to traverse the children looking for duplicate keys.
And this principle also works well with an AVL or RB tree.
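A minimal Python sketch of the sibling idea, keeping the "siblings" as a list inside the node. The names `Node`, `insert` and `find_all` and the string payloads are invented for the example, and balancing is left out to keep it short; since the tree itself only ever sees distinct keys, rotations would not need to know about the lists.

```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.values = [value]       # "siblings": all data stored under this key
        self.left = self.right = None


def insert(root, key, value):
    """Insert (key, value); a duplicate key is appended to the existing node."""
    if root is None:
        return Node(key, value)
    if key == root.key:
        root.values.append(value)
    elif key < root.key:
        root.left = insert(root.left, key, value)
    else:
        root.right = insert(root.right, key, value)
    return root


def find_all(root, key):
    """Return every value stored under `key`; stops at the first matching node."""
    while root is not None:
        if key == root.key:
            return list(root.values)
        root = root.left if key < root.key else root.right
    return []


# Same keys as the example: three 12s, two 10s, and 20, 9, 11.
root = None
for key, value in [(12, "a"), (10, "b"), (20, "c"), (9, "d"),
                   (11, "e"), (12, "f"), (10, "g"), (12, "h")]:
    root = insert(root, key, value)
```

Finding the node costs one root-to-node descent, and returning the duplicates adds only the length of that node's list.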