This earlier question asked how many ways there were to insert the values 1 - 7 into a binary search tree that would result in the following tree:
4
We can solve this using a clever recursive algorithm.
As our base case, if you are given an empty tree, there is exactly one way to build that tree - don't insert any values.
For the recursive case, let's suppose that you have a BST that looks like this:
v
/ \
L R
Here, v is the root, and L and R are the right subtrees, respectively.
If we want to build up this binary search tree, we would have to start off by inserting v first - if we didn't, v wouldn't be the root. After we insert v, we need to insert the elements in an ordering that will cause the subtrees L and R to be rebuilt. The tricky part here is that we don't need to build up all of L before building up R or vice-versa; we could insert some elements from L, then some elements from R, then more elements from L, then more elements from R, etc.
Fortunately, though, there is a useful observation we can make. Suppose that SL is a sequence of elements that, if inserted into a BST, forms the BST L. Similarly, let SR be a sequence of elements that if inserted into a BST form the BST R. If we consider any possible interleaving of the sequences SL and SR, we will end up with a sequence of elements that, if inserted into a BST containing just v, will build up the tree
v
/ \
L R
As an example, consider this tree:
4
/ \
2 6
/ \ / \
1 3 5 7
One possible sequence that builds the left subtree (holding 1, 2, 3) is 2, 1, 3. One possible sequence that builds the right subtree is 6, 5, 7. Any possible interleaving of those sequences, when inserted into a BST containing just the root node 4, will end up building out the above BST. For example, any of these sequences will work:
2, 1, 3, 6, 5, 7
2, 6, 1, 5, 3, 7
6, 5, 2, 1, 3, 7
...
Since given any sequences SL and SR that build up L and R we can interleave them in any order, we can write out a nice formula for the number of sequences that will build out a tree with v as its root, L as its left subtree, and R as its right subtree:
# ways = (# interleaves of SL and SR) × (# possible SLs) × (# possible SRs)
If you think about it, the last two terms in this product can be found by recursively finding the number of insertion sequences that work for the left and right subtrees. This means that if we can figure out how many possible interleavings there are for the two sequences, then we can determine how many possible insertion sequences will build up a given tree by evaluating the above expression recursively!
So how many interleavings are there? If we're given two sequences of length m and n with no duplicates in them, then we can come up with the number of interleavings of those sequences with the following observation. Consider the sequence
L L L ... L R R R ... R
m times n times
Any permutation of this sequence will give back a series of Ls and Rs where the number of Ls is equal to the number of elements in the sequence of length m and the number of Rs is equal to the number of elements in the sequence of length n. We can interpret this sequence as a way of describing how to build up the interleaving - any time we see L, we insert an element from the left sequence, and any time we see an R, we insert an element from the right sequence. For example, consider the sequences 4, 1, 0, 3 and 2, 7. Then the permutation LLRLRL gives the sequence
4, 1, 2, 0, 3, 7
L L R L R L
If we start off with a sequence of m L's and n R's, the number of distinct permutations comes back as
(m + n)!
-------- = (m + n) choose m
m! n!
This holds because there are (m + n)! possible ways of reordering the Ls and the Rs if they were all distinct. Since they aren't, every ordering is counted m! n! too many times because we can permute the L's indistinguishably and the R's indistinguishably. Another way to think about this is to consider the set {1, 2, 3, ..., m + n} of indices in the interleaving, then to choose m of them to fill with elements from the first sequence, implicitly filling the rest of them with elements from the right sequence.
Okay... we now have a way of determining the number of ways of interleaving two sequences of length m and n. Therefore, we have the following:
# ways = (# interleaves of SL and SR) × (# possible SLs) × (# possible SRs)
= ((m + n) choose n) × (# possible SLs) × (# possible SRs)
Where m is the number of elements in the left subtree and n is the number of elements in the right subtree. Yay!
We can therefore write out pseudocode for this algorithm:
function countInsertionOrderings(Tree t) {
if t is null, return 1;
otherwise:
let m = numElementsIn(t.left);
let n = numElementsIn(t.right);
return choose(m + n, n) *
countInsertionOrderings(t.left) *
countInsertionOrderings(t.right);
}
This algorithm performs a total of O(n) multiplications, where n is the number of nodes in the tree, and visits every node exactly once (assuming the number of elements in the BST are cached at each node). This does not mean the algorithm runs in time O(n), though, because the work required to multiply these numbers together will grow rapidly as the numbers get larger and larger.
Hope this helps!
This is an interesting question. I implemented this idea in python and this recursion plus memorization has pretty good performance. The "seq" is the input list of unique integers
def answer(seq):
from math import factorial
BStore = dict() # store binomsial coefficient
Store = dict() # store the recusion step
def TreeGenerator(a,L): # for a given number a and a list L, this functions returns the left tree (a list) and the right tree (a list)
LT = []
RT = []
for i in L:
if i<a:
LT.append(i)
else:
RT.append(i)
solution = [LT,RT]
return(solution)
def binomial_coefficient(n, k):
d = n - k
if d < 0:
return(0)
return(factorial(n) / factorial(k) / factorial(d))
def binomial(n, k):
if (n, k) in BStore:
return(BStore[n, k])
else:
solution = binomial_coefficient(n, k)
BStore[n, k] = solution
return(solution)
def R(S): # recursion starts here
if tuple(S) in Store:
return(Store[tuple(S)])
else:
if len(S)<2:
results = 1
else:
a = S[0]
S = S[1:]
Tree = TreeGenerator(a,S)
R1 = R(Tree[0])
R2 = R(Tree[1])
results = binomial(len(R1)+len(R2), len(R2))*R1*R2
Store[tuple(S)] = results
return(results)
return(R(seq))