Given a binary tree, I want to find out the largest subtree which is a BST in it.
Naive approach:
I have a naive approach in mind where I visit every node of
One possible solution would be as following -
int maxNodes = INT.MIN;
Node* lb = NULL;
int largesBST(Node* root) {
largesBST(root, INT.MIN, INT.MAX);
}
int largesBST(Node* p, int MIN, int MAX) {
if(!p) { return 0; }
if(p->data > MIN || p->data < MAX) { return -1; }
int lc = largestBST(p->left, p->data, MAX);
if(lc == -1) { return -1; }
int rc = largestBST(p->right, MIN, p->data);
if(rc == -1) { return -1; }
// At this point, cur node is BST
int curNodes = lc + rc + 1;
if(curNodes > maxNodes) {
maxNodes = curNodes;
lb = p;
}
}
I have posted the full solution and explanation in my blog:
http://www.leetcode.com/2010/11/largest-binary-search-tree-bst-in.html
The idea is to do a depth-first traversal and pass min and max values bottom-up instead of top-down. In other words, we verify the deeper nodes before we verify if the above nodes satisfy the BST requirements.
The main reason of doing this is when one of the nodes does not satisfy the BST properties, all subtrees above (which includes this node as well) must also not satisfy the BST requirements.
As compared to the top-down approach, the bottom-up approach is such an awesome choice because the results for total number of nodes could be passed up the tree. This saves us from recalculating over and over again. The total number of nodes for a subtree is simply the total number of nodes of its left and right subtrees plus one.
This algorithm runs in O(N) time complexity and O(1) space, which is as efficient as it can be. (where N is the total number of nodes in the binary tree).
Below is the C++ code that works. The tricky part of the implementation is taking care of how min and max values are passed bottom-up. Please note that I did not initialize min and max values as they are initialized in the bottom of the tree.
// Find the largest BST subtree in a binary tree.
// If the subtree is a BST, return total number of nodes.
// If the subtree is not a BST, -1 is returned.
int findLargestBSTSubtree(BinaryTree *p, int &min, int &max,
int &maxNodes, BinaryTree *& largestBST) {
if (!p) return 0;
bool isBST = true;
int leftNodes = findLargestBSTSubtree(p->left, min, max, maxNodes, largestBST);
int currMin = (leftNodes == 0) ? p->data : min;
if (leftNodes == -1 ||
(leftNodes != 0 && p->data <= max))
isBST = false;
int rightNodes = findLargestBSTSubtree(p->right, min, max, maxNodes, largestBST);
int currMax = (rightNodes == 0) ? p->data : max;
if (rightNodes == -1 ||
(rightNodes != 0 && p->data >= min))
isBST = false;
if (isBST) {
min = currMin;
max = currMax;
int totalNodes = leftNodes + rightNodes + 1;
if (totalNodes > maxNodes) {
maxNodes = totalNodes;
largestBST = p;
}
return totalNodes;
} else {
return -1; // This subtree is not a BST
}
}
BinaryTree* findLargestBSTSubtree(BinaryTree *root) {
BinaryTree *largestBST = NULL;
int min, max;
int maxNodes = INT_MIN; // INT_MIN is defined in <climits>
findLargestBSTSubtree(root, min, max, maxNodes, largestBST);
return largestBST;
}
I guess the problem you're trying to solve is finding the largest (with more nodes) BST in BT. In that case you'll need to traverse all the tree nodes checking if it is a BST, and once you find one you'll have to check if it has more nodes than the largest one found till the moment.
class TreeNode
{
public int value;
public TreeNode left;
public TreeNode right;
}
void LargestBST(TreeNode bt, IDictionary<TreeNode, bool> isBST, IDictionary<TreeNode, int> nodeCount, ref TreeNode largestBST)
{
if (bt == null)
return;
if (IsBST(bt, isBST) && (largestBST == null || NodeCount(bt, nodeCount) > NodeCount(largestBST, nodeCount))
largestBST = bt;
else
{
LargestBST(bt.left, isBST, nodeCount, ref largestBST);
LargestBST(bt.Right, isBST, nodeCount, ref largestBST);
}
}
bool IsBST(TreeNode node, IDictionary<TreeNode, bool> isBST)
{
if (node == null)
return true;
bool result;
if (!isBST.TryGetValue(node, out result))
{
TreeNode maxLeft = Max(node.Left);
TreeNode minRight = Min(node.Right);
result = (maxLeft == null || maxLeft.value <= node.value) &&
(minRight == null || minRight.value >= node.value) &&
IsBST(node.left, isBST) && IsBST(node.Right, isBST);
isBST.Add(node, result);
}
return result;
}
TreeNode Max(TreeNode node)
{
if (node == null)
return null;
while (node.right != null)
node = node.right;
return node;
}
TreeNode Min(TreeNode node)
{
if (node == null)
return null;
while (node.left != null)
node = node.left;
return node;
}
int NodeCount(TreeNode node, IDictionary<TreeNode, int> nodeCount)
{
if (node == null)
return 0;
int result;
if (!nodeCount.TryGetValue(node, out result))
{
result = 1 + NodeCount(node.left, nodeCount) + NodeCount(node.right, nodeCount);
nodeCount.Add(node, result);
}
return result;
}
I would think you could avoid checking if every node is the root of a BST by working top down instead of bottom up. If a subtree is a BST it's going to be larger than any subtree in itself so you don't need to check inside a subtree if it has passed the isBST test. Then you just have isBST return the size of a valid tree and store that along with a pointer to the root of the subtree if you need to be able to find it again instead of just knowing how large the largest one is.
UPDATE:
Some of the code posted here to check if something is a BST are going to fail some cases since they're only checking the parent of a node.
Take for example:
10 / \ 4 99 / 2
This is not a valid BST, (the 2 is out of position with regards to the 10) but if you don't send a min and max value down through the tree you will incorrectly verify it as valid. This pseudocode takes that into account.
main
{
Verify(root, MIN_VALUE, MAX_VALUE)
}
boolean Verify(node , min, max)
{
if(node == null)
return true;
if(node.value > min &&
node.value < max &&
Verify(node.leftchild, min, node.value) &&
Verify(node.rightchild,node.value,max)
{
return true;
}
else
{
return false;
}
}
This isn't the most optimal approach but you could do an inorder traversal of the Binary tree and store it in an array and then find the longest contiguous increasing sequence, that will give you the BST with the maximum number of nodes. The complexity would be O(n)
for the traversal and O(n)
for the search so it's still going to be O(n)
.
To solve the above problem: