I searched the forums but I couldn\'t figure it out if the other similar questions are necessarily related to this one.
What I\'m trying to do is match a subtree to a t
It appears that what I was looking for was an algorithm to solve the "tree inclusion problem". I have found some useful documents:
Fast Algorithms for Finding Nearest Common Ancestors
The Tree Inclusion Problem: In Optimal Space and Faster
Tree Isomorphism And Related Problems
I translated one of the algorithms from the last paper into C# (returns the number of pairs in a largest matching between first-level subtrees of a and b).
public static int Match(Node a, Node b, NodeSimilarityComparer comp) {
if (!comp.Equals(a, b))
return 0;
int m = a.SubtreeCount;
int n = b.SubtreeCount;
var matrix = new int[m + 1, n + 1];
for (int i = 0; i <= m; ++i)
matrix[i, 0] = 0;
for (int j = 0; j <= n; ++j)
matrix[0, j] = 0;
for (int i = 1; i <= m; ++i)
for (int j = 1; j <= n; ++j) {
var ai = a.GetSubtree(i - 1);
var bj = b.GetSubtree(j - 1);
var match = Match(ai, bj, comp);
matrix[i, j] = Math.Max(Math.Max(matrix[i, j - 1], matrix[i - 1, j]), matrix[i - 1, j - 1] + match);
}
return matrix[m, n] + 1;
}
Hope this helps others too.
Here are a few references that may help you:
Aho's Efficient Tree Pattern Matching: an Aid to Code Generation
Stanford's Tree pattern matching programs
From IMECS 2011: Tree Pattern Matching Algorithm Using a Succinct Data Structure