Removing duplicate subtrees from binary tree

前端未结

关注

 3  662

I have to design an algorithm under the additional homework. This algorithm have to compress binary tree by transforming it into DAG by removing repetitive subtrees and redirect

相关标签:

3条回答

温柔的废话

2021-02-15 08:22

I would go with a hashing approach.

A hash for a leaf is its value mod P_1. Hash for a node is (value+hash(left_son)*P_2+hash(right_son)*P_2^2) mod P_1, where P_1, P_2 are primes. If you count those hashes for at least 5 different big prime pairs(by big i mean something near 10^8-10^9, so you can do your math without overflowing), you can safely assume that nodes with same hashes are the same.

Then you can walk the tree, checking sons, first and do your transform. This will work in O(n) time.

NOTE that you can use other hash functions, like (value + hash(left_son)*P_2 + hash(right_son)*P_3) mod P_1, etc.

0 讨论(0)
发布评论:

提交评论
- 加载中...

爱一瞬间的悲伤

2021-02-15 08:32

This happens when constructing oBDDs. The Idea is: put the tree into a canonical form, and construct a hashtable with an entry for every node. Hash function is a function of the node + the hash functions for the left/right child nodes. Complexity is O(N), but only if one can rely on the hashvalues being unique. The final compare (e.g. for Resolving collisions) will still cost o(N*N) for the recursive subtree <--> subtree compare. More on BDDs or the original Bryant paper

The hashfunction I currently use:

#define SHUFFLE(x,n) (((x) << (n))|((x) >>(32-(n))))
/* a node's hashvalue is based on its value
 * and (recursively) on it's children's hashvalues.
 */
#define NODE_HASH2(l,r) ((SHUFFLE((l),5)^SHUFFLE((r),9)))
#define NODE_HASH3(v,l,r) ((0x54321u*(v) ^ NODE_HASH2((l),(r))))

Typical usage:

void node_sethash(NodeNum num)
{
if (NODE_IS_NULL(num)) return;

if (NODE_IS_TERMINAL(num)) switch (nodes[num].var) {
        case 0: nodes[num].hash.hash= HASH_FALSE; break;
        case 1: nodes[num].hash.hash= HASH_TRUE; break;
        case 2: nodes[num].hash.hash= HASH_FALSE^HASH_TRUE; break;
        }
else if (NODE_IS_NAMED(num)) {
        NodeNum f,t;
        f = nodes[num].negative;
        t = nodes[num].positive;
        nodes[num].hash.hash = NODE_HASH3 (nodes[num].var, nodes[f].hash.hash, nodes[t].hash.hash);
        }
return ;
}

Searching the hash table:

NodeNum *hash_hnd(NodeNum num, int want_exact)
{
unsigned slot;
NodeNum *ptr, this;
if (NODE_IS_NULL(num)) return NULL;

slot = nodes[num].hash.hash % COUNTOF(hash_nodes);

for (ptr = &hash_nodes[slot]; !NODE_IS_NULL(this= *ptr); ptr = &nodes[this].hash.link) {
        if (this == num) break;
        if (want_exact) continue;
        if (nodes[this].hash.hash != nodes[num].hash.hash) continue;
        if (nodes[this].var != nodes[num].var) continue;
        if (node_compare( nodes[this].negative , nodes[num].negative)) continue;
        if (node_compare( nodes[this].positive , nodes[num].positive)) continue;
                /* duplicate node := same var+same children */
        break;
        }
return ptr;
}

The recursive compare function:

int node_compare(NodeNum one, NodeNum two)
{
int rc;

if (one == two) return 0;

if (NODE_IS_NULL(one) && NODE_IS_NULL(two)) return 0;
if (NODE_IS_NULL(one) && !NODE_IS_NULL(two)) return -1;
if (!NODE_IS_NULL(one) && NODE_IS_NULL(two)) return 1;

if (NODE_IS_TERMINAL(one) && !NODE_IS_TERMINAL(two)) return -1;
if (!NODE_IS_TERMINAL(one) && NODE_IS_TERMINAL(two)) return 1;

if (VAR_RANK(nodes[one].var)  < VAR_RANK(nodes[two].var) ) return -1;
if (VAR_RANK(nodes[one].var)  > VAR_RANK(nodes[two].var) ) return 1;


rc = node_compare(nodes[one].negative,nodes[two].negative);
if (rc) return rc;
rc = node_compare(nodes[one].positive,nodes[two].positive);
if (rc) return rc;

return 0;
}

0 讨论(0)

终归单人心

2021-02-15 08:35
This is a problem commonly solved to do common sub-expression elimination in programming languages.

The approach is as follows (and is easily generalized to more than 2 children in a node):

Algorithm (Assumes mutable tree structure; You can easily build a new tree along the way):
```
MakeDAG(tree):

    HASH = a new hash-table-based dictionary

    foreach subtree NODE in the tree // traverse this however you like

        if NODE is in HASH
            replace NODE with HASH[NODE]
        else
            HASH[NODE] = N // insert the current node, N, in the dictionary
```
To compute the hash code for a node, you need to recursively compute the hash nodes until you reach the leaves of the tree.

Simply calculating these hash codes naively will bump up your runtime to O(n^2).

It is crucial that you store the results on your way down the tree to avoid repeated recursive calls and to improve the runtime to O(n).
0 讨论(0)
发布评论:

提交评论
- 加载中...