Find the longest word given a collection

既然无缘 2021-01-30 14:03

It is a Google interview question, and most answers I find online use a HashMap or a similar data structure. I am trying to find a solution using a Trie if possible. Anybody could gi

9 answers
  • 2021-01-30 14:55

    The first thing to note is that you can completely ignore the letter order.

    Have a trie (well, sort of a trie) as follows:

    • From the root, have 26 children (maximum), one for each letter.
    • From each non-root node, have one child for each letter greater than or equal to the node's letter.
    • Have each node store all words that can be made using (exactly) the letters in the path from the root.

    Build the trie like this:

    For each word, sort the letters of this word and insert the sorted letters into the trie (by creating a path of these letters from the root), creating all required nodes as you go. Then store the word at the final node.

    How to do a look-up:

    For a given set of letters, sort them, look up all subsets of the letters (most of which hopefully won't exist) and output the words at each node encountered.

    Complexity:

    O(k!), where k is the number of supplied letters. Eek! (Since only subsets of the sorted letters are followed, at most 2^k paths are actually explored.) But luckily the fewer words there are in the trie, the fewer paths will exist and the less time this will take. And k is the number of supplied letters (which should be relatively small), not the number of words in the trie.

    Actually it may be more along the lines of O(min(k!,n)), which looks a lot better. Note that if you're given enough letters, you'll have to look up all words, thus you have to do O(n) work in the worst case, so, in terms of the worst case complexity, you can't do much better.

    Example:

    Input:

    aba
    b
    ad
    da
    la
    ma
    

    Sorted:

    aab
    b
    ad
    ad
    al
    am
    

    Trie: (just showing non-null children)

         root
         /  \
        a    b
     /-/ \-\
    a d   l m
    |
    b
    

    Lookup of adb:

    • Sort the letters: abd
    • From the root...
    • Go to child a
      • No b child, skip
      • Go to child d
        • Output words at node - ad and da
        • No children, return
      • All letters processed, return
    • Go to child b
      • Output words at node - b
      • Not looking for an a child, as only children >= b exist
      • No d child, return
    • No d child, stop
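
    The build and lookup steps above can be sketched in C++ roughly as follows. This is a minimal sketch, not the answerer's code: the names Node, insert, lookup and longest_word are made up for illustration, children are kept in a std::map rather than a 26-slot array, and repeated input letters are skipped at each level so that no path is walked twice.

```cpp
#include <algorithm>
#include <map>
#include <memory>
#include <string>
#include <vector>

// One node per sorted-letter prefix. `words` holds every word whose sorted
// letters spell exactly the path from the root to this node.
// (Node and its members are illustrative names, not from the answer.)
struct Node {
    std::map<char, std::unique_ptr<Node>> children;
    std::vector<std::string> words;
};

// Insert a word under the path given by its sorted letters.
void insert(Node& root, const std::string& word) {
    std::string key = word;
    std::sort(key.begin(), key.end());
    Node* node = &root;
    for (char ch : key) {
        std::unique_ptr<Node>& child = node->children[ch];
        if (!child) child.reset(new Node());
        node = child.get();
    }
    node->words.push_back(word);
}

// Follow every path that uses a subset of letters[from..] and collect the
// words stored along the way. `letters` must already be sorted.
void collect(const Node& node, const std::string& letters, std::size_t from,
             std::vector<std::string>& out) {
    out.insert(out.end(), node.words.begin(), node.words.end());
    for (std::size_t i = from; i < letters.size(); ++i) {
        // A repeated letter would lead to the same child again, so skip it.
        if (i > from && letters[i] == letters[i - 1]) continue;
        auto it = node.children.find(letters[i]);
        if (it != node.children.end()) collect(*it->second, letters, i + 1, out);
    }
}

// All words that can be formed from some subset of the given letters.
std::vector<std::string> lookup(const Node& root, std::string letters) {
    std::sort(letters.begin(), letters.end());
    std::vector<std::string> out;
    collect(root, letters, 0, out);
    return out;
}

// Longest of the matching words, or "" if nothing matches.
std::string longest_word(const Node& root, const std::string& letters) {
    std::string best;
    for (const std::string& w : lookup(root, letters))
        if (w.size() > best.size()) best = w;
    return best;
}
```

    With the sample words aba, b, ad, da, la and ma inserted, lookup(root, "adb") collects ad, da and b, in that order.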
  • 2021-01-30 14:56

    I suspect a Trie-based implementation wouldn't be very space-efficient, but it would parallelize very nicely, because you could descend into all branches of the tree in parallel and collect the deepest nodes which you can reach from each top branch with the given set of letters. In the end, you just collect all the deepest nodes and select the longest one.

    I'd start with this algorithm (sorry, just pseudo-code), which doesn't attempt to parallelize but just uses plain old recursion (and backtracking) to find the longest match:

    // Returns the deepest node reachable from n using only the letters in c.
    TrieNode visitNode( TrieNode n, LetterCollection c )
    {
        TrieNode deepestNode = n;
        for each Letter l in c:
            TrieNode childNode = n.getChildFor( l );
    
            if childNode:
                // Consume l and keep descending from the child.
                TrieNode deepestSubNode = visitNode( childNode, c.without( l ) );
                if deepestSubNode.stringLength > deepestNode.stringLength:
                    deepestNode = deepestSubNode;
        return deepestNode;
    }
    

    I.e. this function is supposed to start at the root node of the trie, with the entire given letter collection. For each letter in the collection, you try to find a child node. If there is one, you recurse and remove the letter from the collection. At some point your letter collection will be empty (best case, all letters consumed; you could actually bail out right away without continuing to traverse the trie) or there will be no more children with any of the remaining letters. In that case you return the node itself, because that's your "longest match".

    This could parallelize quite nicely if you changed the recursion step so that you visit all children in parallel, collect the results - and select the longest result and return that.
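
    For reference, the sequential recursion could look roughly like this in C++. It is a sketch under a few assumptions that are not in the pseudo-code: the trie is a plain character trie, a hypothetical isWord flag marks nodes that end a dictionary word (so a bare prefix is never reported as a match), and the function returns the matched string instead of a node.

```cpp
#include <map>
#include <memory>
#include <string>

// Plain character trie. `isWord` (an assumed field) marks nodes that end a
// dictionary word.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    bool isWord = false;
};

// Add one word to the trie, creating nodes as needed.
void insert(TrieNode& root, const std::string& word) {
    TrieNode* node = &root;
    for (char ch : word) {
        std::unique_ptr<TrieNode>& child = node->children[ch];
        if (!child) child.reset(new TrieNode());
        node = child.get();
    }
    node->isWord = true;
}

// Longest word reachable from `node` when each letter in `letters` may be
// used at most once. `prefix` is the string spelled on the path to `node`.
std::string visitNode(const TrieNode& node, const std::string& prefix,
                      const std::string& letters) {
    std::string best;
    if (node.isWord) best = prefix;
    for (std::size_t i = 0; i < letters.size(); ++i) {
        // Skip a letter that was already tried at this level.
        if (letters.find(letters[i]) != i) continue;
        auto it = node.children.find(letters[i]);
        if (it == node.children.end()) continue;
        std::string rest = letters;
        rest.erase(i, 1);  // plays the role of c.without( l )
        std::string candidate = visitNode(*it->second, prefix + letters[i], rest);
        if (candidate.size() > best.size()) best = candidate;
    }
    return best;
}
```

    The call starts at the root with an empty prefix, e.g. visitNode(root, "", letters). Each iteration of the loop is independent of the others, which is what makes the per-child parallel variant possible.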

  • 2021-01-30 14:56

    I tried to code this problem in C++. I created my own hash key and go through all the combinations of the given characters, from the largest length down to 1.

    Here is my solution:

    #include <cstdlib>
    #include <iostream>
    #include <string>
    
    using namespace std;
    
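    // Order-independent key: the sum of the character codes, so anagrams share
    // a key. Different letter sets can collide ("afir" and "reef" are both 418).
    int hash_f(string s){
        int key=0;
        for(unsigned int i=0;i<s.size();i++){
            key += s[i];
        }
        return key;
    }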
    
    // Fixed word list stored in a table indexed by hash_f. A word whose key
    // collides with an earlier one would overwrite it.
    class collection{
    
    string str[10000];
    
    public: 
    collection(){
        str[hash_f( "abacus")] = "abacus"; 
        str[hash_f( "deltoid")] = "deltoid"; 
        str[hash_f( "gaff")] = "gaff"; 
        str[hash_f( "giraffe")] = "giraffe"; 
        str[hash_f( "microphone")] = "microphone"; 
        str[hash_f( "reef")] = "reef"; 
        str[hash_f( "qar")] = "qar"; 
    }
    
    // Returns the word stored under _key, or "" if there is none.
    string  find(int _key){
        return str[_key];
    }
    };
    
    string sub_str(string s,int* indexes,int n ){
        char c[20];
        int i=0;
        for(;i<n;i++){
            c[i] = s[indexes[i]];
        }
        c[i] = 0;
        return string(c);
    }
    
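    // Returns all n-element combinations of the first m characters of str.
    // num receives how many were generated; the caller must delete[] the result.
    string* combination_m_n(string str , int m,int n , int& num){
    
        // Assumes at most 100 combinations (the largest case here is C(8,4) = 70).
        string* result = new string[100];
        int index = 0;
    
        int * indexes = (int*)malloc(sizeof(int)*n);
    
        for(int i=0;i<n;i++){
            indexes[i] = i; 
        }
    
        while(1){
            result[index++] = sub_str(str , indexes,n);
            bool reset = true;
            for(int i=n-1;i>0;i--)
            {
                // The last index is bounded by m-1, the others by their right neighbour.
                // Testing i==n-1 first avoids reading indexes[n], which is out of bounds.
                bool can_advance = (i==n-1) ? (indexes[i]<m-1) : (indexes[i]<indexes[i+1]-1);
                if(can_advance)
                {
                    indexes[i]++;
                    for(int j=i+1;j<n;j++) 
                        indexes[j] = indexes[j-1] + 1;
                    reset = false;
                    break;
                }
            }
            if(reset){
                // Nothing to the right can move: advance the first index.
                indexes[0]++;
                if(indexes[0] + n > m) 
                    break;
                for(int i=1;i<n;i++)
                    indexes[i] = indexes[0]+i;
            }
        }
        free(indexes);
        num = index;
        return result;
    }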
    
    
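    // True if a and b consist of exactly the same letters. Used to reject
    // false hits caused by hash_f collisions.
    bool same_letters(string a, string b){
        if(a.size() != b.size()) return false;
        int count[256] = {0};
        for(unsigned int i=0;i<a.size();i++) count[(unsigned char)a[i]]++;
        for(unsigned int i=0;i<b.size();i++)
            if(--count[(unsigned char)b[i]] < 0) return false;
        return true;
    }
    
    int main(int argc, char* argv[])
    {
        string str = "aeffgirq";
        collection c;
    
        // Try subset sizes from the longest down and stop at the first match.
        bool found = false;
        for(int len=(int)str.size(); len>0 && !found; len--){
            int num;
            string* r = combination_m_n(str, str.size(), len, num);
            for(int i=0;i<num && !found;i++){
                string temp = c.find(hash_f(r[i]));
                if(temp != "" && same_letters(temp, r[i])){
                    cout << temp << endl;
                    found = true;
                }
            }
            delete[] r;
        }
        return 0;
    }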
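
    One caveat with the additive key: because hash_f only sums character codes, different letter sets can share a key. For example, "afir" and "reef" both sum to 418, so a lookup for the letters a, f, i, r would hit the slot for "reef". A sketch of a collision-free alternative, assuming the sorted letters themselves are used as the key in a std::unordered_map (a substitution for illustration, not the answer's method; sum_key, sorted_key and build_index are made-up names):

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Same additive key as hash_f above, repeated here to show the collision.
int sum_key(const std::string& s) {
    int key = 0;
    for (char ch : s) key += ch;
    return key;
}

// Hypothetical alternative key: the word's letters in sorted order.
// Two strings get the same key only if they are anagrams of each other.
std::string sorted_key(std::string s) {
    std::sort(s.begin(), s.end());
    return s;
}

// Index every word under its sorted-letter key; anagrams share a bucket.
std::unordered_map<std::string, std::vector<std::string>>
build_index(const std::vector<std::string>& words) {
    std::unordered_map<std::string, std::vector<std::string>> index;
    for (const std::string& w : words) index[sorted_key(w)].push_back(w);
    return index;
}
```

    Each generated combination would then be looked up by sorted_key instead of hash_f, and the fixed-size string table would no longer be needed.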
    