How to break down a given text into words from the dictionary?

前端 未结 4 1453
一整个雨季
一整个雨季 2021-02-02 15:02

This is an interview question. Suppose you have a string text and a dictionary (a set of strings). How do you break down text into substri

4条回答
  •  [愿得一人]
    2021-02-02 15:28

    This solution assumes the existence of Trie data structure for the dictionary. Further, for each node in Trie, assumes the following functions:

    1. node.IsWord() : Returns true if the path to that node is a word
    2. node.IsChild(char x): Returns true if there exists a child with label x
    3. node.GetChild(char x): Returns the child node with label x
    Function annotate( String str, int start, int end, int root[], TrieNode node):
    i = start
    while i<=end:
        if node.IsChild ( str[i]):
            node = node.GetChild( str[i] )
            if node.IsWord():
                root[i+1] = start
            i+=1
        else:
            break;
    
    end = len(str)-1
    root = [-1 for i in range(len(str)+1)]
    for start= 0:end:
        if start = 0 or root[start]>=0:
            annotate(str, start, end, root, trieRoot)
    
    index  0  1  2  3  4  5  6  7  8  9  10  11
    str:   t  h  i  s  i  s  a  t  e  x  t
    root: -1 -1 -1 -1  0 -1  4  6 -1  6 -1   7
    

    I will leave the part for you to list the words that make up the string by reverse traversing the root.

    Time complexity is O(nk) where n is the length of the string and k is the length of the longest word in the dictionary.

    PS: I am assuming following words in the dictionary: this, is, a, text, ate.

提交回复
热议问题