This is an interview question. Suppose you have a string text
and a dictionary
(a set of strings). How do you break down text
into substri
You can solve this problem using Dynamic Programming and Hashing.
Calculate the hash of every word in the dictionary. Use the hash function you like the most. I would use something like (a1 * B ^ (n - 1) + a2 * B ^ (n - 2) + ... + an * B ^ 0) % P, where a1a2...an is a string, n is the length of the string, B is the base of the polynomial and P is a large prime number. If you have the hash value of a string a1a2...an you can calculate the hash value of the string a1a2...ana(n+1) in constant time: (hashValue(a1a2...an) * B + a(n+1)) % P.
The complexity of this part is O(N * M), where N is the number of words in the dictionary and M is the length of the longest word in the dictionary.
Then, use a DP function like this:
bool vis[LENGHT_OF_STRING];
bool go(char str[], int length, int position)
{
int i;
// You found a set of words that can solve your task.
if (position == length) {
return true;
}
// You already have visited this position. You haven't had luck before, and obviously you won't have luck this time.
if (vis[position]) {
return false;
}
// Mark this position as visited.
vis[position] = true;
// A possible improvement is to stop this loop when the length of substring(position, i) is greater than the length of the longest word in the dictionary.
for (i = position; position < length; i++) {
// Calculate the hash value of the substring str(position, i);
if (hashValue is in dict) {
// You can partition the substring str(i + 1, length) in a set of words in the dictionary.
if (go(i + 1)) {
// Use the corresponding word for hashValue in the given position and return true because you found a partition for the substring str(position, length).
return true;
}
}
}
return false;
}
The complexity of this algorithm is O(N * M), where N is the length of the string and M is the length of the longest word in the dictionary or O(N ^ 2), depending if you coded the improvement or not.
So the total complexity of the algorithm will be: O(N1 * M) + O(N2 * M) (or O(N2 ^ 2)), where N1 is the number of words in the dictionary, M is the length of the longest word in the dictionary and N2 is the lenght of the string).
If you can't think of a nice hash function (where there are not any collision), other possible solution is to use Tries or a Patricia trie (if the size of the normal trie is very large) (I couldn't post links for these topics because my reputation is not high enough to post more than 2 links). But in you use this, the complexity of your algorithm will be O(N * M) * O(Time needed to find a word in the trie), where N is the length of the string and M is the length of the longest word in the dictionary.
I hope it helps, and I apologize for my poor english.