I've run into an unusual challenge and so far I'm unable to determine the most efficient algorithm to attack this.
Given the following 2 strings as a
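From what I can understand, breaking a string up into all possible sub-strings is in itself an O(n*n) operation, since a string of length n has n(n+1)/2 contiguous sub-strings (10 for "abcd" and 36 for "abcdefgh", as listed below).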
abcd
====
a,b,c,d
ab,bc,cd
abc,bcd
abcd
************************
abcdefgh
========
a,b,c,d,e,f,g,h
ab,bc,cd,de,ef,fg,gh
abc,bcd,cde,def,efg,fgh
abcd,bcde,cdef,defg,efgh
abcde,bcdef,cdefg,defgh
abcdef,bcdefg,cdefgh
abcdefg,bcdefgh
abcdefgh
As such, it doesn't look like a linear-time solution is possible, at least not with an approach that enumerates every sub-string.
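To make the enumeration above concrete, here is a minimal Java sketch that produces the sub-strings in the same order as the listings (shortest first). The class and method names are my own, chosen just for illustration. Note that the two nested loops already give the quadratic number of sub-strings, and copying each one out with substring() adds further cost on top of that.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the original post.
class SubstringEnumerator {
    // Returns every contiguous sub-string of s, ordered by increasing length.
    static List<String> allSubstrings(String s) {
        List<String> result = new ArrayList<>();
        int n = s.length();
        for (int len = 1; len <= n; len++) {
            // Slide a window of width len across the string.
            for (int start = 0; start + len <= n; start++) {
                result.add(s.substring(start, start + len));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [a, b, c, d, ab, bc, cd, abc, bcd, abcd]
        System.out.println(allSubstrings("abcd"));
    }
}
```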
Furthermore, to actually solve it in Java, you'd first have to break the string up and store the sub-strings in a set or a map (the map can have the sub-string as the key and its number of occurrences as the value).
Then repeat that step for the second string.
Then you can iterate over the first map, checking whether each entry exists in the second string's map, and increment the number of occurrences for that sub-string at the same time.
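The three steps above could be sketched roughly as follows. This is only one reading of the approach: I am assuming the goal is to find the sub-strings common to both strings and to report a combined occurrence count for each, and the names CommonSubstrings, countSubstrings and commonCounts are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the map-based approach; names are illustrative.
class CommonSubstrings {
    // Steps 1 and 2: map each sub-string of s to its number of occurrences.
    static Map<String, Integer> countSubstrings(String s) {
        Map<String, Integer> counts = new HashMap<>();
        for (int start = 0; start < s.length(); start++) {
            for (int end = start + 1; end <= s.length(); end++) {
                counts.merge(s.substring(start, end), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Step 3: walk the first map and keep only entries that also exist in
    // the second map. Assumption: the reported count is the sum of the
    // occurrences in both strings.
    static Map<String, Integer> commonCounts(String first, String second) {
        Map<String, Integer> a = countSubstrings(first);
        Map<String, Integer> b = countSubstrings(second);
        Map<String, Integer> common = new HashMap<>();
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                common.put(e.getKey(), e.getValue() + other);
            }
        }
        return common;
    }

    public static void main(String[] args) {
        System.out.println(commonCounts("abab", "bab"));
    }
}
```

With a HashMap the lookups are cheap on average, but both maps still hold a quadratic number of sub-strings, so memory use grows quickly with the input length.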
If you are using C, you can instead sort the array of sub-strings and use binary search to find matches (keeping a two-dimensional array to track each sub-string and its count of occurrences).
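The sort-and-search idea looks roughly like this, sketched here in Java rather than C to keep it short. Instead of a separate two-dimensional array for the counts, this version keeps duplicates in the sorted array and counts a sub-string's occurrences as the width of its run, found with two binary searches. All names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch of the sorted-array approach; names are illustrative.
class SortedSubstringSearch {
    // Builds all n(n+1)/2 sub-strings of s and sorts them lexicographically.
    static String[] sortedSubstrings(String s) {
        int n = s.length();
        String[] subs = new String[n * (n + 1) / 2];
        int k = 0;
        for (int start = 0; start < n; start++) {
            for (int end = start + 1; end <= n; end++) {
                subs[k++] = s.substring(start, end);
            }
        }
        Arrays.sort(subs);
        return subs;
    }

    // Binary search for the first index whose element is >= key
    // (strict == false) or > key (strict == true).
    static int bound(String[] sorted, String key, boolean strict) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = sorted[mid].compareTo(key);
            if (cmp < 0 || (strict && cmp == 0)) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Number of times key appears in the sorted array (0 if absent).
    static int countOccurrences(String[] sorted, String key) {
        return bound(sorted, key, true) - bound(sorted, key, false);
    }

    public static void main(String[] args) {
        String[] sorted = sortedSubstrings("abab");
        System.out.println(countOccurrences(sorted, "ab")); // prints 2
    }
}
```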
You said you had a tree approach that ran faster. Do you mind posting a sample showing how you used the tree? Was it for representing the sub-strings or for helping to generate them?