Algorithm for finding all of the shared substrings of any length between 2 strings, and then counting occurrences in string 2?


慢半拍i 2020-12-30 04:03

I've run into an unusual challenge and so far I'm unable to determine the most efficient algorithm to attack this.

Given the following 2 strings as a

4 answers

囚心锁ツ (original poster)

2020-12-30 05:00

From what I can understand, breaking a string up into all possible substrings is in itself an O(n*n) operation, since a string of length n has n(n+1)/2 substrings.

abcd
====
a,b,c,d
ab,bc,cd
abc,bcd
abcd

************************

abcdefgh
========
a,b,c,d,e,f,g,h
ab,bc,cd,de,ef,fg,gh
abc,bcd,cde,def,efg,fgh
abcd,bcde,cdef,defg,efgh
abcde,bcdef,cdefg,defgh
abcdef,bcdefg,cdefgh
abcdefg,bcdefgh
abcdefgh
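To make the listings above concrete, here is a minimal Java sketch that generates them. The class name `SubstringEnumerator` and the method `allSubstrings` are made up for illustration; they are not from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, named for illustration only.
final class SubstringEnumerator {

    // Returns every contiguous substring of s, shortest first,
    // in the same order as the listings above.
    static List<String> allSubstrings(String s) {
        List<String> out = new ArrayList<>();
        int n = s.length();
        for (int len = 1; len <= n; len++) {
            for (int start = 0; start + len <= n; start++) {
                out.add(s.substring(start, start + len));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "abcd" has 4 * 5 / 2 = 10 substrings, "abcdefgh" has 8 * 9 / 2 = 36.
        System.out.println(allSubstrings("abcd"));
        System.out.println(allSubstrings("abcdefgh").size());
    }
}
```

The two nested loops are where the quadratic number of substrings comes from. Copying the characters of each substring adds further cost on top of that.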

As such, it doesn't look like a solution in linear time is possible.

Furthermore, to actually solve it from a Java perspective, you would first have to break the first string up and store its substrings in a set or a map (the map can have the substring as the key and the number of occurrences as the value).

Then repeat the step for the second string as well.

Then you can iterate over the first map, checking whether each entry exists in the second string's map and updating the number of occurrences for that substring at the same time.
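The map-based steps above could be sketched roughly as follows. This is only one reading of the approach: the names `SharedSubstringCounter`, `countSubstrings` and `sharedCounts` are hypothetical, and the sketch assumes that overlapping occurrences are counted separately (so "aa" occurs twice in "aaa").

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class, named for illustration only.
final class SharedSubstringCounter {

    // Maps every substring of s to the number of times it occurs in s
    // (overlapping occurrences are counted separately).
    static Map<String, Integer> countSubstrings(String s) {
        Map<String, Integer> counts = new HashMap<>();
        for (int start = 0; start < s.length(); start++) {
            for (int end = start + 1; end <= s.length(); end++) {
                counts.merge(s.substring(start, end), 1, Integer::sum);
            }
        }
        return counts;
    }

    // For each substring of first that also appears in second,
    // reports how often it occurs in second.
    static Map<String, Integer> sharedCounts(String first, String second) {
        Map<String, Integer> firstCounts = countSubstrings(first);
        Map<String, Integer> secondCounts = countSubstrings(second);
        Map<String, Integer> shared = new TreeMap<>(); // sorted for stable output
        for (String sub : firstCounts.keySet()) {
            Integer inSecond = secondCounts.get(sub);
            if (inSecond != null) {
                shared.put(sub, inSecond);
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(sharedCounts("abc", "abab")); // prints {a=2, ab=2, b=2}
    }
}
```

Both maps hold a quadratic number of keys, so this trades memory for simple lookups. It is a baseline rather than an efficient solution.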

If you are using C, you can instead sort the array of substrings and then use binary search to find matches (with a two-dimensional array to keep track of each string and its count of occurrences).
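A rough sketch of the sort-and-binary-search idea, written in Java rather than C so that it matches the Java discussion above. Instead of a separate two-dimensional array of counts, this variant (an assumption, not the layout described above) reads the count off the run of equal entries in the sorted array. All names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical class, named for illustration only.
final class SortedSubstringSearch {

    // Builds the array of all substrings of s (duplicates kept) and sorts it.
    static String[] sortedSubstrings(String s) {
        int n = s.length();
        String[] subs = new String[n * (n + 1) / 2];
        int k = 0;
        for (int start = 0; start < n; start++) {
            for (int end = start + 1; end <= n; end++) {
                subs[k++] = s.substring(start, end);
            }
        }
        Arrays.sort(subs);
        return subs;
    }

    // Binary search for the first index whose element is >= key
    // (strict == false) or > key (strict == true).
    static int bound(String[] sorted, String key, boolean strict) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = sorted[mid].compareTo(key);
            if (cmp < 0 || (strict && cmp == 0)) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Number of occurrences of key = length of its run in the sorted array.
    static int occurrences(String[] sorted, String key) {
        return bound(sorted, key, true) - bound(sorted, key, false);
    }

    public static void main(String[] args) {
        String[] sorted = sortedSubstrings("abab");
        System.out.println(occurrences(sorted, "ab")); // prints 2
        System.out.println(occurrences(sorted, "c"));  // prints 0
    }
}
```

Each lookup costs a logarithmic number of string comparisons once the array is built. Building and sorting the array still dominates the running time.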

You said you had a tree approach that ran faster. Do you mind posting a sample showing how you used the tree? Was it for representing the substrings or for helping to generate them?
