Remove substrings inside a list with better than O(n^2) complexity

清酒与你 2021-02-02 18:21

I have a list with many words (100,000+), and what I'd like to do is remove all the substrings of every word in the list.

So for simplicity, let's imagine that I have

4 Answers

被撕碎了的回忆 (OP)

2021-02-02 19:01
@wim is correct.

Given an alphabet of fixed size, the following algorithm is linear in the total length of the text. If the alphabet is of unbounded size, it is O(n log(n)) instead. Either way it is better than O(n^2).
```
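Sort words by length, longest first (so a containing word is seen before its substrings).
Create an empty suffix tree T.
Create an empty list filtered_words.
For word in words:
    If word is not found in T:
        Build suffix tree S for word (using Ukkonen's algorithm).
        Merge S into T.
        Append word to filtered_words.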
```
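A rough Python sketch of the same idea, for anyone who wants something runnable. It is not the suffix tree from the pseudocode: it swaps in a suffix automaton, which answers the same "is this a substring of anything seen so far?" query and is shorter to write than Ukkonen's algorithm. Instead of merging one structure per word, it extends a single automaton with each kept word followed by a separator. The names `SuffixAutomaton` and `remove_substrings`, the `None` separator, and the sample list are all assumptions made for illustration. Words are processed longest first so that a containing word is always seen before its substrings.

```python
class SuffixAutomaton:
    """Online suffix automaton over a growing text (dict-based transitions)."""

    def __init__(self):
        self.next = [{}]    # next[state][symbol] -> state
        self.link = [-1]    # suffix link of each state
        self.length = [0]   # length of the longest string in each state
        self.last = 0       # state that represents the whole text so far

    def extend(self, symbol):
        """Append one symbol to the text (standard online construction)."""
        cur = len(self.next)
        self.next.append({})
        self.link.append(-1)
        self.length.append(self.length[self.last] + 1)
        p = self.last
        while p != -1 and symbol not in self.next[p]:
            self.next[p][symbol] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
        else:
            q = self.next[p][symbol]
            if self.length[p] + 1 == self.length[q]:
                self.link[cur] = q
            else:
                # Split state q by cloning it at the shorter length.
                clone = len(self.next)
                self.next.append(dict(self.next[q]))
                self.link.append(self.link[q])
                self.length.append(self.length[p] + 1)
                while p != -1 and self.next[p].get(symbol) == q:
                    self.next[p][symbol] = clone
                    p = self.link[p]
                self.link[q] = clone
                self.link[cur] = clone
        self.last = cur

    def contains(self, word):
        """True if word is a substring of the text added so far."""
        state = 0
        for ch in word:
            state = self.next[state].get(ch)
            if state is None:
                return False
        return True


def remove_substrings(words):
    """Keep only words that are not substrings of another kept word."""
    sam = SuffixAutomaton()
    kept = []
    # Longest first; sorted() is stable, so ties keep their input order.
    for word in sorted(words, key=len, reverse=True):
        if not sam.contains(word):
            for ch in word:
                sam.extend(ch)
            # None is never a character of a str, so a match cannot
            # span two different words.
            sam.extend(None)
            kept.append(word)
    return kept


# Hypothetical sample input, not taken from the question.
sample = ["Hello", "Hell", "Apple", "Banana", "Ban", "Peter", "P", "e"]
print(remove_substrings(sample))
```

Each lookup costs time proportional to the length of the word, and building the automaton is amortized linear in the total length of the kept words for a bounded alphabet. The initial sort adds O(k log k) for k words, which could be replaced by a counting sort on length if that matters. Exact duplicates are dropped as well, since the second copy is already found in the automaton.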