Most efficient way to find partial string matches in a large file of strings (Python)

眼角桃花 2021-01-12 19:46

I downloaded the Wikipedia article titles file, which contains the title of every Wikipedia article. I need to search it for all article titles that could match a given query string.

3 answers
  •  挽巷 (OP)
    2021-01-12 20:08

    Greg's answer is good if you want to match on individual words. If you want to match on arbitrary substrings, you'll need something a bit more complicated, such as a suffix tree (http://en.wikipedia.org/wiki/Suffix_tree). Once constructed, a suffix tree can answer a query for any substring in time proportional to the length of the query plus the number of matches, so in your example it could match "Ice_Hockey" when someone searched for "hock" (provided the titles are indexed case-insensitively).
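
A full suffix tree takes a fair amount of code, so here is a rough sketch of the same idea using a simpler stand-in: a sorted list of suffixes (a naive suffix array) searched with `bisect`. It is not a suffix tree, and the function names `build_suffix_index` and `find_substring` as well as the sample titles are made up for illustration. Storing every suffix as its own string is far too memory-hungry for the full Wikipedia titles file, so treat this only as a demonstration of why suffix-based indexes make substring lookup fast.

```python
import bisect


def build_suffix_index(titles):
    """Return a sorted list of (suffix, title_index) pairs over lowercased titles."""
    entries = []
    for idx, title in enumerate(titles):
        lowered = title.lower()
        for start in range(len(lowered)):
            entries.append((lowered[start:], idx))
    entries.sort()
    return entries


def find_substring(entries, titles, query):
    """Return the titles that contain query (case-insensitive), in input order."""
    q = query.lower()
    if not q:
        return list(titles)
    # Every suffix that starts with q sits in one contiguous run of the
    # sorted list; (q,) sorts just before any (q, idx) pair.
    pos = bisect.bisect_left(entries, (q,))
    matches = set()
    while pos < len(entries) and entries[pos][0].startswith(q):
        matches.add(entries[pos][1])
        pos += 1
    return [titles[i] for i in sorted(matches)]


# Hypothetical sample data, not taken from the real titles file.
sample_titles = ["Ice_Hockey", "Hockey_stick", "Iceland", "Field_hockey"]
sample_index = build_suffix_index(sample_titles)
print(find_substring(sample_index, sample_titles, "hock"))
# ['Ice_Hockey', 'Hockey_stick', 'Field_hockey']
```

A substring of a title is always a prefix of one of its suffixes, which is why a binary search over sorted suffixes finds every match. For a file the size of the Wikipedia title list, a real suffix tree or a compact suffix array over one concatenated string would be needed to keep memory in check.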
