Speed up millions of regex replacements in Python 3

醉酒成梦 2020-11-22 05:44

I'm using Python 3.5.2

I have two lists

  • a list of about 750,000 "sentences" (long strings)
  • a list of about 20,000 "words" that I would like to delete from the sentences
9 Answers
  •  攒了一身酷
    2020-11-22 06:31

    One thing you might want to try is pre-processing the sentences to encode the word boundaries. Basically turn each sentence into a list of words by splitting on word boundaries.

    This should be faster, because to process a sentence, you just have to step through each of the words and check if it's a match.

    Currently the regex search is having to go through the entire string again each time, looking for word boundaries, and then "discarding" the result of this work before the next pass.
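    The idea above can be sketched as follows. This is a minimal illustration, not the asker's actual code: the sample sentences and banned words are made up, and `re.split(r"\W+", …)` stands in for whatever word-boundary tokenization fits the real data.

    ```python
    import re

    # Hypothetical stand-ins for the 750,000 sentences and 20,000 words
    sentences = [
        "the quick brown fox jumps over the lazy dog",
        "pack my box with five dozen liquor jugs",
    ]
    banned_words = {"the", "lazy", "five"}  # a set gives O(1) membership checks

    cleaned = []
    for sentence in sentences:
        # Tokenize once on word boundaries, instead of letting the regex
        # engine re-scan the whole string for every banned word.
        tokens = re.split(r"\W+", sentence)
        kept = [t for t in tokens if t and t not in banned_words]
        cleaned.append(" ".join(kept))
    ```

    Each sentence is scanned exactly once, and each token costs a single hash lookup, so the total work is roughly linear in the text size rather than proportional to (sentences × words). Note the trade-off: splitting on `\W+` discards the original punctuation and spacing, which may or may not matter for the real use case.
    
    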
