I have a large text file that is 20 GB in size. The file contains lines of text that are relatively short (40 to 60 characters per line). The file is unsorted.
Rather than searching the file 20,000 times, once per string, you can tokenize the input and look each token up in a std::set of the strings to be found; that will be much faster. This assumes your strings are simple identifiers, but something similar can be implemented for strings that are whole sentences: keep a set of the first word of each sentence, and after a successful match verify that it really is the beginning of the whole sentence with string::find.
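A minimal sketch of the identifier case, assuming whitespace-delimited tokens; the file name and the search strings here are placeholders:

    #include <fstream>
    #include <iostream>
    #include <set>
    #include <sstream>
    #include <string>

    int main() {
        // Load the 20,000 search strings into a set once (placeholder values).
        std::set<std::string> needles = {"foo", "bar", "baz"};

        std::ifstream in("big.txt");  // the 20 GB file (placeholder name)
        std::string line;
        while (std::getline(in, line)) {
            // One pass over the line: each token costs one O(log n) set
            // lookup, instead of scanning the line once per search string.
            std::istringstream tokens(line);
            std::string token;
            while (tokens >> token) {
                if (needles.count(token)) {
                    std::cout << "found \"" << token << "\" in: " << line << '\n';
                }
            }
        }
    }

For the sentence case, the set would hold only the first word of each sentence, and on a hit you would confirm the full sentence with line.find at the token's position before reporting a match.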