I\'m building a backend and trying to crunch the following problem.
2000
characters on average)
You should try a string search / pattern matching algorithm. Most famous algorithm for you task is the Aho-Corasick there is a python library for it (of the top of google search)
Most of the pattern matching / string search algorithms will require you to convert your "bag of words/phrases" into a trie.