Calculating term frequencies in a big corpus efficiently regardless of document boundaries

鱼传尺愫 2020-12-18 20:05

I have a corpus of almost 2 million documents. I want to compute the frequency of each term across the whole corpus, regardless of document boundaries.

A naive approach
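
For illustration only, here is a minimal sketch of what a straightforward approach might look like. It is not taken from the post. It assumes the corpus can be streamed as an iterable of strings and that lowercasing plus whitespace splitting is an acceptable stand-in for a real tokenizer; the function name `corpus_term_frequencies` and the sample documents are hypothetical.

```python
from collections import Counter
from typing import Iterable


def corpus_term_frequencies(documents: Iterable[str]) -> Counter:
    """Count every term across all documents, ignoring document boundaries."""
    counts: Counter = Counter()
    for text in documents:
        # Assumption: lowercase + whitespace split stands in for a real tokenizer.
        counts.update(text.lower().split())
    return counts


# Hypothetical two-document corpus, just to show the call.
docs = ["the cat sat", "the dog sat down"]
freqs = corpus_term_frequencies(docs)
print(freqs.most_common(3))
```

Because the documents are consumed one at a time, a generator that reads from disk can be passed in, so only the running counts need to fit in memory rather than the whole corpus.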
