Does NLTK have TF-IDF implemented?

暗喜 2020-12-20 16:13

There are TF-IDF implementations in scikit-learn and gensim.

There are also simple standalone implementations: Simple implementation of N-Gram, tf-idf and

2 Answers
  • 2020-12-20 16:53

    I think there is enough evidence to conclude that TF-IDF does not exist in NLTK:

    1. Unfortunately, calculating tf-idf is not available in NLTK so we'll use another data analysis library, scikit-learn

      from COMPSCI 290-01 Spring 2014 lab

    2. More importantly, the source code contains nothing related to tfidf (or tf-idf). The exception is NLTK-contrib, which contains a map-reduce implementation of TF-IDF.

    Several libraries for tf-idf are mentioned in a related question.

    Upd: searching the source for tf idf or tf_idf does find the function already found by @yvespeirsman.

  • 2020-12-20 16:59

    The NLTK TextCollection class has a method for computing the tf-idf of terms. The documentation is here, and the source is here. However, it says "may be slow to load", so using scikit-learn may be preferable.
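To make concrete what a per-term tf-idf method of this kind returns, here is a minimal pure-Python sketch. It assumes the plain, unsmoothed definition: tf = count / document length and idf = ln(N / df). The helper `tf_idf` and the toy `docs` are made up for illustration and are not NLTK code; NLTK's `TextCollection.tf_idf(term, text)` and scikit-learn's `TfidfVectorizer` may weight and normalize differently, so the numbers will not necessarily match either library.

```python
import math


def tf_idf(term, doc, docs):
    """Unsmoothed tf-idf of `term` in `doc`, relative to the collection `docs`.

    Hypothetical helper for illustration only (not an NLTK function).
    `doc` is a list of tokens; `docs` is a list of such token lists.
    """
    if not doc:
        return 0.0
    # Term frequency: share of the document's tokens equal to `term`.
    tf = doc.count(term) / len(doc)
    # Document frequency: number of documents that contain `term`.
    df = sum(1 for d in docs if term in d)
    # Inverse document frequency; defined as 0.0 for a term seen nowhere.
    idf = math.log(len(docs) / df) if df else 0.0
    return tf * idf


# Toy, pre-tokenized collection (assumed data).
docs = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "chased", "the", "cat"],
    ["dogs", "and", "cats", "are", "pets"],
]

cat_score = tf_idf("cat", docs[0], docs)    # "cat" occurs in 2 of 3 documents
pets_score = tf_idf("pets", docs[2], docs)  # "pets" occurs in 1 of 3 documents
print(cat_score, pets_score)
```

A term concentrated in a single document ("pets") scores higher than one spread over several ("cat"), which is the behaviour tf-idf is meant to capture. For anything beyond a handful of documents, a vectorized implementation such as scikit-learn's is likely the better choice, as the answer suggests.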
