Does NLTK have TF-IDF implemented?

一笑奈何 提交于 2019-11-29 13:32:17

The NLTK TextCollection class has a method for computing the tf-idf of terms. The documentation is here, and the source is here. However, it says "may be slow to load", so using scikit-learn may be preferable.

Nikita Astrakhantsev

I guess, there are enough evidences to conclude non-existence of TF-IDF in NLTK:

  1. Unfortunately, calculating tf-idf is not available in NLTK so we'll use another data analysis library, scikit-learn

    from COMPSCI 290-01 Spring 2014 lab

  2. More important, source code contains nothing related to tfidf (or tf-idf). Exceptions are NLTK-contrib, which contains map-reduce implementation for TF-IDF.

There are several libs for tf-idf mentioned in related question.

Upd: search by tf idf or tf_idf lets to find the function already found by @yvespeirsman

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!