scikit-learn implementation of tf-idf differs from manual implementation
Question

I tried to calculate tf-idf values manually using the formula, but my result differs from the one I get from the scikit-learn implementation.

    from sklearn.feature_extraction.text import TfidfVectorizer

    tv = TfidfVectorizer()
    a = "cat hat bat splat cat bat hat mat cat"
    b = "cat mat cat sat"
    tv.fit_transform([a, b]).toarray()
    # array([[0.53333448, 0.56920781, 0.53333448, 0.18973594, 0.        ,
    #         0.26666724],
    #        [0.        , 0.75726441, 0.        , 0.37863221, 0.53215436,
    #         0.        ]])
    tv.get_feature_names
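For comparison, here is a minimal pure-Python sketch of what `TfidfVectorizer` appears to compute with its default settings (`smooth_idf=True`, `norm='l2'`, `sublinear_tf=False`): raw term counts, multiplied by a smoothed idf of `ln((1 + n) / (1 + df)) + 1`, then each row scaled to unit Euclidean length. The function name `tfidf_like_sklearn_defaults` is hypothetical, and splitting on whitespace is an assumption that happens to hold for these two lowercase sentences; it is not a general replacement for scikit-learn's tokenizer.

```python
import math
from collections import Counter


def tfidf_like_sklearn_defaults(docs):
    """Hypothetical helper: tf-idf with smoothed idf and L2 row normalization."""
    # Assumption: whitespace splitting is enough for these inputs.
    tokenized = [doc.split() for doc in docs]
    # Columns are ordered alphabetically by term.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = {term: sum(term in toks for toks in tokenized) for term in vocab}
    # Smoothed idf: ln((1 + n) / (1 + df)) + 1
    idf = {term: math.log((1 + n_docs) / (1 + df[term])) + 1 for term in vocab}

    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        # tf is the raw count of the term in the document (no length scaling).
        raw = [counts[term] * idf[term] for term in vocab]
        # L2 normalization: divide each row by its Euclidean norm.
        norm = math.sqrt(sum(v * v for v in raw))
        rows.append([v / norm if norm else 0.0 for v in raw])
    return vocab, rows


a = "cat hat bat splat cat bat hat mat cat"
b = "cat mat cat sat"
vocab, rows = tfidf_like_sklearn_defaults([a, b])
for row in rows:
    print([round(v, 8) for v in row])
```

A manual calculation that uses the textbook `log(n / df)` idf, divides counts by document length, or skips the final normalization will not reproduce the array shown in the question, so those three steps are the first things to check.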