How to calculate cosine similarity with tf-idf using Lucene and Java
Question:

I have a query and a set of documents, and I need to rank the documents by their tf-idf cosine similarity to the query. What support does Lucene provide for this? Specifically, which values can I obtain directly from Lucene (for example, is there a method that returns tf and idf?), and how do I compute cosine similarity with Lucene (is there a function that returns the cosine similarity when I pass it the query vector and a document vector)? Thanks in advance.

Answer 1:
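For reference, the computation the question describes can be sketched in plain Java without any Lucene calls. This is only an illustration of tf-idf weighting and cosine similarity, not Lucene's API or its scoring formula. The class name `TfIdfCosine`, the simple tokenizer, and the idf variant `1 + ln(N / df)` are all assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; nothing here comes from Lucene.
final class TfIdfCosine {

    // Assumed tokenizer: lowercase, split on non-word characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // idf(t) = 1 + ln(N / df(t)); this particular smoothing is an assumption.
    static Map<String, Double> idf(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            // Count each term at most once per document.
            for (String t : new HashSet<>(doc)) df.merge(t, 1, Integer::sum);
        }
        Map<String, Double> idf = new HashMap<>();
        for (Map.Entry<String, Integer> e : df.entrySet()) {
            idf.put(e.getKey(), 1.0 + Math.log((double) docs.size() / e.getValue()));
        }
        return idf;
    }

    // Weight = raw term count * idf; terms unseen in the corpus are dropped.
    static Map<String, Double> tfIdf(List<String> tokens, Map<String, Double> idf) {
        Map<String, Double> vec = new HashMap<>();
        for (String t : tokens) {
            Double w = idf.get(t);
            if (w != null) vec.merge(t, w, Double::sum);
        }
        return vec;
    }

    static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double x : v.values()) sum += x * x;
        return Math.sqrt(sum);
    }

    // cos(a, b) = (a . b) / (|a| * |b|); defined as 0 when either vector is empty.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) dot += e.getValue() * other;
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    public static void main(String[] args) {
        // Made-up sample texts, purely for illustration.
        String[] texts = {
            "lucene is a search library",
            "cosine similarity compares two vectors",
            "tf idf weights terms in a search index"
        };
        List<List<String>> docs = new ArrayList<>();
        for (String t : texts) docs.add(tokenize(t));

        Map<String, Double> idf = idf(docs);
        Map<String, Double> query = tfIdf(tokenize("search with tf idf"), idf);

        for (int i = 0; i < docs.size(); i++) {
            double score = cosine(query, tfIdf(docs.get(i), idf));
            System.out.println("doc " + i + " score = " + score);
        }
    }
}
```

Sorting the documents by the printed score in descending order gives the ranking. With a real index, the term counts and document frequencies would come from Lucene rather than from in-memory lists, and Lucene's own similarity classes may weight terms differently from this sketch.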