Calculating tf-idf among documents using python 2.7
I have a scenario where i have retreived information/raw data from the internet and placed them into their respective json or .txt files. From there on i would like to calculate the frequecies of each term in each document and their cosine similarity by using tf-idf. For example: there are 50 different documents/texts files that consists 5000 words/strings each i would like to take the first word from the first document/text and compare all the total 250000 words find its frequencies then do so for the second word and so on for all 50 documents/texts. Expected output of each frequecy will be