Estimate Dictionary size using Zipf’s Law
Question: How would one go about calculating the dictionary size (number of unique words) of a collection using Zipf's law?

Answer 1: You will have to tokenize your collection, e.g. by splitting on whitespace and punctuation. Then store all the tokens in a hash table and count how often each one occurs; the number of distinct keys is the observed dictionary size. Finally, plot the distribution of the counts using a tool such as Gnuplot.

Source: https://stackoverflow.com/questions/47543798/estimate-dictionary-size-using-zipf-s-law
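
The steps in the answer (tokenize, count in a hash, inspect the distribution) could be sketched in Python roughly as follows. This is only an illustration, not the answerer's code: the function names, the regex used for tokenizing, and the least-squares fit of the Zipf exponent are all assumptions added here. It uses only the standard library and writes rank/frequency pairs to a file that a plotting tool can read.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on runs of characters that are
    neither word characters nor apostrophes (assumed tokenization rule)."""
    return [tok for tok in re.split(r"[^\w']+", text.lower()) if tok]


def rank_frequency(tokens):
    """Count tokens in a hash (Counter) and return (counts, ranked),
    where ranked is a list of (rank, word, frequency), most frequent first."""
    counts = Counter(tokens)
    ranked = [
        (rank, word, freq)
        for rank, (word, freq) in enumerate(counts.most_common(), start=1)
    ]
    return counts, ranked


def fit_zipf_exponent(ranked):
    """Estimate s in freq ~ C / rank**s by ordinary least squares on
    log(freq) versus log(rank). Needs at least two distinct ranks."""
    xs = [math.log(rank) for rank, _, _ in ranked]
    ys = [math.log(freq) for _, _, freq in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # the slope is negative, so negate it to get s


def write_rank_frequency(ranked, path):
    """Write one 'rank<TAB>frequency' line per word for later plotting."""
    with open(path, "w", encoding="utf-8") as fh:
        for rank, _, freq in ranked:
            fh.write(f"{rank}\t{freq}\n")


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"  # stand-in for a real collection
    counts, ranked = rank_frequency(tokenize(sample))
    print("dictionary size:", len(counts))
    print("estimated Zipf exponent:", fit_zipf_exponent(ranked))
    write_rank_frequency(ranked, "rank_freq.tsv")  # hypothetical file name
```

The resulting file can then be plotted on log-log axes, for example in Gnuplot with `set logscale xy` followed by `plot "rank_freq.tsv" using 1:2 with points`; a roughly straight line suggests the counts follow a Zipf-like distribution.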