Building LDAvis plots using phrase tokens instead of single-word tokens
**Question:** My question is simple: how can one build LDAvis topic-model plots with phrase tokens instead of single-word tokens using the text2vec package in R? Currently the word tokenizer, `tokens = word_tokenizer(tokens)`, works well, but is there phrase or n-gram tokenizer functionality that would allow building LDAvis topic models and the corresponding plots with phrases rather than single words? If not, how might such code be constructed? And is this even methodologically sound or advisable?