How to compute similarity in quanteda between documents for adjacent years only, within groups?
问题 I have a diachronic corpus with texts for different organizations, each for years 1969 to 2019. For each organization, I want to compare text for year 1969 and text for 1970, 1970 and 1971, etc. Texts for some years are missing. In other words, I have a corpus, cc, which I converted to a dfm Now I want to use textstat_simil : ncsimil <- textstat_simil(dfm.cc, y = NULL, selection = NULL, margin = "documents", method = "jaccard", min_simil = NULL) This compares every text with every other text,