I have a corpus of 5354 news articles that contains a variety of duplicates. Using the stm package, I fit an stm model to the 906 unique articles and used alignCorpus and fi