The tidytext book has examples with a tidier for topicmodels:
library(tidyverse)
library(tidytext)
library(topicmodels)
library(broom)
year_word_counts <- tibble(year = c("2007", "2008", "2009"),
                           word = c("dog", "cat", "chicken"),
                           n = c(1753L, 1157L, 1057L))
animal_dtm <- cast_dtm(data = year_word_counts, document = year, term = word, value = n)
animal_lda <- LDA(animal_dtm, k = 5, control = list( seed = 1234))
animal_lda <- tidy(animal_lda, matrix = "beta")
# Console output
Error in as.data.frame.default(x) :
cannot coerce class "structure("LDA_VEM", package = "topicmodels")" to a data.frame
In addition: Warning message:
In tidy.default(animal_lda, matrix = "beta") :
No method for tidying an S3 object of class LDA_VEM , using as.data.frame
This replicates the error that is also seen here, but in this instance library(tidytext) is loaded.
Below is a list of all packages and their corresponding versions:
packageVersion("tidyverse")
‘1.2.1’
packageVersion("tidytext")
‘0.1.6’
packageVersion("topicmodels")
‘0.2.7’
packageVersion("broom")
‘0.4.3’
Output from sessionInfo():
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] broom_0.4.3 tidytext_0.1.6 forcats_0.2.0 stringr_1.2.0 dplyr_0.7.4 purrr_0.2.4 readr_1.1.1 tidyr_0.8.0
[9] tibble_1.4.2 ggplot2_2.2.1 tidyverse_1.2.1 topicmodels_0.2-7
loaded via a namespace (and not attached):
[1] modeltools_0.2-21 slam_0.1-42 NLP_0.1-11 reshape2_1.4.3 haven_1.1.1 lattice_0.20-35 colorspace_1.3-2 SnowballC_0.5.1
[9] stats4_3.4.3 yaml_2.1.16 rlang_0.1.6 pillar_1.1.0 foreign_0.8-69 glue_1.2.0 modelr_0.1.1 readxl_1.0.0
[17] bindrcpp_0.2 bindr_0.1 plyr_1.8.4 munsell_0.4.3 gtable_0.2.0 cellranger_1.1.0 rvest_0.3.2 psych_1.7.8
[25] tm_0.7-3 parallel_3.4.3 tokenizers_0.1.4 Rcpp_0.12.15 scales_0.5.0 jsonlite_1.5 mnormt_1.5-5 hms_0.4.1
[33] stringi_1.1.6 grid_3.4.3 cli_1.0.0 tools_3.4.3 magrittr_1.5 lazyeval_0.2.1 janeaustenr_0.1.5 crayon_1.3.4
[41] pkgconfig_2.0.1 Matrix_1.2-12 xml2_1.2.0 lubridate_1.7.2 assertthat_0.2.0 httr_1.3.1 rstudioapi_0.7 R6_2.2.2
[49] nlme_3.1-131 compiler_3.4.3
Deleting .Rhistory and .RData led to correct behaviour.
Wow, that is extremely mysterious to me. I am not able to reproduce that error. I installed all the same package versions as you, except that I am on macOS instead of Windows. I do have tests for the LDA tidiers that run and pass on Windows on AppVeyor, so I would expect this to work.
The code you have should work without loading broom, for what it's worth.
library(tidyverse)
library(tidytext)
library(topicmodels)
year_word_counts <- tibble(year = c("2007", "2008", "2009"),
word = c("dog", "cat", "chicken"),
n = c(1753L, 1157L, 1057L))
animal_dtm <- cast_dtm(data = year_word_counts, document = year, term = word, value = n)
animal_lda <- LDA(animal_dtm, k = 5, control = list( seed = 1234))
class(animal_lda)
#> [1] "LDA_VEM"
#> attr(,"package")
#> [1] "topicmodels"
tidy(animal_lda, matrix = "beta")
#> # A tibble: 15 x 3
#> topic term beta
#> <int> <chr> <dbl>
#> 1 1 dog 0.0000000000000000000000000000000000000000000372
#> 2 2 dog 0.0000000000000000000000000000000000000000000372
#> 3 3 dog 0.0000000000000000000000000000000000000000000372
#> 4 4 dog 1.00
#> 5 5 dog 0.0000000000000000000000000000000000000000000372
#> 6 1 cat 0.0000000000000000000000000000000000000000000372
#> 7 2 cat 0.0000000000000000000000000000000000000000000372
#> 8 3 cat 0.0000000000000000000000000000000000000000000372
#> 9 4 cat 0.0000000000000000000000000000000000000000000372
#> 10 5 cat 1.00
#> 11 1 chicken 0.0000000000000000000000000000000000000000000372
#> 12 2 chicken 0.0000000000000000000000000000000000000000000372
#> 13 3 chicken 1.00
#> 14 4 chicken 0.0000000000000000000000000000000000000000000372
#> 15 5 chicken 0.0000000000000000000000000000000000000000000372
Created on 2018-02-14 by the reprex package (v0.2.0).
What happens if you load library(methods) as well?
I had the same issue when I loaded an LDA model I had saved. In the end, for no apparent reason, it worked again once I restarted the R session.
Adding to the very helpful answer provided by Julia Silge:
I too believe that the interaction between loading .RData and the topicmodels package is the culprit here. But you can still work with your saved workspace:
I was able to eliminate the problem by restarting RStudio, loading the topicmodels package, and only then loading the .RData file. Done in this sequence, the error message disappears. Loading the data first and the package afterwards does not work.
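The load order described above can be sketched as follows. This is only an illustration: the file name my_workspace.RData is hypothetical, and the sketch assumes the saved workspace contains a fitted model called animal_lda, as in the question. It is meant to be run in a fresh R session that has not restored a workspace automatically.

```r
# Attach the packages first, so the LDA_VEM class and its tidier are
# available before any saved objects are brought into the session.
library(tidytext)
library(topicmodels)

# Hypothetical file name; replace it with the path to your own workspace.
workspace_file <- "my_workspace.RData"

if (file.exists(workspace_file)) {
  # Only now load the saved workspace.
  load(workspace_file)

  # Assumes the workspace contains a fitted model named animal_lda.
  print(class(animal_lda))
  print(tidy(animal_lda, matrix = "beta"))
}
```

If class() still reports LDA_VEM but tidy() fails, restarting the session and repeating the steps in this order is the first thing to try.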
One more word on workspaces: in the case of LDA, using them alongside your R scripts is really the only way I could find to work efficiently. Depending on the parameters and the size of the corpus, fitting an LDA model may take several hours, so it is crucial to be able to save the model fit and do further analyses later.
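As an alternative to saving the whole workspace, a single fitted model can be written to disk with base R's saveRDS() and read back with readRDS(). This approach is not mentioned in the thread; the sketch below is a suggestion that reuses the toy data from the question and writes to a temporary, hypothetical path (model_path). As with workspaces, the packages should be attached before the model is read back.

```r
library(tidytext)
library(topicmodels)

# Toy data from the question.
year_word_counts <- tibble::tibble(
  year = c("2007", "2008", "2009"),
  word = c("dog", "cat", "chicken"),
  n = c(1753L, 1157L, 1057L)
)
animal_dtm <- cast_dtm(year_word_counts, document = year, term = word, value = n)

# For a real corpus, this is the slow step worth caching.
animal_lda <- LDA(animal_dtm, k = 5, control = list(seed = 1234))

# Hypothetical location; use a permanent path in a real project.
model_path <- file.path(tempdir(), "animal_lda.rds")
saveRDS(animal_lda, model_path)

# Later (typically in a new session, after the library() calls above):
restored_lda <- readRDS(model_path)
lda_beta <- tidy(restored_lda, matrix = "beta")
print(lda_beta)
```

Because readRDS() returns the object rather than restoring it under its old name, the script makes explicit which model is being used.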
Source: https://stackoverflow.com/questions/48765936/using-tidytext-and-broom-but-not-finding-tidier-for-lda-vem