Using tidytext and broom but not finding tidier for LDA_VEM

梦想的初衷 提交于 2019-12-04 04:52:24

问题


The tidytext book has examples with a tidier for topicmodels:

library(tidyverse)
library(tidytext)
library(topicmodels)
library(broom)

year_word_counts <- tibble(year = c("2007", "2008", "2009"),
+                            word = c("dog", "cat", "chicken"),
+                            n = c(1753L, 1157L, 1057L))

animal_dtm <- cast_dtm(data = year_word_counts, document = year, term = word, value = n)

animal_lda <- LDA(animal_dtm, k = 5, control = list( seed = 1234))

animal_lda <- tidy(animal_lda, matrix = "beta")

# Console output
Error in as.data.frame.default(x) : 
  cannot coerce class "structure("LDA_VEM", package = "topicmodels")" to a data.frame
In addition: Warning message:
In tidy.default(animal_lda, matrix = "beta") :
  No method for tidying an S3 object of class LDA_VEM , using as.data.frame

Replicating the error which is also seen here but in this instance library(tidytext) is present.

Below is a list of all packages are their corresponding version:

 packageVersion("tidyverse")
 ‘1.2.1’

 packageVersion("tidytext")
 ‘0.1.6’   

 packageVersion("topicmodels")
 ‘0.2.7’  

 packageVersion("broom")
 ‘0.4.3’

Output from function call sessionInfo():

R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] broom_0.4.3       tidytext_0.1.6    forcats_0.2.0     stringr_1.2.0     dplyr_0.7.4       purrr_0.2.4       readr_1.1.1       tidyr_0.8.0      
 [9] tibble_1.4.2      ggplot2_2.2.1     tidyverse_1.2.1   topicmodels_0.2-7

loaded via a namespace (and not attached):
 [1] modeltools_0.2-21 slam_0.1-42       NLP_0.1-11        reshape2_1.4.3    haven_1.1.1       lattice_0.20-35   colorspace_1.3-2  SnowballC_0.5.1  
 [9] stats4_3.4.3      yaml_2.1.16       rlang_0.1.6       pillar_1.1.0      foreign_0.8-69    glue_1.2.0        modelr_0.1.1      readxl_1.0.0     
[17] bindrcpp_0.2      bindr_0.1         plyr_1.8.4        munsell_0.4.3     gtable_0.2.0      cellranger_1.1.0  rvest_0.3.2       psych_1.7.8      
[25] tm_0.7-3          parallel_3.4.3    tokenizers_0.1.4  Rcpp_0.12.15      scales_0.5.0      jsonlite_1.5      mnormt_1.5-5      hms_0.4.1        
[33] stringi_1.1.6     grid_3.4.3        cli_1.0.0         tools_3.4.3       magrittr_1.5      lazyeval_0.2.1    janeaustenr_0.1.5 crayon_1.3.4     
[41] pkgconfig_2.0.1   Matrix_1.2-12     xml2_1.2.0        lubridate_1.7.2   assertthat_0.2.0  httr_1.3.1        rstudioapi_0.7    R6_2.2.2         
[49] nlme_3.1-131      compiler_3.4.3   

回答1:


Deleting .Rhistory and .RData led to correct behaviour.




回答2:


Wow, that is extremely mysterious to me. I am not able to reproduce that error. I installed to all the same versions/etc as you, except that I am on MacOS instead of Windows. I do have tests for the LDA tidiers that run and pass on Windows on Appveyor, so I would expect this to work.

The code you have should work without loading broom, for what it's worth.

library(tidyverse)
library(tidytext)
library(topicmodels)

year_word_counts <- tibble(year = c("2007", "2008", "2009"),
                           word = c("dog", "cat", "chicken"),
                           n = c(1753L, 1157L, 1057L))

animal_dtm <- cast_dtm(data = year_word_counts, document = year, term = word, value = n)

animal_lda <- LDA(animal_dtm, k = 5, control = list( seed = 1234))

class(animal_lda)
#> [1] "LDA_VEM"
#> attr(,"package")
#> [1] "topicmodels"

tidy(animal_lda, matrix = "beta")
#> # A tibble: 15 x 3
#>    topic term                                                beta
#>    <int> <chr>                                              <dbl>
#>  1     1 dog     0.0000000000000000000000000000000000000000000372
#>  2     2 dog     0.0000000000000000000000000000000000000000000372
#>  3     3 dog     0.0000000000000000000000000000000000000000000372
#>  4     4 dog     1.00                                            
#>  5     5 dog     0.0000000000000000000000000000000000000000000372
#>  6     1 cat     0.0000000000000000000000000000000000000000000372
#>  7     2 cat     0.0000000000000000000000000000000000000000000372
#>  8     3 cat     0.0000000000000000000000000000000000000000000372
#>  9     4 cat     0.0000000000000000000000000000000000000000000372
#> 10     5 cat     1.00                                            
#> 11     1 chicken 0.0000000000000000000000000000000000000000000372
#> 12     2 chicken 0.0000000000000000000000000000000000000000000372
#> 13     3 chicken 1.00                                            
#> 14     4 chicken 0.0000000000000000000000000000000000000000000372
#> 15     5 chicken 0.0000000000000000000000000000000000000000000372

Created on 2018-02-14 by the reprex package (v0.2.0).

What happens if you load library(methods) as well?




回答3:


I had the same issue when I loaded the LDA I had saved. Finally, for no apparent reasons when I restarted the R session I worked again.




回答4:


Adding to the very helpful answer provided by Julia Silge:

I too believe that the interaction between loading .Rdata and the topicmodels package is the culprit here. But you can still work with your saved workspace:

I was able to eliminate the problem by starting with a fresh restart of RStudio, loading the topicmodels package and then loading the .Rdata. Done in this sequence, the error message disappears. Loading first the data and then the package does not work.

One more word on workspaces: In the case of LDA, using these along with your RScripts is really the only way I could figure out to work efficiently. Depending on the parameters and the size of the corpus, fitting an LDA-model may take several hours. It is crucial to be able to save the model fit to then do further analyses down the road.



来源:https://stackoverflow.com/questions/48765936/using-tidytext-and-broom-but-not-finding-tidier-for-lda-vem

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!