Using rvest with drake: external pointer is not valid error

Posted by 倖福魔咒の on 2020-04-30 10:07:28

Question


When I first run the code below, everything works. But when I change something in the html_file %>% ... command, for example by commenting out tolower(), I get the following error:

Error: target title failed.
diagnose(title)error$message:
  external pointer is not valid
diagnose(title)error$calls:
   1. └─html_file %>% html_nodes("h2") %>% html_text()

Code:

library(rvest)
library(drake)

some_string <- '
  <div class="main">
      <h2>A</h2>
      <div class="route">X</div>
  </div> 
'

html_file <- read_html(some_string)
title <- html_file %>% 
  html_nodes("h2") %>% 
  html_text()

plan <- drake_plan(
  html_file = read_html(some_string),
  title = html_file %>% 
    html_nodes("h2") %>% 
    html_text() %>% 
    tolower()
)

make(plan)

I found two possible solutions, but I'm not enthusiastic about either of them:
1. Join both steps in drake_plan into a single target.
2. Use xml2::write_html() and xml2::read_html() as suggested here.
Is there a better way to solve this? P.S. The issue has already been discussed here, on the RStudio forum, and on GitHub.
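For reference, workaround 1 might look like the minimal sketch below. It assumes the same some_string as above. The variable name title_value is only illustrative and is not part of the original code.

```r
library(rvest)
library(drake)

some_string <- '
  <div class="main">
      <h2>A</h2>
      <div class="route">X</div>
  </div>
'

plan <- drake_plan(
  # Parse and extract in a single target, so that only the character
  # result (and never the xml_document itself) is stored in the cache.
  title = read_html(some_string) %>%
    html_nodes("h2") %>%
    html_text() %>%
    tolower()
)

make(plan)

# title_value is an illustrative name for the value read back from the cache.
title_value <- readd(title)
```

The drawback is the one that makes this workaround unattractive: the document is parsed again whenever anything in the combined command changes.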


Answer 1:


By default, drake saves targets as RDS files (other options here). So https://github.com/tidyverse/rvest/issues/181#issuecomment-395064636, which you brought up, is exactly the problem: the object returned by read_html() holds an external pointer, and that pointer is no longer valid once the object has been saved to RDS and loaded back. I like (1) because text is compatible with RDS. More broadly, it is up to the user to choose targets that are compatible with drake's data storage system. See https://books.ropensci.org/drake/plans.html#how-to-choose-good-targets for a discussion and links to similar issues. But if you want to go with (2), you could return the path to your HTML file from a dynamic file target.
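A rough sketch of both points, under a few assumptions: the first part reproduces the invalid pointer outside drake with a plain saveRDS()/readRDS() round trip, and the second part tries option (2) with a dynamic file via target(..., format = "file"). The file name page.html and the names rds_path, restored, round_trip_failed, html_path and title_value are made up for illustration. Dynamic files need a reasonably recent drake version.

```r
library(xml2)
library(rvest)
library(drake)

some_string <- '<div class="main"><h2>A</h2><div class="route">X</div></div>'

# Part 1: round-trip the parsed document through RDS, as drake's default
# storage does, then try to use the restored object.
rds_path <- tempfile(fileext = ".rds")
saveRDS(read_html(some_string), rds_path)
restored <- readRDS(rds_path)
round_trip_failed <- inherits(
  try(html_text(html_nodes(restored, "h2")), silent = TRUE),
  "try-error"
)

# Part 2: option (2) with a dynamic file. The target's value is the file
# path (plain text), and the downstream target re-parses the file itself.
plan <- drake_plan(
  html_path = target(
    {
      out <- "page.html"  # hypothetical output path
      write_html(read_html(some_string), out)
      out
    },
    format = "file"
  ),
  title = read_html(html_path) %>%
    html_nodes("h2") %>%
    html_text() %>%
    tolower()
)

make(plan)
title_value <- readd(title)
```

Because only a file path crosses the boundary between the two targets, editing the title command (for example, removing tolower()) should no longer touch a stale pointer.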



Source: https://stackoverflow.com/questions/61031325/using-rvest-with-drake-external-pointer-is-not-valid-error
