How to convert an HTML R object to character?

Submitted by 谁说胖子不能爱 on 2020-01-23 02:36:12

Question


Here's my reproducible example:

library(rvest)
page <- html("http://google.com")
class(page)
page
> as.character(page)
Error in as.vector(x, "character") : 
  cannot coerce type 'externalptr' to vector of type 'character'

How can I convert page from an html class to a character vector so I can store it somewhere?

The html functions like html_text or html_attr don't give me the whole source. I would like to store it so I can later re-load it with html().

Thanks.


Answer 1:


To save directly to a text file:

capture.output(page, file="file.html")

To store as a string:

htmltxt <- paste(capture.output(page, file=NULL), collapse="\n")
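Note that capture.output relies on the object printing its full source. In rvest versions built on xml2 (0.3.0 and later, as far as I know), printing shows only an abbreviated summary, so a different route is likely needed there. A minimal sketch, assuming xml2 is installed and using a small literal document in place of a real page:

```r
library(xml2)

# Parse a literal HTML string so the example needs no network access.
page <- read_html("<html><body><p>Hello</p></body></html>")

# Serialize the whole document into a single character string.
htmltxt <- as.character(page)

# Alternatively, write the document to disk and re-load it later.
tmp <- tempfile(fileext = ".html")   # hypothetical output location
write_html(page, tmp)
reloaded <- read_html(tmp)

# The re-loaded document still contains the original paragraph.
xml_text(xml_find_first(reloaded, "//p"))
```

The string in htmltxt can also be passed straight back to read_html() to rebuild the document without touching the disk.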



Answer 2:


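Or you can use saveXML from the XML package, which writes the HTML/XML object directly without any intermediate steps. This applies when the object is an XML-package document, as returned by the older rvest html() function used here.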

library(rvest)
library(XML)

pg <- html("http://dds.ec/")
saveXML(pg, "output.html")
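saveXML can also return the source as a string rather than writing a file. A minimal sketch, assuming the XML package is installed and parsing a literal string with htmlParse instead of fetching a page:

```r
library(XML)

# asText = TRUE tells htmlParse to treat the input as HTML content,
# not as a file name or URL.
doc <- htmlParse("<html><body><p>Hello</p></body></html>", asText = TRUE)

# With no file argument, saveXML returns the serialized document as a string.
htmltxt <- saveXML(doc)

# Parse the stored string again to recover a document object.
doc2 <- htmlParse(htmltxt, asText = TRUE)
xpathSApply(doc2, "//p", xmlValue)
```

Keeping the text in a variable this way makes it easy to store in a database or list column and re-parse on demand.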



Answer 3:


Replace as.character(page) with as(page, "character").



Source: https://stackoverflow.com/questions/29058279/how-to-convert-an-html-r-object-to-character
