Question
Here's my reproducible example:
library(rvest)
page <- html("http://google.com")
class(page)
page
> as.character(page)
Error in as.vector(x, "character") :
cannot coerce type 'externalptr' to vector of type 'character'
How can I convert page from an html class to a character vector so I can store it somewhere?
The html functions like html_text or html_attr don't give me the whole source. I would like to store it so I can later re-load it with html().
Thanks.
Answer 1:
To save directly to a text file:
capture.output(page, file="file.html")
To store as a string:
htmltxt <- paste(capture.output(page, file=NULL), collapse="\n")
Answer 2:
Or you can use saveXML from the XML package to write the HTML/XML object to a file directly, without any other machinations.
library(rvest)
library(XML)
pg <- html("http://dds.ec/")
saveXML(pg, "output.html")
Answer 3:
Replace as.character(page) with as(page, "character").
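The answers here appear to assume an rvest version whose html() returns an XML-package document. As a minimal sketch for a setup where the page is an xml2 document instead (an assumption, not something stated in the thread), the same store-and-reload round trip might look like this. The literal HTML string and the names html_txt, out_file and reloaded are made up for illustration:

```r
library(xml2)

# Parse a literal HTML string so the sketch needs no network access
# (hypothetical stand-in for a downloaded page).
page <- read_html("<html><body><p>Hello</p></body></html>")

# Serialize the whole document to a single character string.
html_txt <- as.character(page)

# Or write the document straight to disk (a temp file here).
out_file <- tempfile(fileext = ".html")
write_html(page, out_file)

# Later, re-parse the stored string back into a document.
reloaded <- read_html(html_txt)
xml_text(xml_find_first(reloaded, "//p"))
```

With this approach the string can be kept in a database column or a list and parsed again only when needed.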
Source: https://stackoverflow.com/questions/29058279/how-to-convert-an-html-r-object-to-character