Question
I'm trying to read the table at a URL into a data.frame. In other examples I found, the following code worked:
library(XML)
library(RCurl)
theurl <- "https://es.finance.yahoo.com/q/cp?s=BEL20.BR"
tables <- readHTMLTable(theurl)
Here, though, it fails: as the warning says, the content doesn't seem to be XML:
Warning message:
XML content does not seem to be XML: 'https://es.finance.yahoo.com/q/cp?s=BEL20.BR'
Alternatively,
getURLContent(theurl, ssl.verifypeer = FALSE, useragent = "R")
works, but I don't know how to extract the table from the result. Any help would be appreciated.
EDIT: thanks to @har07, using
table <- readHTMLTable(getURLContent(theurl, ssl.verifypeer = FALSE, useragent = "R"))$yfncsumtab
gives the output, but it still has to be filtered.
Answer 1:
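You can get the table if you first use getURL to fetch the document content and then pass that content to readHTMLTable. readHTMLTable sometimes has trouble retrieving the content itself, for example with https URLs, which the XML package's parser cannot download on its own. In those cases, try getURL: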
> library(XML)
> library(RCurl)
> URL <- getURL("https://es.finance.yahoo.com/q/cp?s=BEL20.BR")
> rt <- readHTMLTable(URL, header = TRUE)
> rt
You might need to adjust the header argument and possibly others, but the tables are there.
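As a rough sketch of what adjusting those arguments can look like, the snippet below parses an HTML string instead of the live page, so it runs offline. The sample markup and its rows are made up for illustration; only the table id yfncsumtab comes from the question's edit, and the real page's structure may well differ.

```r
library(XML)

# Hypothetical stand-in for the text returned by getURL(); the rows and
# column names are invented, only the id "yfncsumtab" appears in the thread.
html <- '<html><body>
<table id="yfncsumtab">
  <tr><th>Symbol</th><th>Name</th></tr>
  <tr><td>ABI.BR</td><td>AB InBev</td></tr>
  <tr><td>KBC.BR</td><td>KBC Group</td></tr>
</table>
</body></html>'

# Parse the string explicitly as HTML text so it is not taken for a file name or URL.
doc <- htmlParse(html, asText = TRUE)

# readHTMLTable() returns a list with one element per <table>;
# tables that carry an id attribute are named after it.
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)
names(tables)

# Pick the table of interest by name rather than by position.
df <- tables[["yfncsumtab"]]
str(df)
```

With the live page, the same calls would take the string returned by getURL in place of the html variable. Selecting by name with [[ ]] is equivalent to the $yfncsumtab form used in the question's edit.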
Source: https://stackoverflow.com/questions/25947566/get-url-table-into-a-data-frame-r-xml-rcurl