get url table into a `data.frame` R-XML-RCurl

Posted by 二次信任 on 2019-12-08 08:41:18

Question


I'm trying to read a table from a URL into a data.frame. In other examples I found, the following code worked:

library(XML)
library(RCurl)
theurl <- "https://es.finance.yahoo.com/q/cp?s=BEL20.BR"
tables <- readHTMLTable(theurl)

Here, however, it fails. As the warning says, the content does not seem to be XML:

Warning message: XML content does not seem to be XML: 'https://es.finance.yahoo.com/q/cp?s=BEL20.BR'

Alternatively, getURLContent(theurl, ssl.verifypeer = FALSE, useragent = "R") works, but I don't know how to extract the table from its result. Any help would be appreciated.

EDIT: Thanks to @har07, using table <- readHTMLTable(getURLContent(theurl, ssl.verifypeer = FALSE, useragent = "R"))$yfncsumtab returns the table, but the output still has to be filtered.


Answer 1:


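You can get the table if you first use getURL to download the document content and then pass that content to readHTMLTable. readHTMLTable sometimes has trouble fetching a page by itself (this is likely the case for the https URL here), and in those cases it is worth trying getURL: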

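library(XML)
library(RCurl)
# Despite its name, URL holds the downloaded page content, not the address
URL <- getURL("https://es.finance.yahoo.com/q/cp?s=BEL20.BR")
# Parse every <table> in the content into a list of data frames
rt <- readHTMLTable(URL, header = TRUE)
rt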

You might need to adjust the header argument and possibly others, but the tables are there.
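
Since readHTMLTable returns a list with one element per table, the remaining step is to pick the table you want out of that list. The following is only a minimal sketch of that step: it parses a small, made-up HTML string instead of the real page, and the table ids "layout" and "quotes" as well as the cell values are hypothetical stand-ins (in the question, the id of interest was yfncsumtab).

```r
library(XML)

# Hypothetical stand-in for the page content; with the real page this
# string would be the result of getURL(...) as in the answer above.
html <- '<html><body>
<table id="layout">
  <tr><th>Menu</th></tr>
  <tr><td>navigation</td></tr>
</table>
<table id="quotes">
  <tr><th>Symbol</th><th>Name</th></tr>
  <tr><td>AAA</td><td>Alpha</td></tr>
  <tr><td>BBB</td><td>Beta</td></tr>
</table>
</body></html>'

# Parse the string as HTML content (asText = TRUE avoids treating it as a file name)
doc <- htmlParse(html, asText = TRUE)

# One list element per <table>; elements are named after the table id attribute
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# Inspect the names to find the table of interest
print(names(tables))

# Select a single table by name; the result is a data.frame
quotes <- tables[["quotes"]]
str(quotes)
```

Inspecting names(tables) first is usually the quickest way to find out which element holds the data, since pages often contain several layout tables besides the one you are after.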



Source: https://stackoverflow.com/questions/25947566/get-url-table-into-a-data-frame-r-xml-rcurl
