I am using the following code:
url = "http://finance.yahoo.com/q/op?s=DIA&m=2013-07"
library(XML)
tabs = readHTMLTable(url, stringsAsFactors = F)
It's difficult to know for sure since I can't replicate your error, but according to the package's author (see http://comments.gmane.org/gmane.comp.lang.r.mac/2284), XML's methods for getting web content are pretty minimalistic. A workaround is to use RCurl to get the content and XML to parse it:
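library(XML)
library(RCurl)
url <- "http://finance.yahoo.com/q/op?s=DIA&m=2013-07"
# Download the page as a character string with RCurl
html <- getURL(url)
# Parse the downloaded HTML into a list of data frames with XML
tabs <- readHTMLTable(html, stringsAsFactors = FALSE)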
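To make the split between fetching and parsing explicit, here is a minimal sketch of the parsing half on its own. It uses a small made-up HTML string in place of the text returned by getURL(url) (the table contents are invented for illustration), and passes asText = TRUE so the string is treated as content rather than as a file name or URL:

```r
library(XML)

# Made-up HTML standing in for the downloaded page (illustrative data only).
html <- "<html><body><table>
  <tr><th>Strike</th><th>Last</th></tr>
  <tr><td>150</td><td>1.25</td></tr>
  <tr><td>151</td><td>0.90</td></tr>
</table></body></html>"

# asText = TRUE tells the parser that the argument is HTML content,
# so no network or file access is attempted at this step.
doc <- htmlParse(html, asText = TRUE)

# readHTMLTable() returns a list with one element per <table> in the document.
tabs <- readHTMLTable(doc, stringsAsFactors = FALSE)
str(tabs[[1]])
```

If this step works on a string but the original call fails on the URL, the problem is probably in the download rather than in the parsing.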
Or, if RCurl still throws an error, try the httr package:
library(httr)
# Fetch the page with httr, then convert the raw response body to text for XML
resp <- GET(url)
tabs <- readHTMLTable(rawToChar(resp$content), stringsAsFactors = FALSE)
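If you would rather not switch between the two workarounds by hand, they could be wrapped in a fallback loop. This is only a sketch: read_tables_with_fallback() and the fetchers list are hypothetical names introduced here, not part of XML, RCurl, or httr, and the approach assumes that a failed download raises an R error.

```r
library(XML)

# Hypothetical helper: try each fetcher in order and parse the first
# non-empty HTML string that comes back.
read_tables_with_fallback <- function(url, fetchers) {
  for (name in names(fetchers)) {
    # A fetcher that raises an error is skipped instead of aborting the loop.
    html <- tryCatch(fetchers[[name]](url), error = function(e) NULL)
    if (is.character(html) && length(html) == 1 && nzchar(html)) {
      message("fetched with ", name)
      doc <- htmlParse(html, asText = TRUE)
      return(readHTMLTable(doc, stringsAsFactors = FALSE))
    }
  }
  stop("all fetchers failed for ", url)
}

# The two download methods from this answer, tried in order.
# RCurl and httr are only needed when the corresponding fetcher is called.
fetchers <- list(
  rcurl = function(u) RCurl::getURL(u),
  httr  = function(u) rawToChar(httr::GET(u)$content)
)

# Usage (requires network access):
# tabs <- read_tables_with_fallback(url, fetchers)
```

The message() call reports which method succeeded, which can help narrow down where the original error comes from.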
I just got the same error as above, "failed to load external entity", when using:
url <- "http://www.cisco.com/c/en/us/products/a-to-z-series-index.html"
doc <- htmlTreeParse(url, useInternalNodes = TRUE)
I came across this post and another on the topic, but neither solved my problem, even though the same code had worked before. I then realized that I was connected to a corporate VPN. After I disconnected from the VPN and tried again, it worked. So being on a VPN might be another reason for the above error, and disconnecting from it resolves the problem.
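One way to tell a network problem like this apart from a parsing problem is to test whether the URL is reachable at all before handing it to XML. The following is a rough sketch using httr; can_reach() is a hypothetical helper name, and the 10-second default timeout is an arbitrary assumption:

```r
library(httr)

# Hypothetical helper: TRUE if the URL answers with a non-error HTTP status,
# FALSE if the request fails (no route, DNS failure, timeout, HTTP 4xx/5xx).
can_reach <- function(url, seconds = 10) {
  tryCatch({
    resp <- GET(url, timeout(seconds))
    !http_error(resp)
  }, error = function(e) {
    message("request failed: ", conditionMessage(e))
    FALSE
  })
}

# Usage (requires network access):
# can_reach("http://www.cisco.com/c/en/us/products/a-to-z-series-index.html")
```

If this returns FALSE while on the VPN and TRUE after disconnecting, the error most likely comes from the network path rather than from the parsing code.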