rcurl

Web Scraping Basketball Reference using R

半世苍凉 submitted on 2019-12-10 22:48:39
Question: I'm interested in extracting the player tables on basketball-reference.com. I have successfully extracted the per-game statistics table for a specific player (LeBron James, as an example), which is the first table listed on the web page. However, there are 10+ tables on the page that I can't seem to extract. I've been able to get the first table into R a couple of different ways. First, using the rvest package:

library(rvest)
lebron <- "https://www.basketball-reference.com/players/j/jamesle01
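A minimal sketch of one likely fix, not confirmed by the question: on basketball-reference.com, tables after the first are often wrapped in HTML comments, so rvest's default parse never sees them. The full player URL below (with the ".html" suffix) is an assumption completing the truncated one above.

library(rvest)

url <- "https://www.basketball-reference.com/players/j/jamesle01.html"  # assumed full URL
page <- read_html(url)

# Pull the comment nodes, keep those that contain a table, and re-parse them as HTML
comments <- html_nodes(page, xpath = "//comment()")
tbl_comments <- comments[grepl("<table", html_text(comments), fixed = TRUE)]
hidden <- lapply(tbl_comments, function(x) html_table(read_html(html_text(x)), fill = TRUE))

# Combine the visible table(s) with the ones recovered from comments
tables <- c(html_table(page, fill = TRUE), unlist(hidden, recursive = FALSE))
length(tables)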

RCurl with HTTP data POST

醉酒当歌 submitted on 2019-12-10 22:24:09
Question: I would like to translate the following curl call to RCurl:

curl 'http://myserver.org/stream' -H 'Authorization: Basic XXXXXXXX' -H 'Connection: keep-alive' --data-binary '{"limit": 20}' -H 'Content-Type: application/json;charset=UTF-8'

This is one of my R tests:

library(RCurl)
url.opts <- list(httpheader = list(Authorization = "Basic XXXXXXXX", Connection = "keep-alive", "Content-Type" = "application/json;charset=UTF-8"))
getURLContent('http://myserver.org/stream', .opts = url.opts)

Now I am missing
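A minimal sketch of one way to carry the --data-binary payload, assuming the endpoint and credentials from the question; setting libcurl's postfields option makes the request a POST:

library(RCurl)

headers <- c(Authorization = "Basic XXXXXXXX",
             Connection = "keep-alive",
             "Content-Type" = "application/json;charset=UTF-8")

# postfields mirrors curl's --data-binary and implies a POST request
response <- getURLContent("http://myserver.org/stream",
                          httpheader = headers,
                          postfields = '{"limit": 20}')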

R - form web scraping with rvest

淺唱寂寞╮ submitted on 2019-12-10 21:54:23
Question: First I'd like to take a moment to thank the SO community; you have helped me many times in the past without me even needing to create an account. My current problem involves web scraping with R, which is not my strong point. I would like to scrape http://www.cbs.dtu.dk/services/SignalP/. What I have tried:

library(rvest)
url <- "http://www.cbs.dtu.dk/services/SignalP/"
seq <- "MTSKTCLVFFFSSLILTNFALAQDRAPHGLAYETPVAFSPSAFDFFHTQPENPDPTFNPCSESGCSPLPVAAKVQGASAKAQESDIVSISTGTRSGIEEHGVVGIIFGLAFAVMM"
session <-
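A minimal sketch of rvest's form workflow, using the API current at the time of the post (html_session/set_values/submit_form; newer rvest releases renamed these to session/html_form_set/session_submit). The field name "SEQPASTE" is a guess; print the form object to see the real input names.

library(rvest)

url <- "http://www.cbs.dtu.dk/services/SignalP/"
seq <- "MTSKTCLVFFFSSLILTNFALAQDRAPHGLAYETPVAFSPSAFDFFHTQPENPDPTFNPCSESGCSPLPVAAKVQGASAKAQESDIVSISTGTRSGIEEHGVVGIIFGLAFAVMM"

session <- html_session(url)
form <- html_form(session)[[1]]              # inspect this to find the actual field names

filled <- set_values(form, SEQPASTE = seq)   # "SEQPASTE" is hypothetical
result <- submit_form(session, filled)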

Compiling RCurl from source on Windows

允我心安 submitted on 2019-12-10 20:26:45
Question: As there is no Windows binary for v1.95-4.3 yet, I need to install/compile the RCurl package from source on Windows 8.1 (64-bit). Could someone tell me:

- Which version of cURL do I need?
- Where exactly do the various files such as libcurl.dll need to be placed?
- Does the x64 directory mentioned in the error message below correspond to the x64 directory of the R installation?

> install.packages("RCurl", type="source")
trying URL 'http://cran.rstudio.com/src/contrib/RCurl_1.95-4.3.tar.gz'
Content
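A sketch of one commonly reported setup, with the assumptions labeled loudly: Rtools must be on the PATH, the curl development files are unpacked to C:/libcurl (a hypothetical path with include/ and lib/ subdirectories), and LIB_CURL is assumed to be the variable RCurl's src/Makevars.win reads; check that file inside the tarball for the name it actually uses.

# The path and the variable name below are assumptions -- verify against
# the Makevars.win shipped inside the RCurl source tarball.
Sys.setenv(LIB_CURL = "C:/libcurl")

# With Rtools on PATH, build the package from source
install.packages("RCurl", type = "source")

# At run time libcurl.dll must be findable, e.g. on PATH or next to R's
# bin/x64 directory (the usual meaning of "x64" in such error messages).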

Assigning a value in exception handling in R

て烟熏妆下的殇ゞ submitted on 2019-12-10 15:48:56
Question:

while (bo != 10) {
  x = tryCatch(getURLContent(Site, verbose = F, curl = handle),
    error = function(e) {
      cat("ERROR1: ", e$message, "\n")
      Sys.sleep(1)
      print("reconnecting...")
      bo <- bo + 1
      print(bo)
    })
  print(bo)
  if (bo == 0) bo = 10
}

I wanted to try reconnecting each second after the connection failed, but the new assignment of the bo value is not effective. How can I do that? Or, if you know how to reconnect using RCurl options (I really didn't find a thing), that would be amazing. Every help is
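A minimal sketch of the usual fix, assuming Site and handle are defined as in the question: `bo <- bo + 1` inside the error handler only updates a local copy, so either assign with `<<-` or, as below, let tryCatch return a sentinel and update the counter in the loop's own scope.

library(RCurl)

bo <- 0
repeat {
  result <- tryCatch(
    getURLContent(Site, verbose = FALSE, curl = handle),
    error = function(e) {
      cat("ERROR1:", e$message, "\n")
      Sys.sleep(1)
      message("reconnecting...")
      NULL   # signal failure instead of mutating bo inside the handler
    }
  )
  if (!is.null(result)) break   # success
  bo <- bo + 1                  # updated in the scope where bo actually lives
  if (bo >= 10) stop("giving up after 10 attempts")
}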

Unable to install R package from GitHub

江枫思渺然 submitted on 2019-12-10 15:37:15
Question: I'm trying to install the flowIncubator package from GitHub (link to the package: https://github.com/RGLab/flowIncubator). I'm using R version 3.3.1 (2016-06-21). I've tried this code:

devtools::install_github("RGLab/flowIncubator")

and get this error:

Error in curl::curl_fetch_disk(url, x$path, handle = handle) : Timeout was reached
> traceback()
12: .Call(R_curl_fetch_disk, url, handle, path, "wb", nonblocking)
11: curl::curl_fetch_disk(url, x$path, handle = handle)
10: request_fetch.write_disk
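A minimal sketch of one common workaround, assuming the cause is a slow connection rather than a proxy or firewall: raise the download timeout both for base R and for httr, which devtools uses for its requests.

options(timeout = 600)                  # base R download timeout, in seconds
httr::set_config(httr::timeout(600))    # longer timeout for httr-based requests

devtools::install_github("RGLab/flowIncubator")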

httr and Accept-Encoding: gzip, deflate

天涯浪子 submitted on 2019-12-10 15:19:34
Question: I want to make the following call in R:

curl "http://www.openml.org/api/v1/task/list/limit/3000?api_key=c1994bdb7ecb3c6f3c8f3b35f4b47f1f" -H "Accept-Encoding: gzip, deflate"

The line above returns a gzip-compressed string. But when I use the httr R package, "Accept-Encoding: gzip, deflate" seems to be ignored:

library(httr)
content = GET(url = "http://www.openml.org/api/v1/task/list/limit/3000?api_key=c1994bdb7ecb3c6f3c8f3b35f4b47f1f", add_headers(`Accept-Encoding` = "gzip, deflate"))

Answer 1:
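A minimal sketch of a likely explanation, stated as an assumption: libcurl, and therefore httr, negotiates compression itself and transparently decompresses the body, so the header can be honored even though R never sees gzip bytes. verbose() shows what actually went over the wire:

library(httr)

url <- "http://www.openml.org/api/v1/task/list/limit/3000?api_key=c1994bdb7ecb3c6f3c8f3b35f4b47f1f"

# verbose() prints the outgoing headers and the server's response headers
res <- GET(url, add_headers(`Accept-Encoding` = "gzip, deflate"), verbose())

headers(res)$`content-encoding`                        # "gzip" means compression happened
txt <- content(res, as = "text", encoding = "UTF-8")   # already decompressed by httr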

Reading raw data in R to be saved as an .RData file using the Dropbox API

巧了我就是萌 submitted on 2019-12-10 14:43:07
Question: Having worked out the OAuth signature approval system for Dropbox, I wanted to download an .RData file that I had saved there, using the API and httr's GET function. The request was successful and comes back with data, but it is in a raw format, and I was wondering how to convert it back into an .RData file on my local drive. This is what I've done so far:

require(httr)
db.file.name <- "test.RData"
db.app <- oauth_app("db", key = "xxxxx", secret = "xxxxxxx")
db.sig <- sign_oauth1
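A minimal sketch of the usual conversion, assuming req holds the successful httr response described in the question: write the raw vector to disk with writeBin, then load it back.

library(httr)

# req is assumed to be the successful GET response from the question
raw_bytes <- content(req, as = "raw")   # response body as a raw vector
writeBin(raw_bytes, "test.RData")       # byte-for-byte copy to disk

load("test.RData")                      # restores the objects saved in the file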

httr GET function running out of space when downloading a large file

放肆的年华 submitted on 2019-12-10 13:07:48
Question: I'm trying to download a 1.1 GB file with httr, but I'm hitting the following error:

x <- GET( extract.path )
Error in curlPerform(curl = handle$handle, .opts = curl_opts$values) :
cannot allocate more space: 1728053248 bytes

My C drive has 400 GB free. In the RCurl package I see the maxfilesize and maxfilesize.large options when using getCurlOptionsConstants(), but I don't understand if/how these might be passed to httr through config or set_config, or if I need to switch over
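A minimal sketch of the usual fix, on the assumption that the error is about RAM rather than disk: GET buffers the whole body in memory by default, while httr's write_disk() streams the response straight to a file.

library(httr)

# extract.path is the URL from the question; the local filename is illustrative
res <- GET(extract.path,
           write_disk("big-download.bin", overwrite = TRUE),
           progress())   # streams to disk instead of holding 1.1 GB in memory

file.info("big-download.bin")$size   # confirm the size on disk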

Login to MediaWiki using RCurl

喜夏-厌秋 submitted on 2019-12-09 13:54:09
Question: How can I log in to a MediaWiki with RCurl (or curl, and I can adapt it to the R package)? I just want to parse a page, but I need to log in, otherwise I can't access it.

Answer 1: The MediaWiki API has a login function which returns cookies and a token. You have to save both and send them back to the API in order to authenticate the session and log in. Here's a way to do it with curl and XMLStarlet in bash: send a request for a login token, saving the cookies in cookies.txt and the output in output.xml.
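A minimal sketch of the same two-step flow in RCurl instead of bash, with the assumptions labeled: the wiki URL and credentials are placeholders, and the responses are requested as JSON (parsed with jsonlite) rather than the XML used in the answer.

library(RCurl)
library(jsonlite)

api <- "https://wiki.example.org/api.php"                    # placeholder wiki URL
h <- getCurlHandle(cookiefile = "", followlocation = TRUE)   # in-memory cookie jar

# Step 1: request a login token; the session cookies stay on the shared handle
tok_raw <- postForm(api, action = "query", meta = "tokens", type = "login",
                    format = "json", curl = h)
token <- fromJSON(tok_raw)$query$tokens$logintoken

# Step 2: post the credentials plus the token back on the same handle
login_raw <- postForm(api, action = "login", lgname = "MyBot",
                      lgpassword = "secret", lgtoken = token,
                      format = "json", curl = h)
fromJSON(login_raw)$login$result   # "Success" means the session is authenticated

# Later requests on the same handle reuse the session cookies
page <- getURL(paste0(api, "?action=parse&page=Main_Page&format=json"), curl = h)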