rcurl

RCurl::url.exists(): how to get a non-error result for redirects (HTTP status codes in the 300 range)

天大地大妈咪最大 submitted on 2019-12-13 03:10:18
Question: I have a bunch of URLs extracted by text-mining some PDF documents. Now I want to test the URLs for validity. Some URLs have junk characters inside or appended, or are truncated. One approach is to filter them by requesting each of them. To do that, I use the url.exists() function from the RCurl package, which makes an HTTP HEAD request to each URL using curl and checks the status code. From the documentation of ?url.exists: "If ‘.header’ is ‘FALSE’, this returns ‘TRUE’ or ‘FALSE’" …
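One way to approach this (a sketch, not taken from the original thread): skip url.exists() and issue the HEAD request yourself with curlPerform(), following redirects and accepting any final 2xx/3xx status. The helper name url_ok() is made up for illustration.

    library(RCurl)

    # Hypothetical helper: HEAD-style request that follows 3xx redirects
    # and treats any final 2xx/3xx status as "the URL exists".
    url_ok <- function(u) {
      h <- basicHeaderGatherer()                 # collects the response headers
      reached <- tryCatch({
        curlPerform(url = u,
                    nobody = TRUE,               # HEAD request, no body
                    followlocation = TRUE,       # follow redirects
                    headerfunction = h$update)
        TRUE
      }, error = function(e) FALSE)
      if (!reached) return(FALSE)
      status <- as.integer(h$value()["status"])
      status >= 200 && status < 400
    }

    url_ok("http://www.omegahat.org")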

RCurl on OS X El Capitan: -9806 error

时光毁灭记忆、已成空白 submitted on 2019-12-13 02:47:51
Question: I am trying to use RCurl for OAuth 2 authentication. My code is:

    library(RCurl)
    myOpts <- curlOptions(httpheader = c(Accept = "application/json",
                                         "Content-Type" = "application/x-www-form-urlencoded"))
    token <- postForm(authURL,
                      .params = list(client_id     = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
                                     client_secret = "bbbbbbbbbbbbbbbbbbbbbbbbbbbb",
                                     username      = "xxxxx@yyyy.zzz",
                                     password      = pswd),
                      .opts = myOpts, style = "POST")

The code works just fine on R/Windows, but not on OS X El Capitan. On OS X, I get the …
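The -9806 code usually comes from the SecureTransport SSL layer that the system curl on OS X is built against, so one commonly suggested workaround (an assumption, not a confirmed fix for this post) is to hand libcurl an explicit CA bundle and force TLS; the OAuth parameters stay as in the question:

    library(RCurl)

    # Assumed workaround: point libcurl at the CA bundle shipped with RCurl
    # and request TLS explicitly.
    myOpts <- curlOptions(
      httpheader = c(Accept = "application/json",
                     "Content-Type" = "application/x-www-form-urlencoded"),
      cainfo     = system.file("CurlSSL", "cacert.pem", package = "RCurl"),
      sslversion = 1L   # CURL_SSLVERSION_TLSv1
    )

If that does not help, rebuilding curl against OpenSSL (e.g. a Homebrew build) instead of SecureTransport is the other avenue usually mentioned.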

NTLM proxy authentication problem with RCurl

こ雲淡風輕ζ submitted on 2019-12-13 02:19:16
Question: I'm behind an NTLM proxy server and I can't set the RCurl options correctly for it to work. Apparently curl works fine with the correct settings, which are: --proxy-ntlm --proxy_user <...> --proxy <...>, but I don't know how to pass all these options correctly from R. I've got as far as:

    curl = getCurlHandle()
    curlSetOpt(.opts = list(proxy = "...:...", proxyuserpwd = "...:...", proxyauth = "ntlm"), curl = curl)
    getURL("http://www.omegahat.org", curl = curl)

but this still doesn't seem to do the trick …
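A hedged sketch of how the command-line flags might map onto RCurl options; the proxy host, port and credentials below are placeholders, and AUTH_NTLM is the constant RCurl exposes for libcurl's CURLAUTH_NTLM:

    library(RCurl)

    curl <- getCurlHandle()
    curlSetOpt(.opts = list(proxy          = "proxyhost:8080",         # placeholder
                            proxyuserpwd   = "DOMAIN\\user:password",  # placeholder
                            proxyauth      = AUTH_NTLM,                # mirrors --proxy-ntlm
                            followlocation = TRUE),
               curl = curl)
    getURL("http://www.omegahat.org", curl = curl)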

Download an .xls file from a URL into a data frame (RCurl)?

坚强是说给别人听的谎言 submitted on 2019-12-12 15:24:33
Question: I'm trying to download the following URL into an R data frame: http://www.fantasypros.com/nfl/rankings/qb.php/?export=xls (it's the 'Export' link on the public page http://www.fantasypros.com/nfl/rankings/qb.php/). However, I'm not sure how to parse the data. I'm also looking to automate this and perform it weekly, so any thoughts on how to build this into a weekly-access workflow would be greatly appreciated! I have been Google-searching and scouring Stack Overflow for a couple of hours now to no …
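A possible workflow (a sketch under the assumption that the export really is a spreadsheet; if it turns out to be tab-delimited text with an .xls name, read.delim() on the saved file is the fallback):

    library(RCurl)

    url <- "http://www.fantasypros.com/nfl/rankings/qb.php/?export=xls"
    raw <- getBinaryURL(url, followlocation = TRUE)   # fetch the export as raw bytes
    tmp <- tempfile(fileext = ".xls")
    writeBin(raw, tmp)                                # save it to disk
    qb  <- readxl::read_excel(tmp)                    # assumes readxl is installed

For the weekly part, wrapping this in a script run by cron or the Windows Task Scheduler (or the taskscheduleR package) is the usual route.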

How to `postForm` with a header

ⅰ亾dé卋堺 submitted on 2019-12-12 13:34:04
Question: How do I construct this POST HTTP request using RCurl?

    POST http://localhost:7474/db/data/index/node/
    Accept: application/json; charset=UTF-8
    Content-Type: application/json

    { "name" : "node_auto_index", "config" : { "type" : "fulltext", "provider" : "lucene" } }

I've come up with this in R:

    require(RCurl)
    httpheader = c(Accept = "application/json; charset=UTF-8", "Content-Type" = "application/json")
    x = postForm("http://localhost:7474/db/data/index/node/",
                 .opts = list(httpheader = httpheader),
                 name = …
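Because the body is raw JSON rather than form fields, one sketch (not a confirmed answer) is to bypass postForm() and send the body through the postfields option with curlPerform(); RJSONIO is assumed to be available for building the JSON:

    library(RCurl)
    library(RJSONIO)

    body <- toJSON(list(name = "node_auto_index",
                        config = list(type = "fulltext", provider = "lucene")))

    h <- basicTextGatherer()                       # captures the response body
    curlPerform(url = "http://localhost:7474/db/data/index/node/",
                httpheader = c(Accept = "application/json; charset=UTF-8",
                               "Content-Type" = "application/json"),
                postfields = body,                 # send the JSON as the request body
                writefunction = h$update)
    h$value()                                      # the server's JSON reply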

Login to WordPress using RCurl

吃可爱长大的小学妹 submitted on 2019-12-12 12:31:42
Question: I would like to log in to a WordPress website using R's RCurl package in order to install a WordPress plugin (probably using postForm on some options pages in WordPress). Since the website is password protected, I ask for your help with how to authenticate my R session. I found the following three links relevant, but do not know how to use them for WordPress: "login to mediawiki using RCurl", "Login Wordpress using HttpWebRequest", and http://r.789695.n4.nabble.com/RCurl-HTTP-Post-td3311942.html. Any …
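A hedged sketch of a cookie-based login, assuming a stock wp-login.php form; the site URL and credentials are placeholders, and the field names log, pwd and wp-submit are WordPress defaults that a customised theme may rename:

    library(RCurl)

    curl <- getCurlHandle(cookiefile = "",          # turn on the in-memory cookie engine
                          followlocation = TRUE)

    postForm("https://example.com/wp-login.php",    # placeholder site
             log = "myuser", pwd = "mypassword",    # placeholder credentials
             "wp-submit" = "Log In",
             style = "POST", curl = curl)

    # The same handle now carries the login cookie, so later requests
    # (e.g. to an options page before installing the plugin) are authenticated:
    admin_page <- getURL("https://example.com/wp-admin/", curl = curl)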

Using RCurl with SFTP

假装没事ソ submitted on 2019-12-12 10:59:11
Question: I'm attempting to use ftpUpload() from the RCurl package for the first time. The site I'm trying to access uses the SFTP protocol. I've made sure to install a version of libcurl that includes the ability to make secure connections. SFTP is listed among the protocols available to RCurl:

    curlVersion()$protocols
     [1] "dict"   "file"   "ftp"    "ftps"   "gopher"
     [6] "http"   "https"  "imap"   "imaps"  "ldap"
    [11] "pop3"   "pop3s"  "rtmp"   "rtsp"   "scp"
    [16] "sftp"   "smtp"   "smtps"  "telnet" "tftp"

Yet, when I run the …
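For reference, a minimal sketch of an SFTP upload (the host, path and credentials are placeholders); ftpUpload() takes the sftp:// URL directly when libcurl has SFTP support, as the curlVersion() output above indicates:

    library(RCurl)

    ftpUpload(what    = "local_file.csv",                                 # local file to send
              to      = "sftp://sftp.example.com/upload/local_file.csv",  # placeholder server
              userpwd = "user:password")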

Downloading large files with R/RCurl efficiently

白昼怎懂夜的黑 submitted on 2019-12-12 07:09:46
Question: I see that many examples for downloading binary files with RCurl look like this:

    library("RCurl")
    curl = getCurlHandle()
    bfile = getBinaryURL("http://www.example.com/bfile.zip",
                         curl = curl,
                         progressfunction = function(down, up) { print(down) },
                         noprogress = FALSE)
    writeBin(bfile, "bfile.zip")
    rm(curl, bfile)

If the download is very large, I suppose it would be better to write it to the storage medium as it arrives, instead of fetching it all into memory. In the RCurl documentation there are some examples …
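One way to stream straight to disk instead of holding the whole file in memory (a sketch built on RCurl's CFILE interface, reusing the example URL above):

    library(RCurl)

    url <- "http://www.example.com/bfile.zip"
    f   <- CFILE("bfile.zip", mode = "wb")     # C-level file handle for libcurl to write into
    curlPerform(url = url,
                writedata = f@ref,             # stream the body directly into the file
                noprogress = FALSE,
                progressfunction = function(down, up) print(down))
    close(f)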

R: downloading multiple files from FTP using RCurl

拈花ヽ惹草 submitted on 2019-12-12 05:05:25
Question: I'm a new R user. I am trying to download 7,000 files (.nc format) from an FTP server (for which I have a username and password). On the website, each file is a link to download, and I would like to download all of the .nc files. I would be grateful to anyone who can show me how to run these jobs in R. Here is an example of what I have tried using RCurl and a loop; it tells me it cannot download all the files:

    library(RCurl)
    url <- "ftp://ftp.my.link.fr/1234/"
    userpwd <- "user:password"
    destination <- "/Users/ME …
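A hedged sketch of one way to do the whole job: ask the server for a directory listing first, keep only the .nc names, then fetch and write each file. The URL and credentials are the placeholders from the question; the destination path is made up because the original is truncated.

    library(RCurl)

    url         <- "ftp://ftp.my.link.fr/1234/"
    userpwd     <- "user:password"
    destination <- "/Users/ME/ncfiles/"        # placeholder, the original path is truncated

    # Directory listing: one file name per line.
    listing <- getURL(url, userpwd = userpwd, dirlistonly = TRUE)
    files   <- strsplit(listing, "\r?\n")[[1]]
    files   <- files[grepl("\\.nc$", files)]

    # Download each file and write it to disk.
    for (f in files) {
      bin <- getBinaryURL(paste0(url, f), userpwd = userpwd)
      writeBin(bin, file.path(destination, f))
    }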

Extracting an HTML table with rowspan values

冷暖自知 submitted on 2019-12-12 04:15:36
Question: The data frame I create with the following code (using the RCurl and XML packages) puts the three-letter team abbreviation into only the first row that it spans. Is there another package, or additional code I can add, to keep the data in the proper column?

    library(XML)
    library(RCurl)
    url <- "https://en.wikipedia.org/wiki/List_of_Major_League_Baseball_postseason_teams"
    url_source <- readLines(url, encoding = "UTF-8")
    playoffs <- data.frame(readHTMLTable(url_source, stringsAsFactors = F, header = …
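One hedged alternative (swapping in the rvest package rather than patching readHTMLTable itself): recent versions of rvest::html_table() expand rowspan/colspan cells, so the team abbreviation is repeated on every row it covers.

    library(rvest)   # assumed installed; used here instead of XML::readHTMLTable

    url      <- "https://en.wikipedia.org/wiki/List_of_Major_League_Baseball_postseason_teams"
    page     <- read_html(url)
    tables   <- html_table(page, fill = TRUE)
    playoffs <- tables[[1]]    # assumes the table of interest is the first on the page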