rcurl

RCurl, basic authentication with API key

醉酒当歌 submitted on 2019-12-03 22:36:22
Question: I used to use RCurl to grab data that requires a login. Now I have to grab data using an API key (as well as a user ID and password), and it needs basic authentication (Radian6 API: http://socialcloud.radian6.com/docs/read/Getting_Started). If it didn't need the API key, the code would be something like:
    getURL("https:// address", userpwd="id:pswd", httpauth = 1L)
but I have no idea how to plug in the API key for authentication. So far I was able to find examples written in Python or Java but
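One way to combine basic authentication with an extra key is to send the key as a request header next to the userpwd option. The sketch below only builds the option list; the header name "X-Api-Key", the helper name build_auth_opts, and the example endpoint are assumptions for illustration and are not taken from the Radian6 documentation.

```r
# Sketch only: "X-Api-Key" is a placeholder header name, not the Radian6 API's
# documented scheme. Check the API docs for the real header or parameter name.
build_auth_opts <- function(user, password, api_key, key_header = "X-Api-Key") {
  list(
    userpwd    = paste(user, password, sep = ":"),  # basic-auth credentials
    httpauth   = 1L,                                # 1L selects HTTP basic auth, as in the question
    httpheader = setNames(api_key, key_header)      # extra request header carrying the key
  )
}

opts <- build_auth_opts("id", "pswd", "my-api-key")

# The request itself needs RCurl and a real endpoint, so it is left commented out:
# RCurl::getURL("https://api.example.com/data", .opts = opts)
```

If the API expects the key as a query parameter instead, the same credentials can stay in userpwd and the key can be appended to the URL.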

Scraping data off of NBA.com

别来无恙 submitted on 2019-12-03 21:51:12
I'm trying to scrape roster data from http://stats.nba.com/team/#!/1610612742/ . So far, I've tried the RCurl and XML packages, and the code I've tried is as follows:
    library(RCurl)
    library(XML)
    webpage <- getURL("http://stats.nba.com/team/#!/1610612742/")
    webpage <- readLines(tc <- textConnection(webpage));
    pagetree <- htmlTreeParse(webpage, useInternalNodes = TRUE)
    x <- unlist(xpathApply(pagetree,"//*nba-stat-table_overflow/player",xmlValue))
    Content <- gsub(pattern = "([\t\n])", replacement = " ", x = x, ignore.case = TRUE)
I believe that my xpathApply call is formatted wrong. What

login to mediawiki using RCurl

房东的猫 submitted on 2019-12-03 21:12:11
How can I log in to a MediaWiki site with RCurl (or curl, which I can then adapt to the R package)? I just want to parse a page, but I need to log in, otherwise I can't access it. The MediaWiki API has a login function which returns cookies and a token. You have to save both and send them back to the API in order to authenticate the session and log in. Here's a way to do it with curl and XMLStarlet in bash: Send a request for a login token, saving the cookies in cookies.txt and the output in output.xml.
    curl -c cookies.txt -d "lgname=YOURNAME&lgpassword=YOURPASS&action=login&format=xml" http://your

How to stop execution of RCurl::getURL() if it is taking too long?

喜你入骨 submitted on 2019-12-03 16:21:31
Is there a way to tell R or the RCurl package to give up on trying to download a webpage if it exceeds a specified period of time and move on to the next line of code? For example:
    > library(RCurl)
    > u = "http://photos.prnewswire.com/prnh/20110713/NY34814-b"
    > getURL(u, followLocation = TRUE)
    > print("next line") # programme does not get this far
This will just hang on my system and not proceed to the final line. EDIT: Based on @Richie Cotton's answer below, while I can 'sort of' achieve what I want, I don't understand why it takes longer than expected. For example, if I do the following, the
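A common approach is to pass libcurl's timeout options to getURL() and catch the resulting error so the script can carry on. The sketch below assumes that approach; the helper name fetch_or_na and its fetch argument (there only so that the function can be tried without a network connection) are made up for illustration.

```r
# Sketch: fetch a URL, but return NA instead of hanging or stopping the script.
# 'timeout' and 'connecttimeout' are libcurl options, given in seconds.
fetch_or_na <- function(url, seconds = 5, fetch = RCurl::getURL) {
  tryCatch(
    fetch(url, followlocation = TRUE,
          timeout = seconds, connecttimeout = seconds),
    error = function(e) {
      message("giving up on ", url, ": ", conditionMessage(e))
      NA_character_
    }
  )
}

# Real use (needs RCurl and network access), left commented out:
# page <- fetch_or_na("http://photos.prnewswire.com/prnh/20110713/NY34814-b")
# print("next line")
```

The timeout covers the whole transfer while connecttimeout covers only the connection phase, so setting both bounds the time spent on a single URL.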

R - posting a login form using RCurl

佐手、 submitted on 2019-12-03 15:18:04
Question: I am new to using R to post forms and then download data off the web. My mistake is probably very easy for someone out there to spot, so I appreciate your patience. I have a Win7 PC, and Firefox 23.x is my typical browser. I am trying to post the main form that shows up on http://www.aplia.com/ I have the following R script:
    your.username <- 'username'
    your.password <- 'password'
    setwd( "C:/Users/Desktop/Aplia/data" )
    require(SAScii)
    require(RCurl)
    require(XML

Asynchronous POST Requests - R, using RCurl?

拜拜、爱过 submitted on 2019-12-03 15:01:24
Question: I am trying to make async requests to a REST API from R. The curl command below illustrates the parameters that I need to pass to the API. I'm giving you guys the Linux curl command as I'm hoping that will make it clear:
    curl -v -X POST https://app.example.com/api/ \
      -H 'Authorization: somepwd' \
      -H "Content-Type: application/json" \
      -d {key1: value1, key2: value2}
Right now, I'm accomplishing the same thing in R by executing the following:
    library(httr)
    library(jsonlite)
    content(POST(

How to download a file behind a semi-broken javascript asp function with R

℡╲_俬逩灬. submitted on 2019-12-03 11:39:51
Question: I am trying to fix a download automation script that I provide publicly so that anyone can easily download the World Values Survey with R. On this web page - http://www.worldvaluessurvey.org/WVSDocumentationWV4.jsp - the PDF link "WVS_2000_Questionnaire_Root" easily downloads in Firefox and Chrome. I cannot figure out how to automate the download with httr or RCurl or any other R package. Screenshot below of the Chrome internet behavior. That PDF link needs to follow through to the ultimate

Post request using cookies with cURL, RCurl and httr

不想你离开。 submitted on 2019-12-03 08:44:16
In Windows cURL I can post a web request similar to this:
    curl --dump-header cook.txt ^
      --data "RURL=http=//www.example.com/r&user=bob&password=hello" ^
      --user-agent "Mozilla/5.0" ^
      http://www.example.com/login
With type cook.txt I get a response similar to this:
    HTTP/1.1 302 Found
    Date: Thu, ******
    Server: Microsoft-IIS/6.0
    SERVER: ******
    X-Powered-By: ASP.NET
    X-AspNet-Version: 1.1.4322
    Location: ******
    Set-Cookie: Cookie1=; domain=******; expires=****** ****** ****** ******
    Cache-Control: private
    Content-Type: text/html; charset=iso-8859-1
    Content-Length: 189
I can manually read cookie lines
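Reading the cookie lines out of a header dump can be automated with base R string functions. The sketch below is one possible way to do it, not the asker's method; the helper name extract_cookies and the sample header values are invented for illustration.

```r
# Sketch: pull "name=value" pairs out of Set-Cookie lines in a header dump.
# In practice the lines could come from readLines("cook.txt").
extract_cookies <- function(header_lines) {
  set_cookie <- grep("^Set-Cookie:", header_lines,
                     ignore.case = TRUE, value = TRUE)
  # drop the header name, then everything after the first ";" (domain, expires, ...)
  pairs <- sub("^Set-Cookie:[[:space:]]*", "", set_cookie, ignore.case = TRUE)
  pairs <- sub(";.*$", "", pairs)
  trimws(pairs)
}

# Made-up sample headers, shaped like the response in the question
headers <- c(
  "HTTP/1.1 302 Found",
  "Set-Cookie: Cookie1=abc; domain=example.com; path=/",
  "Set-Cookie: Cookie2=xyz; path=/",
  "Content-Length: 189"
)

# A single string suitable for a Cookie request header
cookie_header <- paste(extract_cookies(headers), collapse = "; ")
```

With RCurl or httr it is usually simpler to let the library keep cookies on a shared handle, but parsing the dump by hand mirrors the manual step described above.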

R Change IP Address programmatically

纵饮孤独 submitted on 2019-12-03 08:02:04
Question: I am currently changing the user agent by passing different strings to the html_session() method. Is there also a way to change your IP address on a timer when scraping a website?
Answer 1: You can use a proxy (which changes your IP) via use_proxy as follows:
    html_session("you-url", use_proxy("proxy-ip", port))
For more details see ?httr::use_proxy. To check whether it is working, you can do the following:
    require(httr)
    content(GET("https://ifconfig.co/json"), "parsed")
    content(GET("https://ifconfig.co/json",
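Building on the use_proxy idea, the IP can be varied by cycling through a pool of proxies, one per request. The sketch below assumes such a pool exists; the proxy addresses are placeholders from a documentation-only range, and pick_proxy and fetch_via_proxy are hypothetical helper names. It rotates per request rather than on a clock.

```r
# Sketch: rotate through a pool of proxies, one per request.
# The hosts below are placeholders (documentation address range), not real proxies.
proxies <- list(
  list(host = "203.0.113.10", port = 8080),
  list(host = "203.0.113.11", port = 8080)
)

# Round-robin choice: request 1 uses proxy 1, request 2 uses proxy 2, then wrap around
pick_proxy <- function(i, proxies) {
  proxies[[(i - 1) %% length(proxies) + 1]]
}

# Needs httr and working proxies, so it is only defined here, not called
fetch_via_proxy <- function(url, proxy) {
  httr::GET(url, httr::use_proxy(proxy$host, proxy$port))
}

# Example loop, left commented out:
# for (i in seq_along(urls)) {
#   resp <- fetch_via_proxy(urls[i], pick_proxy(i, proxies))
#   Sys.sleep(1)  # pause between requests
# }
```

To switch on a timer instead, the index passed to pick_proxy could be derived from elapsed time rather than from the request counter.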

Using an API to calculate distance between two airports (two columns) within R?

拈花ヽ惹草 submitted on 2019-12-03 07:52:15
I was wondering whether there is a way to compare distances between airports (IATA codes). There are some scripts, but none of them use R. So I tried it with the developer.aero API. Example data:
    library(curl) # for curl post
    departure <- c("DRS","TXL","STR","DUS","LEJ","FKB","LNZ")
    arrival <- c("FKB","HER","BOJ","FUE","PMI","AYT","FUE")
    flyID <- c(1,2,3,4,5,6,7)
    df <- data.frame(departure,arrival,flyID)
      departure arrival flyID
    1       DRS     FKB     1
    2       TXL     HER     2
    3       STR     BOJ     3
    4       DUS     FUE     4
    5       LEJ     PMI     5
    6       FKB     AYT     6
    7       LNZ     FUE     7
    api<- curl_fetch_memory("https://airport.api.aero/airport/distance/DRS/FUE?user_key
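To query every row rather than a single hard-coded pair, the request URL can be built from the two columns. The sketch below assumes the URL pattern shown in the excerpt; the query string is cut off there, so the form user_key=<value>, the placeholder key "YOUR_KEY", and the helper names distance_url and fetch_distance are assumptions.

```r
# Sketch: build one distance URL per departure/arrival pair.
departure <- c("DRS", "TXL", "STR", "DUS", "LEJ", "FKB", "LNZ")
arrival   <- c("FKB", "HER", "BOJ", "FUE", "PMI", "AYT", "FUE")
flyID     <- c(1, 2, 3, 4, 5, 6, 7)
df <- data.frame(departure, arrival, flyID, stringsAsFactors = FALSE)

# ASSUMPTION: the key is passed as "?user_key=<value>"; the excerpt is truncated there
distance_url <- function(from, to, user_key) {
  sprintf("https://airport.api.aero/airport/distance/%s/%s?user_key=%s",
          as.character(from), as.character(to), user_key)
}

# sprintf is vectorised, so this yields one URL per row
df$url <- distance_url(df$departure, df$arrival, "YOUR_KEY")

# Needs the curl package, network access and a valid key, so it is only defined here
fetch_distance <- function(url) {
  rawToChar(curl::curl_fetch_memory(url)$content)
}
# responses <- vapply(df$url, fetch_distance, character(1))
```

The response format is not shown in the excerpt, so the raw body is returned as text and parsing is left to the caller.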