Question
I'm trying to download Twitter data using the twitteR package.
I keep getting the error message
"Error in function (type, msg, asError = TRUE) : couldn't connect to host"
I believe this is because I'm doing this on my work computer and I need to pass the details of the proxy server.
To test this, I tried an example given in one of the answers to a similar question about Proxy Setting for R.
If I enter:
library("RCurl")
getURL("http://stackoverflow.com")
Then I get the same error message as when I try to use twitteR:
"Error in function (type, msg, asError = TRUE) : couldn't connect to host"
However, if I pass the details of my proxy server, it works without a problem:
library("RCurl")
opts <- list(
  proxy = "123.456.7.89",
  proxyusername = "tumbledown",
  proxypassword = "mypassword",
  proxyport = 8080
)
getURL("http://stackoverflow.com", .opts = opts)
What I can't work out is how to pass the details of my proxy server to twitteR. I've tried setting the proxy in R's Rprofile.site file using:
http_proxy="http://tumbledown:mypassword@123.456.7.89:8080/"
But it doesn't seem to do anything to solve the problem. Where am I going wrong?
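For what it's worth, Rprofile.site is sourced as R code, so the line above only creates an R variable called http_proxy; it does not set an environment variable. A minimal sketch of setting the environment variables from within R instead is below. It assumes that libcurl (and therefore RCurl) picks up http_proxy and https_proxy from the environment, and it reuses the placeholder credentials from above. Whether this is enough for a proxy that demands NTLM authentication is not something this sketch can confirm.

```r
# Placeholder proxy URL taken from the question; substitute real details.
proxy_url <- "http://tumbledown:mypassword@123.456.7.89:8080/"

# Set the variables for the current R session. https_proxy is included
# because the Twitter OAuth endpoints are HTTPS URLs.
Sys.setenv(http_proxy = proxy_url, https_proxy = proxy_url)

# Confirm that the session now sees both variables.
print(Sys.getenv(c("http_proxy", "https_proxy")))
```

The persistent equivalent would go in Renviron.site (or a user-level .Renviron) as plain NAME=value lines rather than in Rprofile.site.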
Edit 1: Here's the code I'm trying to run. Looking at it again, I realise this is probably more of an ROAuth issue:
library("twitteR")
library("ROAuth")
library("RCurl")
Credentials <- OAuthFactory$new(
  consumerKey = "MY_CONSUMER_KEY",
  consumerSecret = "MY_CONSUMER_SECRET",
  requestURL = "https://api.twitter.com/oauth/request_token",
  authURL = "https://api.twitter.com/oauth/authorize",
  accessURL = "https://api.twitter.com/oauth/access_token")
# I have then tried both of the below handshake methods:
# 1
Credentials$handshake()
# 2
download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.pem")
Credentials$handshake(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))
Edit 2:
The following code seems to get me part of the way there. If I set these options, I can begin the handshake process with Twitter, although only intermittently; it still fails sometimes.
options(RCurlOptions = list(
  verbose = TRUE,
  proxy = "http://123.456.7.89:8080",
  proxyuserpwd = "tumbledown:mypassword",
  proxyauth = "ntlm"))
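To separate the proxy authentication from the OAuth handshake, the same options can be tried on a plain HTTPS request first. The sketch below is only a diagnostic under stated assumptions: proxy_opts and result are illustrative names, the address and credentials are the placeholders used above, connecttimeout is added so that a failed attempt returns quickly, and the request is skipped if RCurl is not installed.

```r
# Same proxy settings as above, kept in a named list so they can be reused.
# All values are placeholders; connecttimeout is an addition for this test.
proxy_opts <- list(
  verbose        = TRUE,
  proxy          = "http://123.456.7.89:8080",
  proxyuserpwd   = "tumbledown:mypassword",
  proxyauth      = "ntlm",
  connecttimeout = 10
)

# Make them the session-wide defaults for RCurl requests.
options(RCurlOptions = proxy_opts)

# Try one HTTPS request through the proxy and capture any error message
# instead of stopping, so the outcome can be inspected.
if (requireNamespace("RCurl", quietly = TRUE)) {
  result <- tryCatch(
    RCurl::getURL("https://api.twitter.com/", .opts = proxy_opts),
    error = function(e) conditionMessage(e)
  )
  print(result)
}
```

If this request fails with the same 407 message as the handshake, the problem lies in the proxy settings rather than in ROAuth.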
I am then asked to enter a PIN from Twitter after following a URL (which I have to type in laboriously because, for some reason, I can't copy and paste it). The handshake then seems to get part of the way through before failing to complete. Here's the verbose output (some details removed or altered):
* About to connect() to proxy 123.456.7.89 port 8080 (#0)
* Trying 123.456.7.89... * connected
* Connected to 123.456.7.89 (123.456.7.89) port 8080 (#0)
* Establish HTTP proxy tunnel to api.twitter.com:443
> CONNECT api.twitter.com:443 HTTP/1.1
Host: api.twitter.com:443
Proxy-Connection: Keep-Alive
< HTTP/1.1 407 Proxy Authentication Required ( Forefront TMG requires authorization to fulfill the request. Access to the Web Proxy filter is denied. )
< Via: 1.1 ORG-TMG1
< Proxy-Authenticate: Negotiate
< Proxy-Authenticate: Kerberos
< Proxy-Authenticate: NTLM
< Connection: close
< Proxy-Connection: close
< Pragma: no-cache
< Cache-Control: no-cache
< Content-Type: text/html
< Content-Length: 719
<
* Ignore 719 bytes of response-body
* Received HTTP code 407 from proxy after CONNECT
* About to connect() to proxy 123.456.7.89 port 8080 (#0)
* Trying 123.456.7.89... * connected
* Connected to 123.456.7.89 (123.456.7.89) port 8080 (#0)
* Establish HTTP proxy tunnel to api.twitter.com:443
* Proxy auth using NTLM with user 'ORG\tumbledown'
> CONNECT api.twitter.com:443 HTTP/1.1
Host: api.twitter.com:443
Proxy-Authorization: NTLM <LOTS OF RANDOM LETTERS>==
Proxy-Connection: Keep-Alive
< HTTP/1.1 407 Proxy Authentication Required ( Access is denied. )
< Via: 1.1 ORG-TMG1
< Proxy-Authenticate: NTLM <LOTS OF RANDOM LETTERS>==
< Connection: Keep-Alive
< Proxy-Connection: Keep-Alive
< Pragma: no-cache
< Cache-Control: no-cache
< Content-Type: text/html
< Content-Length: 0
<
* Establish HTTP proxy tunnel to api.twitter.com:443
* Proxy auth using NTLM with user 'ORG\tumbledown'
> CONNECT api.twitter.com:443 HTTP/1.1
Host: api.twitter.com:443
Proxy-Authorization: NTLM <LOTS OF RANDOM LETTERS>=
Proxy-Connection: Keep-Alive
< HTTP/1.1 200 Connection established
< Via: 1.1 ORG-TMG1
< Connection: Keep-Alive
< Proxy-Connection: Keep-Alive
<
* Proxy replied OK to CONNECT request
* successfully set certificate verify locations:
* CAfile: \\ORG-nas/tumbledown/R/win-library/2.15/RCurl/CurlSSL/cacert.pem
CApath: none
* SSL connection using RC4-SHA
* Server certificate:
* subject: C=US; ST=California; L=San Francisco; O=Twitter, Inc.; OU=Twitter Security; CN=api.twitter.com
* start date: 2013-04-08 00:00:00 GMT
* expire date: 2013-12-31 23:59:59 GMT
* subjectAltName: api.twitter.com matched
* issuer: C=US; O=VeriSign, Inc.; OU=VeriSign Trust Network; OU=Terms of use at https://www.verisign.com/rpa (c)09; CN=VeriSign Class 3 Secure Server CA - G2
* SSL certificate verify ok.
> POST /oauth/access_token HTTP/1.1
Host: api.twitter.com
Accept: */*
Content-Length: 297
Content-Type: application/x-www-form-urlencoded
< HTTP/1.1 200 OK
< cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
< content-length: 160
< content-type: text/html; charset=utf-8
< date: Tue, 23 Apr 2013 11:47:21 GMT
< etag: "<LOTS OF RANDOM LETTERS>"
< expires: Tue, 31 Mar 1981 05:00:00 GMT
< last-modified: Tue, 23 Apr 2013 11:47:21 GMT
< pragma: no-cache
< server: tfe
< set-cookie: _twitter_sess=<LOTS OF RANDOM LETTERS>--<LOTS OF RANDOM LETTERS>; domain=.twitter.com; path=/; HttpOnly
< set-cookie: guest_id=<LOTS OF RANDOM LETTERS>; Domain=.twitter.com; Path=/; Expires=Thu, 23-Apr-2015 11:47:21 UTC
< status: 200 OK
< strict-transport-security: max-age=123456789
< vary: Accept-Encoding
< x-frame-options: SAMEORIGIN
< x-mid: <LOTS OF RANDOM LETTERS>
< x-runtime: 0.04538
< x-transaction: <LOTS OF RANDOM LETTERS>
< x-xss-protection: 1; mode=block
<
* Connection #0 to host 123.456.7.89 left intact
Error: Proxy Authentication Required ( Forefront TMG requires authorization to fulfill the request. Access to the Web Proxy filter is denied. )
Source: https://stackoverflow.com/questions/16101984/using-twitter-through-a-proxy-server