Using twitteR through a proxy server

Submitted by 痞子三分冷 on 2019-12-08 02:09:58

Question


I'm trying to download Twitter data using the twitteR package.

I keep getting the error message

"Error in function (type, msg, asError = TRUE) : couldn't connect to host"

I believe this is because I'm doing this on my work computer and I need to pass the details of the proxy server.

To test this, I tried an example given in one of the answers to a similar question about Proxy Setting for R.

If I enter:

library("RCurl")
getURL("http://stackoverflow.com")

Then I get the same error message as when I try to use twitteR:

"Error in function (type, msg, asError = TRUE) : couldn't connect to host"

However if I pass the details of my proxy server, then it works no problem:

library("RCurl")
opts <- list(
  proxy         = "123.456.7.89", 
  proxyusername = "tumbledown", 
  proxypassword = "mypassword",
  proxyport     = 8080
)
getURL("http://stackoverflow.com", .opts = opts)

However, I'm having an issue with passing the details of my proxy server to twitteR. I've tried setting it in R's Rprofile.site file using:

http_proxy="http://tumbledown:mypassword@123.456.7.89:8080/"

But it doesn't seem to do anything to solve the problem. Where am I going wrong?
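
One thing that can be checked here, as a minimal sketch rather than a confirmed fix: Rprofile.site is evaluated as R code, so a bare `http_proxy="..."` line creates an R variable, not an environment variable. Environment variables are set from R with `Sys.setenv()`. The `https_proxy` entry is an assumption added because the Twitter endpoints use HTTPS, and whether the HTTP layer underneath twitteR honours either variable is not something this snippet verifies.

```r
# Assumption: same placeholder proxy URL as in the Rprofile.site attempt above.
proxy_url <- "http://tumbledown:mypassword@123.456.7.89:8080/"

# Set the variables for the current R session (and any processes it starts).
Sys.setenv(http_proxy = proxy_url, https_proxy = proxy_url)

# Show what the session now sees for both variables.
Sys.getenv(c("http_proxy", "https_proxy"))
```

If `Sys.getenv("http_proxy")` returns an empty string in a fresh session, the setting in Rprofile.site never reached the environment.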

Edit 1: Here's the code I'm trying to run, which now I look at it makes me realise this is probably more of an ROAuth issue:

library("twitteR")
library("ROAuth")
library("RCurl")

Credentials <- OAuthFactory$new(
  consumerKey = "MY_CONSUMER_KEY",
  consumerSecret = "MY_CONSUMER_SECRET",
  requestURL = "https://api.twitter.com/oauth/request_token",
  authURL = "https://api.twitter.com/oauth/authorize",
  accessURL = "https://api.twitter.com/oauth/access_token")


# I have then tried both of the below handshake methods:

# 1
Credentials$handshake()

# 2
download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.pem")
Credentials$handshake(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))

EDIT 2:

The following code seems to get me part of the way there. If I set these options, I can begin the handshake process with Twitter (though only intermittently; it still fails sometimes).

options(RCurlOptions = list(
    verbose      = TRUE,
    proxy        = "http://123.456.7.89:8080",
    proxyuserpwd = "tumbledown:mypassword",
    proxyauth    = "ntlm"))
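
The same option list can also be assembled from its parts, so the password does not have to be hard-coded in the script. This is only a sketch: `make_proxy_opts` and the `PROXY_PASSWORD` environment variable are hypothetical names, and the option names are simply the ones used in the call above.

```r
# Hypothetical helper: build the RCurl option list shown above from its parts.
make_proxy_opts <- function(host, port, user, password,
                            auth = "ntlm", verbose = TRUE) {
  list(
    verbose      = verbose,
    proxy        = sprintf("http://%s:%d", host, as.integer(port)),
    proxyuserpwd = paste(user, password, sep = ":"),
    proxyauth    = auth
  )
}

proxy_opts <- make_proxy_opts(
  host     = "123.456.7.89",
  port     = 8080,
  user     = "tumbledown",
  # Assumption: the password is read from PROXY_PASSWORD if that variable is
  # set; "mypassword" is only the placeholder used in the question.
  password = Sys.getenv("PROXY_PASSWORD", unset = "mypassword")
)

# Register the list the same way as above.
options(RCurlOptions = proxy_opts)
```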

I then get asked to enter a PIN from Twitter after following a URL (which I have to laboriously type in, as for some reason I can't copy and paste it). I then seem to get partway through the handshake before it fails to complete. Here's the verbose output (some details removed/altered):

* About to connect() to proxy 123.456.7.89 port 8080 (#0)
*   Trying 123.456.7.89... * connected
* Connected to 123.456.7.89 (123.456.7.89) port 8080 (#0)
* Establish HTTP proxy tunnel to api.twitter.com:443
> CONNECT api.twitter.com:443 HTTP/1.1
Host: api.twitter.com:443
Proxy-Connection: Keep-Alive

< HTTP/1.1 407 Proxy Authentication Required ( Forefront TMG requires authorization to fulfill the request. Access to the Web Proxy filter is denied.  )
< Via: 1.1 ORG-TMG1
< Proxy-Authenticate: Negotiate
< Proxy-Authenticate: Kerberos
< Proxy-Authenticate: NTLM
< Connection: close
< Proxy-Connection: close
< Pragma: no-cache
< Cache-Control: no-cache
< Content-Type: text/html
< Content-Length: 719   
< 
* Ignore 719 bytes of response-body
* Received HTTP code 407 from proxy after CONNECT
* About to connect() to proxy 123.456.7.89 port 8080 (#0)
*   Trying 123.456.7.89... * connected
* Connected to 123.456.7.89 (123.456.7.89) port 8080 (#0)
* Establish HTTP proxy tunnel to api.twitter.com:443
* Proxy auth using NTLM with user 'ORG\tumbledown'
> CONNECT api.twitter.com:443 HTTP/1.1
Host: api.twitter.com:443
Proxy-Authorization: NTLM <LOTS OF RANDOM LETTERS>==
Proxy-Connection: Keep-Alive

< HTTP/1.1 407 Proxy Authentication Required ( Access is denied.  )
< Via: 1.1 ORG-TMG1
< Proxy-Authenticate: NTLM <LOTS OF RANDOM LETTERS>==
< Connection: Keep-Alive
< Proxy-Connection: Keep-Alive
< Pragma: no-cache
< Cache-Control: no-cache
< Content-Type: text/html
< Content-Length: 0     
< 
* Establish HTTP proxy tunnel to api.twitter.com:443
* Proxy auth using NTLM with user 'ORG\tumbledown'
> CONNECT api.twitter.com:443 HTTP/1.1
Host: api.twitter.com:443
Proxy-Authorization: NTLM <LOTS OF RANDOM LETTERS>=
Proxy-Connection: Keep-Alive

< HTTP/1.1 200 Connection established
< Via: 1.1 ORG-TMG1
< Connection: Keep-Alive
< Proxy-Connection: Keep-Alive
< 
* Proxy replied OK to CONNECT request
* successfully set certificate verify locations:
*   CAfile: \\ORG-nas/tumbledown/R/win-library/2.15/RCurl/CurlSSL/cacert.pem
  CApath: none
* SSL connection using RC4-SHA
* Server certificate:
*    subject: C=US; ST=California; L=San Francisco; O=Twitter, Inc.; OU=Twitter Security; CN=api.twitter.com
*    start date: 2013-04-08 00:00:00 GMT
*    expire date: 2013-12-31 23:59:59 GMT
*    subjectAltName: api.twitter.com matched
*    issuer: C=US; O=VeriSign, Inc.; OU=VeriSign Trust Network; OU=Terms of use at https://www.verisign.com/rpa (c)09; CN=VeriSign Class 3 Secure Server CA - G2
*    SSL certificate verify ok.
> POST /oauth/access_token HTTP/1.1
Host: api.twitter.com
Accept: */*
Content-Length: 297
Content-Type: application/x-www-form-urlencoded

< HTTP/1.1 200 OK
< cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
< content-length: 160
< content-type: text/html; charset=utf-8
< date: Tue, 23 Apr 2013 11:47:21 GMT
< etag: "<LOTS OF RANDOM LETTERS>"
< expires: Tue, 31 Mar 1981 05:00:00 GMT
< last-modified: Tue, 23 Apr 2013 11:47:21 GMT
< pragma: no-cache
< server: tfe
< set-cookie: _twitter_sess=<LOTS OF RANDOM LETTERS>--<LOTS OF RANDOM LETTERS>; domain=.twitter.com; path=/; HttpOnly
< set-cookie: guest_id=<LOTS OF RANDOM LETTERS>; Domain=.twitter.com; Path=/; Expires=Thu, 23-Apr-2015 11:47:21 UTC
< status: 200 OK
< strict-transport-security: max-age=123456789
< vary: Accept-Encoding
< x-frame-options: SAMEORIGIN
< x-mid: <LOTS OF RANDOM LETTERS>
< x-runtime: 0.04538
< x-transaction: <LOTS OF RANDOM LETTERS>
< x-xss-protection: 1; mode=block
< 
* Connection #0 to host 123.456.7.89 left intact
Error: Proxy Authentication Required ( Forefront TMG requires authorization to fulfill the request. Access to the Web Proxy filter is denied.  )
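
To see the order of responses at a glance, the status lines can be pulled out of the verbose output. A minimal sketch in base R, assuming the log has been captured as a character vector; `verbose_log` is a hypothetical name, and the lines in it are copied from the output above.

```r
# A few lines copied from the verbose output above; only response status
# lines start with "< HTTP/".
verbose_log <- c(
  "> CONNECT api.twitter.com:443 HTTP/1.1",
  "< HTTP/1.1 407 Proxy Authentication Required ( Forefront TMG requires authorization to fulfill the request. Access to the Web Proxy filter is denied.  )",
  "< HTTP/1.1 407 Proxy Authentication Required ( Access is denied.  )",
  "< HTTP/1.1 200 Connection established",
  "> POST /oauth/access_token HTTP/1.1",
  "< HTTP/1.1 200 OK"
)

# Keep only the response status lines, then extract the three-digit code.
status_lines <- grep("^< HTTP/", verbose_log, value = TRUE)
status_codes <- as.integer(
  sub("^< HTTP/[0-9.]+ ([0-9]{3}).*$", "\\1", status_lines)
)

print(status_codes)  # 407 407 200 200
```

Read this way, the two 407 responses are followed by a 200 for the proxy tunnel and a 200 for the POST to /oauth/access_token, and the "Proxy Authentication Required" error is reported only after those exchanges.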

Source: https://stackoverflow.com/questions/16101984/using-twitter-through-a-proxy-server
