Question
I've tried following Guzzle's docs to set a proxy, but it's not working. The official GitHub page for Goutte is pretty dead, so I can't find anything there.
Does anyone know how to set a proxy?
This is what I've tried:
$client = new Client();
$client->setHeader('User-Agent', $user_agent);
$crawler = $client->request('GET', $request, ['proxy' => $proxy]);
Answer 1:
You're thinking along the right lines, but in Goutte\Client::doRequest(), when the Guzzle request is created:
$guzzleRequest = $this->getClient()->createRequest(
$request->getMethod(),
$request->getUri(),
$headers,
$body
);
the options are not passed to the request object. So, if you want to use a proxy, extend Goutte\Client, override doRequest(), and replace that code with:
$guzzleRequest = $this->getClient()->createRequest(
$request->getMethod(),
$request->getUri(),
$headers,
$body,
$request->getParameters()
);
Example of the overriding class:
<?php
namespace igancev\override;

// Without these imports the catch blocks below would never match.
use Guzzle\Http\Exception\BadResponseException;
use Guzzle\Http\Exception\CurlException;

class Client extends \Goutte\Client
{
    protected function doRequest($request)
    {
        // Convert server keys such as HTTP_USER_AGENT into header names such as User-Agent.
        $headers = array();
        foreach ($request->getServer() as $key => $val) {
            $key = implode('-', array_map('ucfirst', explode('-', strtolower(str_replace(array('_', 'HTTP-'), array('-', ''), $key)))));
            if (!isset($headers[$key])) {
                $headers[$key] = $val;
            }
        }
        $body = null;
        if (!in_array($request->getMethod(), array('GET', 'HEAD'))) {
            if (null !== $request->getContent()) {
                $body = $request->getContent();
            } else {
                $body = $request->getParameters();
            }
        }
        // The only change from the parent method: the request parameters
        // are passed as the options argument, so ['proxy' => ...] reaches Guzzle.
        $guzzleRequest = $this->getClient()->createRequest(
            $request->getMethod(),
            $request->getUri(),
            $headers,
            $body,
            $request->getParameters()
        );
        foreach ($this->headers as $name => $value) {
            $guzzleRequest->setHeader($name, $value);
        }
        if ($this->auth !== null) {
            $guzzleRequest->setAuth(
                $this->auth['user'],
                $this->auth['password'],
                $this->auth['type']
            );
        }
        foreach ($this->getCookieJar()->allRawValues($request->getUri()) as $name => $value) {
            $guzzleRequest->addCookie($name, $value);
        }
        if ('POST' == $request->getMethod() || 'PUT' == $request->getMethod()) {
            $this->addPostFiles($guzzleRequest, $request->getFiles());
        }
        $guzzleRequest->getParams()->set('redirect.disable', true);
        $curlOptions = $guzzleRequest->getCurlOptions();
        if (!$curlOptions->hasKey(CURLOPT_TIMEOUT)) {
            $curlOptions->set(CURLOPT_TIMEOUT, 30);
        }
        // Let BrowserKit handle redirects
        try {
            $response = $guzzleRequest->send();
        } catch (CurlException $e) {
            // Strict comparison: a match at position 0 must not count as "not found".
            if (false === strpos($e->getMessage(), 'redirects')) {
                throw $e;
            }
            $response = $e->getResponse();
        } catch (BadResponseException $e) {
            $response = $e->getResponse();
        }
        return $this->createResponse($response);
    }
}
Then send the request:
$client = new \igancev\override\Client();
$proxy = 'http://149.56.85.17:8080'; // free proxy example
$crawler = $client->request('GET', $request, ['proxy' => $proxy]);
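As an aside, the long one-liner at the top of doRequest() is easier to follow when split into steps. Here is a minimal sketch of what it does to each server key; the function name normalizeHeaderName is made up for illustration and is not part of Goutte.

```php
<?php
// Hypothetical helper: the same expression used in doRequest(), split into steps.
function normalizeHeaderName(string $serverKey): string
{
    // 'HTTP_USER_AGENT' -> 'HTTP-USER-AGENT' -> 'USER-AGENT'
    $name = str_replace(array('_', 'HTTP-'), array('-', ''), $serverKey);

    // 'USER-AGENT' -> ['user', 'agent'] -> ['User', 'Agent']
    $parts = array_map('ucfirst', explode('-', strtolower($name)));

    // ['User', 'Agent'] -> 'User-Agent'
    return implode('-', $parts);
}

echo normalizeHeaderName('HTTP_USER_AGENT'), PHP_EOL; // User-Agent
echo normalizeHeaderName('CONTENT_TYPE'), PHP_EOL;    // Content-Type
```

None of this touches the proxy; only the extra createRequest() argument does.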
Answer 2:
I solved this problem like this:
$url = 'https://api.myip.com';
$client = new \Goutte\Client;
$client->setClient(new \GuzzleHttp\Client(['proxy' => 'http://xx.xx.xx.xx:8080']));
$get_html = $client->request('GET', $url)->html();
var_dump($get_html);
Answer 3:
You can create a custom Guzzle client and assign it to the Goutte client. When Guzzle makes a request through Goutte, it uses its default config, which is passed to the Guzzle constructor.
$guzzle = new \GuzzleHttp\Client(['proxy' => 'http://192.168.1.1:8080']);
$goutte = new \Goutte\Client();
$goutte->setClient($guzzle);
$crawler = $goutte->request($method, $url);
Answer 4:
You can pass the proxy directly in a Goutte or Guzzle request:
use Goutte\Client as GoutteClient;
$proxy = 'xx.xx.xx.xx:xxxx';
$goutte = new GoutteClient();
echo $goutte->request('GET', 'https://example.com/', ['proxy' => $proxy])->html();
The same approach works in Guzzle:
use GuzzleHttp\Client;
$Guzzle = new Client();
$GuzzleResponse = $Guzzle->request('GET', 'https://example.com/', ['proxy' => $proxy]);
Answer 5:
For recent versions, where the Goutte Client extends Symfony\Component\BrowserKit\HttpBrowser, pass a configured Symfony HttpClient to the constructor:
use Symfony\Component\HttpClient\HttpClient;
use Goutte\Client;
$client = new Client(HttpClient::create(['proxy' => 'http://xx.xx.xx.xx:80']));
...
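For completeness, here is a fuller sketch of the same idea. It assumes Goutte 4.x and symfony/http-client installed through Composer; the proxy address is a placeholder, and the target URL is borrowed from Answer 2.

```php
<?php
// Sketch only: assumes Goutte 4.x and symfony/http-client installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;

// Placeholder proxy address (an assumption); replace it with a proxy you control.
$proxy = 'http://127.0.0.1:8080';

// Options given to HttpClient::create() become defaults for every request
// this Goutte client makes, so all of them go through the proxy.
$client = new Client(HttpClient::create([
    'proxy'   => $proxy,
    'timeout' => 30, // idle timeout, in seconds
]));

$client->request('GET', 'https://api.myip.com');

// The service returns JSON, so print the raw body rather than parsing it as HTML.
echo $client->getResponse()->getContent(), PHP_EOL;
```

If the printed IP is the proxy's rather than your own, the proxy is being used.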
Source: https://stackoverflow.com/questions/35806758/setting-proxy-in-goutte