Question
I have some fairly simple cURL code that should retrieve the contents of a page. It does show the page sometimes, but most of the time the styling is broken: the fonts don't load, and neither do most of the images and graphic elements. The results vary with the URL I request: some pages load without problems, others show nothing at all.
I guess there's a problem with how cURL handles the CSS. How can I get it to load correctly?
<?php
$ch = curl_init();
$url = 'http://3amigos.com.mx/';
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Googlebot/2.1 (http://www.googlebot.com/bot.html)');
curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 1);
$data = curl_exec($ch);
echo $data;
?>
ABCDEF
As it is right now, sometimes it loads nothing but the ABCDEF, without any formatting, and other times the page is completely empty.
Answer 1:
When you send an HTTP request with cURL, cURL takes the response and hands it to you.
If the response has styles embedded in it, they are applied by YOUR BROWSER, NOT BY cURL.
Likewise, if the styles/resources are referenced with a full URI in the src or href attribute, your browser can fetch and apply them. With a relative URI, the browser resolves it against the server running your script rather than the original site, so the resource is not found.
cURL is not an HTML, JS, or CSS interpreter.
cURL is only a tool for transferring data over various protocols such as HTTP, HTTPS, ...
You may want to look into PhantomJS or Selenium.
Another (very slow) solution is to take the response, parse it to find every <link> element (the CSS stylesheet links), fetch each one, and embed its contents in the page. I think this is a bad idea, but it works.
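The parse-and-embed approach described above can be sketched roughly as follows. This is only an illustration under some assumptions: resolve_url, inline_stylesheets, and $curlFetch are hypothetical names, not part of cURL or PHP. The URL resolution covers only the common cases, and url(...) references inside the fetched CSS (fonts, background images) are not rewritten, so they may still fail to load.

```php
<?php
// Hypothetical helper: resolve a possibly relative href against the page URL.
// Covers absolute, protocol-relative, root-relative and path-relative forms only.
function resolve_url(string $base, string $href): string
{
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $href)) {
        return $href; // already an absolute URL
    }
    $p = parse_url($base);
    $scheme = $p['scheme'] ?? 'http';
    $origin = $scheme . '://' . ($p['host'] ?? '');
    if (isset($p['port'])) {
        $origin .= ':' . $p['port'];
    }
    if (strpos($href, '//') === 0) {
        return $scheme . ':' . $href; // protocol-relative
    }
    if (strpos($href, '/') === 0) {
        return $origin . $href; // root-relative
    }
    $path = $p['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1); // directory of the page
    return $origin . $dir . $href;
}

// Hypothetical helper: replace each <link rel="stylesheet"> with a <style>
// element holding the stylesheet's contents. $fetch is any callable that
// takes a URL and returns the body as a string, or false on failure.
function inline_stylesheets(string $html, string $pageUrl, callable $fetch): string
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // real-world HTML is rarely valid; silence warnings
    // Copy the node list first, because nodes are replaced while looping.
    $links = iterator_to_array($doc->getElementsByTagName('link'));
    foreach ($links as $link) {
        if (strtolower($link->getAttribute('rel')) !== 'stylesheet') {
            continue;
        }
        $css = $fetch(resolve_url($pageUrl, $link->getAttribute('href')));
        if ($css === false) {
            continue; // leave the original <link> in place
        }
        $style = $doc->createElement('style');
        $style->appendChild($doc->createTextNode($css));
        $link->parentNode->replaceChild($style, $link);
    }
    return $doc->saveHTML();
}

// One possible $fetch implementation using cURL (one extra request per stylesheet).
$curlFetch = function (string $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    return curl_exec($ch);
};

// Usage (performs network requests, so it is left commented out):
// $page = $curlFetch('http://3amigos.com.mx/');
// echo inline_stylesheets($page, 'http://3amigos.com.mx/', $curlFetch);
```

Passing the fetcher in as a callable keeps the HTML rewriting separate from the network code, which also makes it clear why this is slow: every stylesheet costs another request before the page can be shown.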
Answer 2:
You may try this (Check the modified code):
$ch = curl_init();
$url = 'http://3amigos.com.mx/';
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Googlebot/2.1 (http://www.googlebot.com/bot.html)');
curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);
//curl_setopt($ch, CURLOPT_TIMEOUT, 1); // Removed
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Added
$data = curl_exec($ch);
if($data) var_dump($data);
CURLOPT_RETURNTRANSFER is required to get the result returned as a string instead of being printed directly, and CURLOPT_TIMEOUT is the maximum number of seconds cURL functions are allowed to run. With a limit of 1 second, a slow page is cut off before it finishes loading. This will give you the result as a plain string, not as rendered HTML.
[Screenshot omitted: truncated output showing that it works.]
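To see why a request comes back empty, as it can with a 1-second CURLOPT_TIMEOUT, it may help to inspect cURL's own error information rather than only echoing the result. The following is a minimal sketch: fetch_page is a hypothetical helper name and the 10-second default is an arbitrary assumption, while curl_errno, curl_error, and curl_getinfo are the standard PHP cURL functions.

```php
<?php
// Hypothetical helper: fetch a URL and report what happened.
function fetch_page(string $url, int $timeoutSeconds = 10): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);      // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);      // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);  // assumed value; tune as needed
    $body = curl_exec($ch);                              // false on failure
    return [
        'ok'     => $body !== false,
        'body'   => $body === false ? '' : $body,
        'errno'  => curl_errno($ch),                     // 0 means no error
        'error'  => curl_error($ch),                     // human-readable message
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
    ];
}

// Usage (performs a network request, so it is left commented out):
// $result = fetch_page('http://3amigos.com.mx/');
// if (!$result['ok']) {
//     echo 'cURL error ' . $result['errno'] . ': ' . $result['error'];
// }
```

A request that exceeds the time limit reports error number 28 (CURLE_OPERATION_TIMEDOUT), which distinguishes a timeout from a page that is genuinely empty.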
Source: https://stackoverflow.com/questions/42354248/curl-not-loading-style-css