In any language, can I capture a webpage and save it as an image file? (no install, no ActiveX)

长情又很酷 2021-01-17 07:53

I heard it is possible to capture webpages using PHP (maybe above 6.0) on Windows Server.

I got some sample code and tested it, but there is no code to perform righ

3 Answers
  • 2021-01-17 08:14

    Though you asked for a PHP solution, I would like to share another solution in Perl. WWW::Mechanize, along with LWP::UserAgent and HTML::Parser, can help with screen scraping.

    Some documents for reference:

    • Web scraping with WWW::Mechanize
    • Screen-scraping with WWW::Mechanize
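For readers who do not use Perl, the same scraping idea can be sketched in Python with the standard library's `html.parser`. This is only an illustrative stand-in for HTML::Parser, not the Perl modules named above; the class name `LinkCollector`, the function `extract_links`, and the sample HTML are made up for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return all link targets found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    sample = '<p><a href="/a">A</a> and <a href="http://example.com/b">B</a></p>'
    print(extract_links(sample))
```

In a real scraper the HTML string would come from an HTTP response rather than a literal.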
  • 2021-01-17 08:22

    You could use the Browsershots API: http://browsershots.org/

    With the XML-RPC interface, you could use almost any language to access it.

    http://api.browsershots.org/xmlrpc/
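As a rough sketch of what "any language" looks like in practice, here is how an XML-RPC call could be built with Python's standard `xmlrpc.client`. The endpoint is the one quoted above, and it has not been checked whether it is still online. `system.listMethods` is the conventional XML-RPC introspection call, and it is an assumption that this server implements it; the service's own method names are not shown here. The helper names `build_request` and `call` are hypothetical.

```python
import xmlrpc.client

# Endpoint taken from the answer above; whether it is still online is not verified.
ENDPOINT = "http://api.browsershots.org/xmlrpc/"


def build_request(method_name, *params):
    """Serialize an XML-RPC call to the XML body that would be POSTed."""
    return xmlrpc.client.dumps(tuple(params), methodname=method_name)


def call(method_name, *params):
    """Send the call over HTTP (needs network access and a live server)."""
    proxy = xmlrpc.client.ServerProxy(ENDPOINT)
    return getattr(proxy, method_name)(*params)


if __name__ == "__main__":
    # system.listMethods is an assumed introspection method, see the note above.
    print(build_request("system.listMethods"))
```

Printing the request body first is a cheap way to see what goes over the wire before contacting the server with `call(...)`.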

  • 2021-01-17 08:26

    Downloading the HTML of a web page is commonly known as screen scraping. This can be useful if you want a program to extract data from a given page. The easiest way to request HTTP resources is to use a tool called cURL. cURL comes as a standalone Unix tool, but there are libraries for using it in just about every programming language. To capture this page from the Unix command line, type:

    curl http://stackoverflow.com/questions/1077970/in-any-languages-can-i-capture-a-webpageno-install-no-activex-if-i-can-plz
    

    In PHP, you can do the same:

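    <?php
    // curl_error() needs a handle, so use a plain message if initialization fails.
    $ch = curl_init() or die("Could not initialize cURL");
    curl_setopt($ch, CURLOPT_URL, "http://stackoverflow.com/questions/1077970/in-any-languages-can-i-capture-a-webpageno-install-no-activex-if-i-can-plz");
    // Return the response as a string instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $data1 = curl_exec($ch);
    if ($data1 === false) {
        die(curl_error($ch)); // report why the transfer failed
    }
    echo "<font color=black face=verdana size=3>" . $data1 . "</font>";
    curl_close($ch);
    ?>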
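
Since the question is about saving the result to a file, here is a minimal Python sketch of the same fetch that writes the response to disk, using only the standard library. The function name `save_page` and the default timeout are assumptions for illustration. Note that this stores the HTML source only; it does not render the page to an image.

```python
import urllib.request


def save_page(url, out_path, timeout=10):
    """Download url and write the raw response bytes to out_path."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    with open(out_path, "wb") as out_file:
        out_file.write(data)
    # Return the number of bytes written so callers can sanity-check the result.
    return len(data)
```

For example, `save_page("http://example.com/", "page.html")` (a placeholder URL) would write the page source to `page.html` in the current directory.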
    

    Before copying an entire website, you should check its robots.txt file to see whether it allows robots to spider the site. You may also want to check whether an API is available that lets you retrieve the data without the HTML.
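
The robots.txt check can be sketched in Python with the standard `urllib.robotparser` module. The helper name `is_allowed`, the user agent `MyBot`, and the sample rules are made up for illustration; a real crawler would download the target site's own `/robots.txt` first.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Made-up rules: everything is allowed except paths under /private/.
    sample_rules = "User-agent: *\nDisallow: /private/\n"
    print(is_allowed(sample_rules, "MyBot", "http://example.com/index.html"))
    print(is_allowed(sample_rules, "MyBot", "http://example.com/private/x"))
```

Passing the rules in as text keeps the check separate from the download step, so it can be tried out without touching any site.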
