Crawling Google Search with PHP

暖寄归人 2021-02-02 04:52

I am trying to get my head around how to fetch Google search results with PHP or JavaScript. I know it used to be possible, but now I can't find a way.

I am trying to

3 Answers
  • 2021-02-02 05:11

    There is a PHP package on GitHub named google-url that does the job.

    The API is straightforward to use. See the example:

    // create a new crawler
    $googleUrl=new \GoogleURL\GoogleUrl();
    $googleUrl->setLang('en'); // language to search in (it could have been "fr" instead)
    $googleUrl->setNumberResults(10); // how many results you want to check
    // launch the search for a specific keyword
    $results = $googleUrl->search("google crawler");
    // finally, you can loop over the results (an example is also available on the GitHub page)
    

    However, you will need to put a delay between queries, or else Google will treat you as a bot and respond with a captcha, which will block the script.
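
    The delay between queries could be sketched as below. searchWithDelay() is a hypothetical helper, not part of google-url, and the 5 to 15 second range is only an assumed starting point; the search itself is passed in as a callable, so the helper does not depend on any particular crawler.

    ```php
    <?php
    // Hypothetical helper (not part of google-url): runs $search once per
    // keyword and pauses for a random interval between consecutive queries.
    // The default 5-15 second range is an assumption, not a documented limit.
    function searchWithDelay(
        array $keywords,
        callable $search,
        int $minSeconds = 5,
        int $maxSeconds = 15,
        ?callable $sleep = null
    ): array {
        $sleep = $sleep ?? 'sleep'; // injectable so the pause can be stubbed out
        $results = [];
        foreach (array_values($keywords) as $index => $keyword) {
            if ($index > 0) {
                // No pause before the first query, a random one before the rest.
                $sleep(random_int($minSeconds, $maxSeconds));
            }
            $results[$keyword] = $search($keyword);
        }
        return $results;
    }

    // Possible usage with the crawler object from the example above:
    // $all = searchWithDelay(['google crawler', 'php crawler'], [$googleUrl, 'search']);
    ```

    Randomizing the pause keeps the requests from arriving at a perfectly regular rhythm; whether any given range is long enough is something to verify by experiment.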

  • 2021-02-02 05:15

    Odd, because if I run curl from the command line I get a 200 OK:

    curl -I 'http://www.google.se/#hl=sv&q=dogs'
    HTTP/1.1 200 OK
    Date: Sun, 27 Jan 2013 20:45:02 GMT
    Expires: -1
    Cache-Control: private, max-age=0
    Content-Type: text/html; charset=ISO-8859-1
    Set-Cookie: PREF=ID=b82cb66e9d996c48:FF=0:TM=1359319502:LM=1359319502:S=D-LW-_w8GlMfw-lX; expires=Tue, 27-Jan-2015 20:45:02 GMT; path=/; domain=.google.se
    Set-Cookie: NID=67=XtW2l43TDBuOaOnhWkQ-AeRbpZOiA-UYEcs7BIgfGs41FkHlEegssgllBRmfhgQDwubG3JB0s5691OLHpNmLSNmJrKHKGZuwxCJYv1qnaBPtzitRECdLAIL0oQ0DSkrx; expires=Mon, 29-Jul-2013 20:45:02 GMT; path=/; domain=.google.se; HttpOnly
    P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
    Server: gws
    X-XSS-Protection: 1; mode=block
    X-Frame-Options: SAMEORIGIN
    Transfer-Encoding: chunked
    

    Also, consider URL-encoding what you pass in the URL, so that this line:

    curl_setopt($ch, CURLOPT_URL, 'http://www.google.se/#hl=sv&q=dogs');
    

    Changes to this:

    // Encode only the parameter value; a "#..." fragment is never sent to the server.
    curl_setopt($ch, CURLOPT_URL, 'http://www.google.se/search?hl=sv&q=' . urlencode('dogs'));
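
    When encoding, it helps to keep two things apart. The part of a URL after # is a fragment, which the client keeps to itself and does not send in the HTTP request. And urlencode() applied to a whole string such as '#hl=sv&q=dogs' also escapes the #, = and & characters, so it is normally applied to individual values only. A minimal sketch of building the query string that way, assuming the /search path and the hl and q parameters used elsewhere in this thread (buildSearchUrl() is a hypothetical helper):

    ```php
    <?php
    // buildSearchUrl() is a hypothetical helper; the host, the /search path
    // and the hl/q parameter names are taken from the URLs in this thread.
    function buildSearchUrl(string $query, string $lang = 'sv', string $host = 'www.google.se'): string
    {
        // http_build_query() encodes each value but leaves the "=" and "&"
        // separators intact. The separator is passed explicitly so the result
        // does not depend on the arg_separator.output ini setting.
        $params = http_build_query(['hl' => $lang, 'q' => $query], '', '&');
        return 'http://' . $host . '/search?' . $params;
    }

    echo buildSearchUrl('dogs & cats'), "\n";
    ```

    The returned string can then be passed to curl_setopt($ch, CURLOPT_URL, ...) in place of the hand-written URL.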
    
  • 2021-02-02 05:27

    I have done this before. Fetch the HTML of the result page with an HTTP request to https://www.google.co.in/search?hl=en&output=search&q=india, then parse the specific tags you need with the PHP Simple HTML DOM library.

    Demo: the code below collects the title of every result:

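    <?php

    // Third-party library: PHP Simple HTML DOM.
    include("simple_html_dom.php");

    $html = file_get_html('http://www.google.co.in/search?hl=en&output=search&q=india');
    if (!$html) {
        exit("Could not fetch or parse the result page.\n");
    }

    $title = []; // initialise so print_r() works even when nothing matches
    // The selectors assume each result is an <li class="g"> with an <h3 class="r"> heading.
    foreach ($html->find('li[class=g]') as $element) {
        foreach ($element->find('h3[class=r]') as $h3) {
            $title[] = '<h1>' . $h3->plaintext . '</h1>';
        }
    }
    print_r($title);

    ?>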
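
    The same extraction can also be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of simple_html_dom. This is a swapped-in technique rather than the approach described in this answer: extractTitles() is a hypothetical helper, the li.g and h3.r selectors are simply carried over from the demo and may not match the live page, and the sample markup is made up so the sketch runs without a network request.

    ```php
    <?php
    // Hypothetical helper using the bundled DOM extension. The class names
    // "g" and "r" come from the demo in this answer and are assumptions
    // about the page markup.
    function extractTitles(string $html): array
    {
        $previous = libxml_use_internal_errors(true); // collect parser warnings silently
        $doc = new DOMDocument();
        $doc->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $xpath = new DOMXPath($doc);
        // Match elements whose class attribute contains the exact token "g" or "r".
        $query = '//li[contains(concat(" ", normalize-space(@class), " "), " g ")]'
               . '//h3[contains(concat(" ", normalize-space(@class), " "), " r ")]';

        $titles = [];
        foreach ($xpath->query($query) as $h3) {
            $titles[] = trim($h3->textContent);
        }
        return $titles;
    }

    // Made-up sample markup standing in for a fetched result page.
    $sample = '<ol>'
            . '<li class="g"><h3 class="r"><a href="#">India - Wikipedia</a></h3></li>'
            . '<li class="g"><h3 class="r">Incredible India</h3></li>'
            . '<li class="other"><h3 class="r">Not a result</h3></li>'
            . '</ol>';
    print_r(extractTitles($sample));
    ```

    Keeping the parsing in a function that takes an HTML string makes it easy to check the selectors against a saved copy of the page before running them on live responses.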
    