How to use Goutte

难免孤独 2021-02-08 05:55

Issue:
I cannot fully understand the Goutte web scraper.

Request:
Can someone please help me understand it, or provide example code?

2 answers
  •  深忆病人
    2021-02-08 06:04

    The documentation you want to look at is the Symfony2 DomCrawler.

    Goutte is a client built on top of Guzzle that returns a Crawler every time you request or submit something:

    use Goutte\Client;
    $client = new Client();
    $crawler = $client->request('GET', 'http://www.symfony-project.org/');
    

    With this crawler you can do stuff like get all the P tags inside the body:

    use Symfony\Component\DomCrawler\Crawler; // needed for the Crawler type hint below

    $nodeValues = $crawler->filter('body > p')->each(function (Crawler $node, $i) {
        return $node->text();
    });
    print_r($nodeValues);
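
    To try the filter/each pattern without a network request, here is a minimal sketch that builds a Crawler straight from an HTML string. It assumes symfony/dom-crawler and symfony/css-selector are installed via Composer; the markup is made up for illustration.

    ```php
    <?php
    // Assumption: Composer autoloader with symfony/dom-crawler and symfony/css-selector.
    require __DIR__ . '/vendor/autoload.php';

    use Symfony\Component\DomCrawler\Crawler;

    // Hypothetical markup standing in for a fetched page.
    $html = '<html><body><p class="intro">First</p><p>Second</p>'
          . '<div><p>Nested</p></div></body></html>';
    $crawler = new Crawler($html);

    // "body > p" matches only direct children of <body>, so the nested <p> is skipped.
    $texts = $crawler->filter('body > p')->each(function (Crawler $node, $i) {
        return $node->text();
    });

    // attr() reads an attribute from the first node of the selection.
    $class = $crawler->filter('body > p')->first()->attr('class');

    print_r($texts);
    echo $class, PHP_EOL;
    ```

    The same calls work on the Crawler returned by $client->request(), since it is the same class.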
    

    Fill and submit forms:

    $form = $crawler->selectButton('sign in')->form(); 
    $crawler = $client->submit($form, array(
            'username' => 'username', 
            'password' => 'xxxxxx'
    ));
    

    A selectButton() method is available on the Crawler which returns another Crawler that matches a button (input[type=submit], input[type=image], or a button) with the given text. [1]
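
    Before submitting, it can help to inspect what the form would send. The following is a rough sketch on a made-up login form, assuming symfony/dom-crawler and symfony/css-selector are installed; the URL and field values are placeholders.

    ```php
    <?php
    // Assumption: Composer autoloader with symfony/dom-crawler and symfony/css-selector.
    require __DIR__ . '/vendor/autoload.php';

    use Symfony\Component\DomCrawler\Crawler;

    // Hypothetical login form.
    $html = '<html><body><form action="/login" method="post">'
          . '<input type="text" name="username" value="">'
          . '<input type="password" name="password" value="">'
          . '<input type="submit" value="sign in">'
          . '</form></body></html>';

    // The second argument is the page URI; it is needed to resolve the form action.
    $crawler = new Crawler($html, 'http://example.com/account');

    // Locate the form through its submit button, then fill in the fields.
    $form = $crawler->selectButton('sign in')->form();
    $form->setValues(['username' => 'alice', 'password' => 'secret']);

    $method = $form->getMethod(); // HTTP method, upper-cased
    $uri    = $form->getUri();    // absolute URI the form submits to
    $values = $form->getValues(); // field name => value pairs

    echo $method, ' ', $uri, PHP_EOL;
    print_r($values);
    ```

    With a real client, the filled form can be passed directly to $client->submit($form).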

    You can also click links, set options, select check-boxes, and more; see the Form and Link support in the DomCrawler documentation.
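
    As a sketch of the link handling, selectLink() finds an anchor by its text and link() wraps it in a Link object. The markup and URL below are invented for illustration, and the snippet assumes symfony/dom-crawler and symfony/css-selector are installed.

    ```php
    <?php
    // Assumption: Composer autoloader with symfony/dom-crawler and symfony/css-selector.
    require __DIR__ . '/vendor/autoload.php';

    use Symfony\Component\DomCrawler\Crawler;

    // Hypothetical page containing one relative link.
    $html = '<html><body><a href="/docs/intro">Documentation</a></body></html>';

    // The page URI lets the Link object resolve relative hrefs.
    $crawler = new Crawler($html, 'http://example.com/index.html');

    $link = $crawler->selectLink('Documentation')->link();
    $uri  = $link->getUri(); // absolute target of the link

    echo $uri, PHP_EOL;

    // With a Goutte client, following the link would look like:
    // $crawler = $client->click($link);
    ```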

    To get data from the crawler, use the html() or text() methods:

    echo $crawler->html();
    echo $crawler->text();
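
    To illustrate the difference between the two, here is a small sketch on an inline snippet (made-up markup, assuming symfony/dom-crawler and symfony/css-selector are installed). Both methods act on the first node of the selection and throw an exception when the selection is empty, so checking count() first is a reasonable precaution.

    ```php
    <?php
    // Assumption: Composer autoloader with symfony/dom-crawler and symfony/css-selector.
    require __DIR__ . '/vendor/autoload.php';

    use Symfony\Component\DomCrawler\Crawler;

    // Hypothetical markup.
    $html = '<html><body><div id="box"><b>Hello</b> world</div></body></html>';
    $crawler = new Crawler($html);

    $box = $crawler->filter('#box');

    $text  = $box->text(); // text content only, tags stripped
    $inner = $box->html(); // inner HTML of the node, tags kept

    // An empty selection has a count of zero; calling text() on it would throw.
    $missing = $crawler->filter('.does-not-exist')->count();

    echo $text, PHP_EOL, $inner, PHP_EOL, $missing, PHP_EOL;
    ```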
    
