PHP scraping and outputting a specific value or number in a given tag

逝去的感伤 2021-01-28 14:31

So I'm very new to PHP. But with some help, I've figured out how to scrape a site if it has a tag identifier like h1 class=____

And even better, I've figured out how

2 Answers
  • 2021-01-28 14:41

    If you want to use a third-party library, you can use https://github.com/rajanrx/php-scrape:

    <?php
    
    use Scraper\Scrape\Crawler\Types\GeneralCrawler;
    use Scraper\Scrape\Extractor\Types\MultipleRowExtractor;
    
    require_once(__DIR__ . '/../vendor/autoload.php');
    date_default_timezone_set('UTC');
    
    // Create crawler
    $crawler = new GeneralCrawler('https://coinmarketcap.com/');
    
    // Setup configuration
    $configuration = new \Scraper\Structure\Configuration();
    $configuration->setTargetXPath('//table[@id="currencies"]');
    $configuration->setRowXPath('.//tbody/tr');
    $configuration->setFields(
        [
            new \Scraper\Structure\TextField(
                [
                    'name'  => 'Name',
                    'xpath' => './/td[2]/a',
                ]
            ),
            new \Scraper\Structure\TextField(
                [
                    'name'  => 'Market Cap',
                    'xpath' => './/td[3]',
                ]
            ),
            new \Scraper\Structure\RegexField(
                [
                    'name'  => '% Change',
                    'xpath' => './/td[7]',
                    'regex' => '/(.*)%/'
                ]
            ),
        ]
    );
    
    // Extract  data
    $extractor = new MultipleRowExtractor($crawler, $configuration);
    $data = $extractor->extract();
    print_r($data);
    

    This will print the following:

    Array
    (
        [0] => Array
            (
                [Name] => Bitcoin
                [Market Cap] => $42,495,710,233
                [% Change] => -1.09
                [hash] => 76faae07da1d2f8c1209d86301d198b3
            )
    
        [1] => Array
            (
                [Name] => Ethereum
                [Market Cap] => $28,063,517,955
                [% Change] => -8.10
                [hash] => 18ade4435c69b5116acf0909e174b497
            )
    
        [2] => Array
            (
                [Name] => Ripple
                [Market Cap] => $11,483,663,781
                [% Change] => -2.73
                [hash] => 5bf61e4bb969c04d00944536e02d1e70
            )
    
        [3] => Array
            (
                [Name] => Litecoin
                [Market Cap] => $2,263,545,508
                [% Change] => -3.36
                [hash] => ea205770c30ddc9cbf267aa5c003933e
            )
       and so on ...
    

    I hope that helps you :)

    Disclaimer: I am the author of this library.
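
For comparison, the same target / row / field structure can be sketched with PHP's built-in DOM extension instead of the library. This is only an illustration under stated assumptions: the inline `$html` table is made-up sample markup (not the real coinmarketcap.com page), the row is shortened to three cells so the percentage sits in `td[3]` rather than `td[7]`, and `extractRows` is a hypothetical helper name. Whether the library applies the `regex` option exactly this way is also an assumption; here the first capture group is kept, which matches the `-1.09` style values in the output above.

```php
<?php
// Made-up sample markup standing in for the real page (assumption).
$html = '<table id="currencies"><tbody>'
      . '<tr><td>1</td><td><a href="#">Bitcoin</a></td><td>-1.09%</td></tr>'
      . '<tr><td>2</td><td><a href="#">Ethereum</a></td><td>-8.10%</td></tr>'
      . '</tbody></table>';

/**
 * Hypothetical helper: returns one associative array per table row.
 */
function extractRows(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];

    // Target XPath combined with the row XPath.
    foreach ($xpath->query('//table[@id="currencies"]//tbody/tr') as $row) {
        // Field XPaths are evaluated relative to the current row.
        $name = $xpath->query('.//td[2]/a', $row)->item(0);
        $change = $xpath->query('.//td[3]', $row)->item(0);

        // Keep only the text in front of the percent sign.
        $percent = null;
        if ($change !== null && preg_match('/(.*)%/', trim($change->textContent), $m)) {
            $percent = $m[1];
        }

        $rows[] = [
            'Name'     => $name !== null ? trim($name->textContent) : null,
            '% Change' => $percent,
        ];
    }

    return $rows;
}

print_r(extractRows($html));
```

Passing `$row` as the second argument to `query()` is what makes the `.//` expressions relative to each row, so the field lookups never leak into neighbouring rows.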

  • 2021-01-28 14:52

    If you only care about the percentage change, try this and remove the whole foreach section:

    $query = "//tr[@id='id-ethereum']/td[contains(@class, 'percent-24h')]";
    $entries = $xpath->query($query);
    
    echo $entries->item(0)->getAttribute('data-usd'); //-5.15
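
The snippet above relies on `$xpath` already existing from the question's code. As a self-contained sketch, the same lookup could look like the following. The inline `$html` is invented sample markup shaped to match the XPath in this answer (the real page may well differ), and the `-5.15` value is simply copied from the comment above.

```php
<?php
// Invented sample markup shaped like the row the answer queries (assumption).
$html = '<table><tbody>'
      . '<tr id="id-ethereum">'
      . '<td class="no-wrap percent-24h negative_change" data-usd="-5.15">-5.15%</td>'
      . '</tr>'
      . '</tbody></table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // suppress warnings from imperfect HTML
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$entries = $xpath->query("//tr[@id='id-ethereum']/td[contains(@class, 'percent-24h')]");

// item(0) returns null when nothing matched, so check before reading the attribute.
$cell = $entries->item(0);
$change = $cell instanceof DOMElement ? $cell->getAttribute('data-usd') : null;

echo $change === null ? "not found\n" : $change . "\n";
```

The null check matters in practice: if the site renames the row id or the class, calling `getAttribute()` directly on `item(0)` would fail with an error instead of reporting that nothing was found.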
    

    Here are the rest of the columns:

    // $doc is assumed to be the DOMDocument already loaded earlier in your script
    $xpath = new DOMXPath($doc);
    
    $market_cap = $xpath->query("//tr[@id='id-ethereum']/td[contains(@class, 'market-cap')]");
    echo $market_cap->item(0)->getAttribute('data-usd'); //30574084827.1
    
    
    $price = $xpath->query("//tr[@id='id-ethereum']/td/a[contains(@class, 'price')]");
    echo $price->item(0)->getAttribute('data-usd'); //329.567
    
    $circulating_supply = $xpath->query("//tr[@id='id-ethereum']/td/a[@data-supply]");
    echo $circulating_supply->item(0)->getAttribute('data-supply'); //92770467.9991
    
    
    $volume = $xpath->query("//tr[@id='id-ethereum']/td/a[contains(@class, 'volume')]");
    echo $volume->item(0)->getAttribute('data-usd'); //810454000.0
    
    
    $percent_change = $xpath->query("//tr[@id='id-ethereum']/td[contains(@class, 'percent-24h')]");
    echo $percent_change->item(0)->getAttribute('data-usd'); //-3.79
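
Since every lookup above follows the same pattern, the repetition can be folded into a small helper. A minimal sketch, with assumptions: `firstAttribute` is a hypothetical helper name, the sample markup is invented to match the XPath expressions in this answer, and the values are the ones quoted in the comments above.

```php
<?php
// Invented sample markup matching the XPath expressions above (assumption).
$html = '<table><tbody><tr id="id-ethereum">'
      . '<td class="market-cap" data-usd="30574084827.1">$30,574,084,827</td>'
      . '<td><a class="price" data-usd="329.567">$329.57</a></td>'
      . '<td><a data-supply="92770467.9991">92,770,468</a></td>'
      . '<td><a class="volume" data-usd="810454000.0">$810,454,000</a></td>'
      . '<td class="percent-24h" data-usd="-3.79">-3.79%</td>'
      . '</tr></tbody></table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // suppress warnings from imperfect HTML
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

/**
 * Hypothetical helper: attribute of the first node matching $query,
 * or null when the node or the attribute is missing.
 */
function firstAttribute(DOMXPath $xpath, string $query, string $attribute): ?string
{
    $nodes = $xpath->query($query);
    $node = $nodes === false ? null : $nodes->item(0);
    if (!$node instanceof DOMElement || !$node->hasAttribute($attribute)) {
        return null;
    }
    return $node->getAttribute($attribute);
}

// Column name => [XPath query, attribute that holds the raw value].
$row = "//tr[@id='id-ethereum']";
$columns = [
    'market_cap'         => [$row . "/td[contains(@class, 'market-cap')]", 'data-usd'],
    'price'              => [$row . "/td/a[contains(@class, 'price')]", 'data-usd'],
    'circulating_supply' => [$row . "/td/a[@data-supply]", 'data-supply'],
    'volume'             => [$row . "/td/a[contains(@class, 'volume')]", 'data-usd'],
    'percent_change'     => [$row . "/td[contains(@class, 'percent-24h')]", 'data-usd'],
];

$values = [];
foreach ($columns as $name => [$query, $attribute]) {
    $values[$name] = firstAttribute($xpath, $query, $attribute);
}

print_r($values);
```

Note that `contains(@class, 'price')` is a substring test, so it also matches any class name that merely contains `price`. That is harmless for the sample markup here but worth checking against the real page.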
    