How can I screen scrape with Perl?

夕颜 2020-12-13 23:28

I need to display some values that are stored on a website. To do that, I need to scrape the site and fetch the content from a table. Any ideas?

10 answers
  • 2020-12-13 23:47

    Check out this little example of web scraping with Perl: link text

  • 2020-12-13 23:48

    If you are familiar with jQuery, you might want to check out pQuery, which makes this very easy:

    ## print every <h2> tag in page
    use pQuery;
    
    pQuery("http://google.com/search?q=pquery")
        ->find("h2")
        ->each(sub {
            my $i = shift;
            print $i + 1, ") ", pQuery($_)->text, "\n";
        });
    

    There's also HTML::DOM.

    Whatever you do, though, don't use regular expressions for this.
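
    As a rough illustration of the parser-based approach, here is a minimal sketch that reads table rows with HTML::TreeBuilder instead of regular expressions. The sample markup, the "prices" id, and the column names are hypothetical stand-ins for whatever the real page contains.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup standing in for the fetched page.
my $html = <<'HTML';
<table id="prices">
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Apple</td><td>1.20</td></tr>
  <tr><td>Pear</td><td>0.95</td></tr>
</table>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# Collect the trimmed text of every header and data cell, row by row.
my @rows;
for my $tr ($tree->look_down(_tag => 'tr')) {
    my @cells = map { $_->as_trimmed_text }
                $tr->look_down(_tag => qr/^t[dh]$/);
    push @rows, \@cells;
}
$tree->delete;    # free the parse tree once the data is copied out

print join(" | ", @$_), "\n" for @rows;
```

    Because the parser understands the markup, attributes on the tags, line breaks inside cells, and similar variations do not break the extraction the way they would with a hand-written pattern.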

  • 2020-12-13 23:48

    I use LWP::UserAgent for most of my screen scraping needs. You can also couple it with HTTP::Cookies if you need cookie support.

    Here's a simple example of how to fetch a page's source.

    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTTP::Cookies;

    # Attach a cookie jar so cookies persist across requests.
    my $cookie_jar = HTTP::Cookies->new;
    my $browser    = LWP::UserAgent->new;
    $browser->cookie_jar($cookie_jar);

    my $resp = $browser->get("https://www.stackoverflow.com");
    if ($resp->is_success) {
        # Play with your source here
        my $source = $resp->decoded_content;
        # Drop everything before the last <table> tag (/s lets . match newlines).
        $source =~ s/^.*<table>/<table>/is; # this is just an example,
        print $source;                      # not a solution to your problem.
    }
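
    Since the substitution above is only a placeholder, one possible next step is to hand the fetched source to a table-aware module. The sketch below assumes HTML::TableExtract is installed; the helper name table_rows, the sample markup, and the "Item"/"Price" headers are hypothetical. With a real response you would pass $resp->decoded_content instead of the sample string.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical helper: return the data rows of every table whose
# header row contains the given column names.
sub table_rows {
    my ($html, @headers) = @_;
    my $te = HTML::TableExtract->new( headers => \@headers );
    $te->parse($html);
    my @rows;
    for my $table ($te->tables) {
        push @rows, map { [ @$_ ] } $table->rows;
    }
    return @rows;
}

# Hypothetical sample markup standing in for the fetched page source.
my $html = <<'HTML';
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Apple</td><td>1.20</td></tr>
  <tr><td>Pear</td><td>0.95</td></tr>
</table>
HTML

my @rows = table_rows($html, 'Item', 'Price');
printf "%s costs %s\n", @$_ for @rows;
```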
    
  • 2020-12-13 23:56

    You could also use the Perl module Web::Scraper. It is simple to understand and has made life easy for me. See this example for more information:

    http://teusje.wordpress.com/2010/05/02/web-scraping-with-perl/
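
    For a quick feel of the module, here is a minimal sketch (not taken from the linked post) that pulls table cells out of an HTML string with Web::Scraper. The markup, the "prices" id, and the result keys "rows" and "cells" are hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;
use Web::Scraper;

# Hypothetical sample markup standing in for the page to scrape.
my $html = <<'HTML';
<table id="prices">
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Apple</td><td>1.20</td></tr>
  <tr><td>Pear</td><td>0.95</td></tr>
</table>
HTML

# For each <tr> of the table, collect the text of its <td> cells.
my $table_scraper = scraper {
    process 'table#prices tr', 'rows[]' => scraper {
        process 'td', 'cells[]' => 'TEXT';
    };
};

my $result = $table_scraper->scrape($html);

# Keep only rows that actually contain <td> cells (skips the header row).
my @rows;
for my $row (@{ $result->{rows} || [] }) {
    my @cells = @{ $row->{cells} || [] };
    push @rows, \@cells if @cells;
}

print join(" | ", @$_), "\n" for @rows;
```

    The scrape method also accepts a URI object, so the same scraper definition can be pointed at a live page once the selectors are right.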
