lwp

How to parse a webpage

Submitted by 北城余情 on 2020-01-05 05:37:18
Question: I am attempting to extract hourly data from the EnviroCanada weather page, one row per hour in the form Time | Thigh | Tlow | Humidity, e.g. 7:00 | 23 | 22.9 | 30. Extracted HTML page: <tr> <td headers="header1" class="text-center vertical-center"> 7:00 </td> <td headers="header2" class="media vertical-center"><span class="pull-left"><img class="media-object" height="35" width="35" src="/weathericons/small/02.png" /></span><div class="visible-xs visible-sm"> <br /> <br /> …
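A minimal sketch of one way to attack this, assuming an XPath-capable parser such as HTML::TreeBuilder::XPath is available; the module choice, the command-line URL argument, and the idea that every cell of interest sits in the same <tr> are assumptions, not details confirmed by the excerpt above.

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::TreeBuilder::XPath;   # assumed parser choice; any DOM/XPath module would do

my $url = shift @ARGV or die "usage: $0 <hourly forecast URL>\n";

my $ua  = LWP::UserAgent->new;
my $res = $ua->get($url);
die $res->status_line, "\n" unless $res->is_success;

my $tree = HTML::TreeBuilder::XPath->new_from_content($res->decoded_content);

# Each forecast hour is one <tr>; the time cell carries headers="header1",
# as in the extracted markup shown in the question.
for my $row ($tree->findnodes('//tr[td[@headers="header1"]]')) {
    my @cells = map { $_->as_trimmed_text } $row->findnodes('./td');
    print join(' | ', @cells), "\n";
}
$tree->delete;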

POST API in Perl using LWP::UserAgent with authentication

Submitted by 时光总嘲笑我的痴心妄想 on 2019-12-25 00:36:04
Question: I am trying to use the POST method in Perl to send information to an API. I would like to call the API below, which requires the following inputs. URI: https://www.cryptopia.co.nz/api/SubmitTrade. Input parameters: Market: the market symbol of the trade, e.g. 'DOT/BTC' (not required if 'TradePairId' is supplied); TradePairId: the Cryptopia trade pair identifier, e.g. '100' (not required if 'Market' is supplied); Type: the type of trade, e.g. 'Buy' or 'Sell'; Rate: the rate or price to pay for the …
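A hedged sketch of the POST itself with LWP::UserAgent and HTTP::Request, under the assumption that the endpoint accepts a JSON body. The parameter values, the Amount field, and the commented-out Authorization header (the site used a signed header whose construction is not shown in the excerpt) are placeholders, not a confirmed recipe.

use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use JSON::PP;   # core module; encode_json builds the request body

my $ua = LWP::UserAgent->new;

# Illustrative values only -- either Market or TradePairId is required, not both.
my $body = encode_json({
    Market => 'DOT/BTC',
    Type   => 'Buy',
    Rate   => 0.00000034,
    Amount => 100,
});

my $req = HTTP::Request->new(POST => 'https://www.cryptopia.co.nz/api/SubmitTrade');
$req->content_type('application/json');
$req->content($body);
# $req->header(Authorization => $signed_header);   # site-specific auth, not covered here

my $res = $ua->request($req);
if ($res->is_success) { print $res->decoded_content, "\n" }
else                  { die  $res->status_line, "\n" }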

Perl LWP GET or POST to an SNI SSL URL

Submitted by 坚强是说给别人听的谎言 on 2019-12-23 19:01:46
Question: I have a system that sends data to customers using Perl LWP. They can choose their URL and whether to POST or GET. A new customer recently complained that the service doesn't work, and they suspect it's because their endpoint uses SNI SSL. Looking in the logs, all I see is the error message "(certificate verify failed) (500 read timeout)". Is there any way to tell whether this issue is caused by their SNI SSL or by something else? I think I can solve the problem by turning off verify_hostname, …
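One way to narrow it down, sketched under the assumption that LWP::Protocol::https is backed by IO::Socket::SSL (the endpoint URL below is a stand-in): check whether the installed stack can do client-side SNI at all, then rerun the request with SSL debug tracing turned up so a missing-SNI failure looks different from a plain verification failure.

use strict;
use warnings;
use IO::Socket::SSL;
use LWP::UserAgent;

# SNI only works when LWP::Protocol::https sits on IO::Socket::SSL
# (not Net::SSL/Crypt::SSLeay) and the underlying OpenSSL supports it.
printf "IO::Socket::SSL %s, client SNI supported: %s\n",
    IO::Socket::SSL->VERSION,
    (IO::Socket::SSL->can_client_sni ? 'yes' : 'no');

# Verbose handshake tracing; the certificate the server actually presents
# shows up here, which usually settles the SNI question.
$IO::Socket::SSL::DEBUG = 3;

my $ua  = LWP::UserAgent->new;
my $res = $ua->get('https://customer-endpoint.example/');   # hypothetical customer URL
print $res->status_line, "\n";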

Why can't LWP::UserAgent get this site entirely?

Submitted by 非 Y 不嫁゛ on 2019-12-23 17:45:09
Question: It outputs only a few lines from the beginning.

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
my $response = $ua->get('http://www.eurogamer.net/articles/df-hardware-wii-u-graphics-power-finally-revealed');
print $response->decoded_content;

Answer 1: I ran the following modification:

my $response = $ua->get( 'http://www.eurogamer.net/articles/df-hardware-wii-u-graphics-power-finally-revealed' );
say $response->headers->as_string;

And saw this: Cache…
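A sketch of how to see why the body comes back short (nothing here is specific to this site): LWP records mid-transfer failures in the Client-Aborted and X-Died response headers rather than dying, so printing them alongside the normal headers usually points at the culprit.

use strict;
use warnings;
use feature 'say';
use LWP::UserAgent;

my $ua  = LWP::UserAgent->new;
my $res = $ua->get('http://www.eurogamer.net/articles/df-hardware-wii-u-graphics-power-finally-revealed');

say $res->status_line;
say $res->headers->as_string;

# Failures that happen while the body is being read or decoded are noted
# in these headers, so a truncated page usually comes with an explanation.
say 'Client-Aborted: ', $res->header('Client-Aborted') // 'none';
say 'X-Died: ',         $res->header('X-Died')         // 'none';

say 'Bytes received: ', length($res->content);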

How may I bypass LWP's URL encoding for a GET request?

Submitted by 南楼画角 on 2019-12-23 12:28:46
Question: I'm talking to what seems to be a broken HTTP daemon, and I need to make a GET request that includes a pipe | character in the URL. LWP::UserAgent escapes the pipe character before the request is sent. For example, a URL passed in as https://hostname/url/doSomethingScript?ss=1234&activities=Lec1|01 reaches the HTTP daemon as https://hostname/url/doSomethingScript?ss=1234&activities=Lec1%7C01. This is correct behaviour, but it doesn't work with this broken server. How can I override or bypass the …
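One workaround is to drop below LWP for this one request and speak to the server with Net::HTTPS (from the same libwww family), which writes the request line as given and so leaves the pipe alone. A sketch, reusing the hostname and path from the example above:

use strict;
use warnings;
use Net::HTTPS;

my $s = Net::HTTPS->new(Host => 'hostname') or die $@;

# The path goes into the request line verbatim -- the | survives.
$s->write_request(
    GET          => '/url/doSomethingScript?ss=1234&activities=Lec1|01',
    'User-Agent' => 'Mozilla/5.0',
);

my ($code, $mess, %headers) = $s->read_response_headers;
print "$code $mess\n";

while (1) {
    my $buf;
    my $n = $s->read_entity_body($buf, 1024);
    die "read failed: $!" unless defined $n;
    last unless $n;
    print $buf;
}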

Cookies in perl lwp

Submitted by 本小妞迷上赌 on 2019-12-23 03:14:07
Question: I once wrote a simple 'crawler' in Java to download HTTP pages for me. Now I'm trying to rewrite the same thing in Perl, using the LWP module. This is my Java code (which works fine):

String referer = "http://example.com";
String url = "http://example.com/something/cgi-bin/something.cgi";
String params = "a=0&b=1";
HttpState initialState = new HttpState();
HttpClient httpclient = new HttpClient();
httpclient.setState(initialState);
httpclient.getParams().setCookiePolicy(CookiePolicy.NETSCAPE);
…
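A sketch of the usual LWP equivalent, assuming the same two example URLs and form fields: give the UserAgent an HTTP::Cookies jar, let the first GET collect any Set-Cookie headers, and let the follow-up POST send them back automatically.

use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

my $referer = 'http://example.com';
my $url     = 'http://example.com/something/cgi-bin/something.cgi';

# Persistent jar; drop the file/autosave options for an in-memory jar.
my $jar = HTTP::Cookies->new(file => 'cookies.txt', autosave => 1, ignore_discard => 1);
my $ua  = LWP::UserAgent->new(cookie_jar => $jar);

# First request fills the jar ...
$ua->get($referer);

# ... and the POST sends the stored cookies back without extra work.
my $res = $ua->post($url, { a => 0, b => 1 }, Referer => $referer);
if ($res->is_success) { print $res->decoded_content }
else                  { die  $res->status_line, "\n" }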

LWP::UserAgent Can't Post with TLS1.1

Submitted by 青春壹個敷衍的年華 on 2019-12-22 07:08:02
Question: I'm getting a 500 handshake error on port 443 over HTTPS. The host service I am sending XML to does not support TLS 1.2; they do support 1.0 and 1.1. I'm currently using LWP 6.03 on CentOS 6. With the code below, they claim I am still connecting with TLS 1.2.

use LWP::UserAgent;
$ua = LWP::UserAgent->new(ssl_opts => { verify_hostname => 0, SSL_version => 'SSLv23:!TLSv12' });
$req = HTTP::Request->new(GET => 'https://secure-host-server');
$res = $ua->request($req);
if ($res->is_success) { print $res->content; }
…
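A sketch of an alternative worth trying: pin the handshake to TLS 1.1 explicitly instead of excluding 1.2. The 'TLSv1_1' spelling is IO::Socket::SSL's, and whether it takes effect also depends on LWP::Protocol::https actually using IO::Socket::SSL as its backend on this box, which the excerpt does not confirm.

use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new(
    ssl_opts => {
        verify_hostname => 0,
        SSL_version     => 'TLSv1_1',   # pin the version rather than exclude 1.2
    },
);

my $res = $ua->get('https://secure-host-server');
if ($res->is_success) { print $res->content }
else                  { print $res->status_line, "\n" }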

How can I get the ultimate URL without fetching the pages using Perl and LWP?

Submitted by 蓝咒 on 2019-12-22 05:34:19
Question: I'm doing some web scraping using Perl's LWP. I need to process a set of URLs, some of which may redirect (one or more times). How can I get the ultimate URL, with all redirects resolved, using the HEAD method? Answer 1: If you use the fully featured version of LWP::UserAgent, then the response that is returned is an instance of HTTP::Response, which in turn has an HTTP::Request as an attribute. Note that this is NOT necessarily the same HTTP::Request that you created with the original URL in your set of URLs …
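A sketch following the answer's hint: issue a HEAD request, let LWP follow the redirects, and read the final URL back from the HTTP::Request attached to the response. The max_redirect value is arbitrary, and some servers answer HEAD less reliably than GET.

use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new(max_redirect => 7);

for my $url (@ARGV) {
    my $res = $ua->head($url);
    # After the chain has been followed, the request attached to the final
    # response carries the URL that was actually fetched.
    my $final = $res->request->uri;
    printf "%s => %s (%s)\n", $url, $final, $res->status_line;
}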

Scripts broke after upgrading LWP “certificate verify failed”

Submitted by 巧了我就是萌 on 2019-12-19 05:07:08
Question: I have a lot of scripts, most of them based around WWW::Mechanize, that scrape data off miscellaneous hardware accessible via HTTPS. After upgrading most of my Perl installation and its modules, all scripts using https:// broke with "certificate verify failed". This is because newer versions of LWP do a proper check on the certificate and die if something doesn't match. In my case the failed certificate verification is expected due to the circumstances, so I …
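One conventional way out, sketched for a WWW::Mechanize-based script (the device URL is a placeholder): pass ssl_opts through to the underlying LWP::UserAgent and switch verification off for this agent only, or set PERL_LWP_SSL_VERIFY_HOSTNAME before the first HTTPS request, accepting the security trade-off in both cases.

use strict;
use warnings;
use WWW::Mechanize;

# Disable peer/hostname verification for this agent only, since the
# certificate mismatch is expected for these devices.
my $mech = WWW::Mechanize->new(
    ssl_opts => {
        verify_hostname => 0,
        SSL_verify_mode => 0,   # IO::Socket::SSL's SSL_VERIFY_NONE
    },
);

# The environment-variable route also works, if set before the first request:
# $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;

$mech->get('https://192.0.2.1/status');   # placeholder device URL
print $mech->content;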

How can I extract XML of a website and save in a file using Perl's LWP?

Submitted by 北城以北 on 2019-12-14 03:49:38
Question: How can I extract information from a website (http://tv.yahoo.com/listings) and then create an XML file out of it? I want to save it so I can parse it later and display the information using JavaScript. I am quite new to Perl and have no idea how to do this. Answer 1: Of course. The easiest way would be the Web::Scraper module. It lets you define scraper objects that consist of hash key names, XPath expressions that locate elements of interest, and code to extract bits of data from …
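A sketch of the Web::Scraper-plus-XML approach the answer describes. The CSS selectors ('.program', '.title', '.time') are invented placeholders that would have to be replaced with whatever the real listings markup uses, and XML::Simple stands in for any XML writer.

use strict;
use warnings;
use URI;
use Web::Scraper;
use XML::Simple;   # exports XMLout by default

# Placeholder selectors -- inspect the real page and adjust.
my $listings = scraper {
    process '.program', 'shows[]' => scraper {
        process '.title', title => 'TEXT';
        process '.time',  time  => 'TEXT';
    };
};

my $data = $listings->scrape(URI->new('http://tv.yahoo.com/listings'));

# Write the scraped structure out as XML for the JavaScript side to consume.
open my $fh, '>', 'listings.xml' or die "listings.xml: $!";
print {$fh} XMLout($data, RootName => 'listings', NoAttr => 1);
close $fh;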