Question
I have a button on a web page:
<input class="nextbutton" type="submit" name="B1" value="Next 20>>"></input>
Now I want to check whether this button exists on the page using XPath selectors, so that if it exists I can go to the next page and retrieve information from there.
Answer 1:
First, you have to determine what counts as "this button". Given the context, I'd suggest looking for an input with a class of 'nextbutton'. You could check for an element with only one class like this in XPath:
//input[@class='nextbutton']
But that matches only when the class attribute is exactly 'nextbutton', with no other classes. So you could try:
//input[contains(@class, 'nextbutton')]
Though this will also match "nonextbutton" or "nextbuttonbig". So your final answer will probably be:
//input[contains(concat(' ', @class, ' '), ' nextbutton ')]
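To see why the three predicates behave differently, here is a minimal sketch that mimics each one with plain Python string operations. The function names are made up for illustration and are not part of Scrapy or XPath; the code only models the string logic, not a real XPath engine.

```python
def exact_match(class_attr):
    # Mirrors @class='nextbutton': the whole attribute must equal the name.
    return class_attr == "nextbutton"


def substring_match(class_attr):
    # Mirrors contains(@class, 'nextbutton'): any substring hit counts.
    return "nextbutton" in class_attr


def token_match(class_attr):
    # Mirrors contains(concat(' ', @class, ' '), ' nextbutton '):
    # padding with spaces turns the test into a whole-word match.
    return " nextbutton " in " " + class_attr + " "


for value in ["nextbutton", "nextbutton big", "nonextbutton", "nextbuttonbig"]:
    print(value, exact_match(value), substring_match(value), token_match(value))
```

Only the space-padded version accepts "nextbutton big" while rejecting "nonextbutton" and "nextbuttonbig".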
In Scrapy, calling .xpath() returns a SelectorList, which is falsy when it is empty and truthy when it contains at least one match. So you should be able to write something like:
from scrapy.selector import Selector
# html_content is assumed to hold the page's HTML as a string
input_tag = Selector(text=html_content).xpath("//input[contains(concat(' ', @class, ' '), ' nextbutton ')]")
if input_tag:
    print("Yes, I found a 'next' button on the page.")
Answer 2:
According to the Scrapy Selectors documentation, you can run an XPath query and use extract_first() with a default value to check whether an element exists.
Try this:
isExists = response.xpath("//input[@class='nextbutton']").extract_first(default='not-found')
if isExists == 'not-found':
    # the input does not exist
    pass
else:
    # the input exists; crawl the next page
    pass
Answer 3:
http://www.trumed.org/patients-visitors/find-a-doctor loads an iframe with src="http://verify.tmcmed.org/iDirectory/":
<iframe border="0" frameborder="0" id="I1" name="I1"
src="http://verify.tmcmed.org/iDirectory/"
style="width: 920px; height: 600px;" target="I1">
Your browser does not support inline frames or is currently configured not to display inline frames.
</iframe>
The search form is in this iframe.
Here's a scrapy shell session illustrating this:
$ scrapy shell "http://www.trumed.org/patients-visitors/find-a-doctor"
2014-07-10 11:31:05+0200 [scrapy] INFO: Scrapy 0.24.2 started (bot: scrapybot)
2014-07-10 11:31:07+0200 [default] DEBUG: Crawled (200) <GET http://www.trumed.org/patients-visitors/find-a-doctor> (referer: None)
...
In [1]: response.xpath('//iframe/@src').extract()
Out[1]: [u'http://verify.tmcmed.org/iDirectory/']
In [2]: fetch('http://verify.tmcmed.org/iDirectory/')
2014-07-10 11:31:34+0200 [default] DEBUG: Redirecting (302) to <GET http://verify.tmcmed.org/iDirectory/applicationspecific/intropage.asp> from <GET http://verify.tmcmed.org/iDirectory/>
2014-07-10 11:31:35+0200 [default] DEBUG: Redirecting (302) to <GET http://verify.tmcmed.org/iDirectory/applicationspecific/search.asp> from <GET http://verify.tmcmed.org/iDirectory/applicationspecific/intropage.asp>
2014-07-10 11:31:36+0200 [default] DEBUG: Crawled (200) <GET http://verify.tmcmed.org/iDirectory/applicationspecific/search.asp> (referer: None)
...
In [3]: from scrapy.http import FormRequest
In [4]: frq = FormRequest.from_response(response, formdata={'LastName': 'c'})
In [5]: fetch(frq)
2014-07-10 11:32:15+0200 [default] DEBUG: Redirecting (302) to <GET http://verify.tmcmed.org/iDirectory/applicationspecific/SearchStart.asp> from <POST http://verify.tmcmed.org/iDirectory/applicationspecific/search.asp>
2014-07-10 11:32:15+0200 [default] DEBUG: Redirecting (302) to <GET http://verify.tmcmed.org/iDirectory/applicationspecific/searchresults.asp> from <GET http://verify.tmcmed.org/iDirectory/applicationspecific/SearchStart.asp>
2014-07-10 11:32:17+0200 [default] DEBUG: Crawled (200) <GET http://verify.tmcmed.org/iDirectory/applicationspecific/searchresults.asp> (referer: None)
...
In [6]: response.css('input.nextbutton')
Out[6]: [<Selector xpath=u"descendant-or-self::input[@class and contains(concat(' ', normalize-space(@class), ' '), ' nextbutton ')]" data=u'<input type="submit" value=" Next 20 >'>]
In [7]: response.xpath('//input[@class="nextbutton"]')
Out[7]: [<Selector xpath='//input[@class="nextbutton"]' data=u'<input type="submit" value=" Next 20 >'>]
In [8]:
Source: https://stackoverflow.com/questions/24672048/how-to-check-if-a-specific-button-exists-in-scrapy