Extract class name in scrapy

粉色の甜心 2021-01-19 11:41

I am trying to scrape ratings from trustpilot.com.

Is it possible to extract a class name using scrapy? I am trying to scrape a rating which is made up of five indi

3 Answers
  • 2021-01-19 12:00

    I had a similar question. With Scrapy v1.5.1 I could extract element attributes by name using the ::attr() pseudo-element. Here is an example I used on Lowes; the same approach works for the class attribute:

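        # Each product card is an <li>; the fields live in its data-* attributes.
        for product in response.css('ul.product-cards-grid li.product-wrapper'):
            # extract() returns a list of all matches
            prod_href = product.css('li::attr(data-producturl)').extract()
            # extract_first() returns the first match, or None if there is none
            prod_name = product.css('li::attr(data-producttitle)').extract_first()
            prod_img  = product.css('li::attr(data-productimg)').extract_first()
            prod_id   = product.css('li::attr(data-productid)').extract_first()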
    
  • 2021-01-19 12:10

    You could combine a CSS selector with an XPath expression to get the class attribute, then pick out the rating class with a regular expression:

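        import re

        # Select the elements by CSS, then read their class attribute via XPath.
        classes = response.css('.star-rating').xpath("@class").extract()
        for cls in classes:
            # Look for a class token such as "count-4".
            match = re.search(r'\bcount-\d+\b', cls)
            if match:
                print("Class = {}".format(match.group(0)))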
    
  • 2021-01-19 12:13

    You can extract the rating directly with the selector's re() method (or re_first() if you only need the first match):

        for rating in response.xpath('//div[contains(@class, "star-rating")]/@class').re(r'count-(\d+)'):
            print(rating)
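
The regex step used in the last two answers can be tried without a live page. This is only a minimal sketch using the standard library: the sample class strings and the helper name `rating_from_class` are made up for illustration, and the real Trustpilot markup may look different.

```python
import re

# Hypothetical sample values of the class attribute, shaped like the ones
# the answers above match against; the real markup may differ.
SAMPLE_CLASSES = ["star-rating count-4", "star-rating count-5 large", "star-rating"]

# Capture the digits of a "count-N" class token.
COUNT_RE = re.compile(r"\bcount-(\d+)\b")


def rating_from_class(class_attr):
    """Return the number in a count-N class token, or None if there is none."""
    match = COUNT_RE.search(class_attr)
    return int(match.group(1)) if match else None


ratings = [rating_from_class(value) for value in SAMPLE_CLASSES]
print(ratings)  # [4, 5, None]
```

In a spider, the same pattern could be applied to each string returned by the `@class` selection; returning None for elements without a count class avoids failing on unrated items.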
    