Finding multiple attributes within the span tag in Python

感动是毒 2020-12-10 05:46

There are two values that I am looking to scrape from a website. They are present in the following tags:

    <span class="sp starBig">4.1</span>
    <span class="sp starGryB">2.9</span>

2 Answers
  • 2020-12-10 06:02

    There is probably a better way, but it eludes me at present. It can be done with CSS selectors, like this:

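    import bs4

    html = '''<span class="sp starBig">4.1</span>
              <span class="sp starGryB">2.9</span>
              <span class="sp starBig">22</span>'''

    # Name the parser explicitly so the result does not depend on what is installed
    soup = bs4.BeautifulSoup(html, 'html.parser')

    # Collect the matches for each selector in turn
    selectors = ['span.sp.starBig', 'span.sp.starGryB']
    result = []
    for s in selectors:
        result.extend(soup.select(s))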
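
    The loop above collects tag objects; the values themselves are the text inside each tag. A minimal sketch of reading them, assuming the same sample markup and the built-in html.parser. The helper name extract_values is made up for illustration and is not part of Beautiful Soup:

```python
import bs4

HTML = '''<span class="sp starBig">4.1</span>
<span class="sp starGryB">2.9</span>
<span class="sp starBig">22</span>'''


def extract_values(html, selectors):
    """Return the stripped text of every element matched by any selector.

    Hypothetical helper: results are grouped by selector, in the order
    the selectors are given.
    """
    soup = bs4.BeautifulSoup(html, 'html.parser')
    values = []
    for selector in selectors:
        for tag in soup.select(selector):
            # get_text(strip=True) drops surrounding whitespace
            values.append(tag.get_text(strip=True))
    return values


values = extract_values(HTML, ['span.sp.starBig', 'span.sp.starGryB'])
print(values)
```

    The values come back as strings, so convert them with float() if numbers are needed.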
    
  • 2020-12-10 06:19

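    As per the docs (assuming Beautiful Soup 4), matching multiple CSS classes with a string like 'sp starGryB' is brittle and should be avoided. The string is compared against the exact value of the class attribute, so the order of the classes matters: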

    soup.find_all('span', {'class': 'sp starGryB'})
    # [<span class="sp starGryB">2.9</span>]
    soup.find_all('span', {'class': 'starGryB sp'})
    # []
    

    CSS selectors should be used instead, like so:

    soup.select('span.sp.starGryB')
    # [<span class="sp starGryB">2.9</span>]
    soup.select('span.starGryB.sp')
    # [<span class="sp starGryB">2.9</span>]
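
    The two behaviours can be checked side by side. This is only a sketch, assuming Beautiful Soup 4.7 or later (where select is backed by Soup Sieve), the built-in html.parser, and the sample markup from the question:

```python
import bs4

html = '''<span class="sp starBig">4.1</span>
<span class="sp starGryB">2.9</span>'''
soup = bs4.BeautifulSoup(html, 'html.parser')

# A multi-class string is compared against the literal attribute value,
# so it only matches when the classes appear in exactly this order.
exact = soup.find_all('span', class_='sp starGryB')
reordered = soup.find_all('span', class_='starGryB sp')

# A single class name matches any span that carries it,
# whatever other classes are present.
single = soup.find_all('span', class_='starGryB')

# In a CSS selector each .name is an independent class test,
# so the order in which they are written is irrelevant.
css_a = soup.select('span.sp.starGryB')
css_b = soup.select('span.starGryB.sp')

print(len(exact), len(reordered), len(single), len(css_a), len(css_b))
```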
    

    In your case:

    items = soup.select('span.sp.starGryB') + soup.select('span.sp.starBig')
    

    Or, if you would rather keep the selectors in a list:

    items = [i for s in ['span.sp.starGryB', 'span.sp.starBig'] for i in soup.select(s)]
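
    Another option is to join the selectors with a comma, which CSS treats as "match any of these". A rough sketch, assuming Beautiful Soup 4.7 or later, the built-in html.parser, and the three-span sample markup used in the other answer. The ratings list and its (class, value) layout are illustrative choices only:

```python
import bs4

html = '''<span class="sp starBig">4.1</span>
<span class="sp starGryB">2.9</span>
<span class="sp starBig">22</span>'''
soup = bs4.BeautifulSoup(html, 'html.parser')

# One call with a selector group; matches come back in document order
# rather than grouped by selector.
items = soup.select('span.sp.starGryB, span.sp.starBig')

ratings = []
for tag in items:
    # tag['class'] is a list of class names; keep the one that is not 'sp'
    kind = next(c for c in tag['class'] if c != 'sp')
    ratings.append((kind, float(tag.get_text(strip=True))))

print(ratings)
```

    Because the results follow document order, this form is handy when the position of each value on the page matters. The two separate select calls are simpler when the values should stay grouped by class.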
    