BeautifulSoup multiple class selector

青春惊慌失措 2020-12-05 07:20

I want to select all the divs which have BOTH A and B as class attributes.

The following selection, however, matches every div that has EITHER A or B:

soup.findAll('div', class_=['A', 'B'])


        
5 Answers
  • 2020-12-05 07:45
    table = soup.find_all("tr",class_=["odd","even"])
    

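    Try it this way. Check the quotes and brackets carefully; they confused me. Note that a list passed to class_ matches tags carrying any of the listed classes, so this returns rows that are either odd or even, not rows that have both.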
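The difference between "any of these classes" and "all of these classes" can be checked on a small document. The sketch below uses made-up sample markup (not from the question) and compares a list passed to class_ with a chained CSS selector.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this illustration.
html = """
<div class="A B">both</div>
<div class="A">only A</div>
<div class="B">only B</div>
<div class="C">neither</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A list passed to class_ matches a tag that has ANY of the listed classes.
either = soup.find_all("div", class_=["A", "B"])

# A chained CSS selector matches only tags that have ALL of the classes.
both = soup.select("div.A.B")

either_texts = [d.get_text() for d in either]
both_texts = [d.get_text() for d in both]
print(either_texts)  # ['both', 'only A', 'only B']
print(both_texts)    # ['both']
```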

  • 2020-12-05 08:02

    1. Given a tag like:

    <span class="A B C D">XXXX</span>
    

    To select this tag with a CSS selector, chain its class names with dots:

    spans = beautifulsoup.select('span.A.B.C.D')
    

    2. To select by the id attribute instead, given a tag like:

    <span id="A">XXXX</span>
    

    Use # in place of . in the selector passed to select:

    span = beautifulsoup.select('span#A')
    

    The takeaway is that select uses the same syntax as CSS3 selectors.
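As a runnable sketch of both selector forms, assuming a small made-up document and a parsed object named soup (the answer calls it beautifulsoup): the class chain is order-independent, so it also matches a tag whose classes are listed in a different order.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this illustration.
html = """
<span class="A B C D">all four</span>
<span class="D C B A">all four, different order</span>
<span class="A B">only two</span>
<span id="A">id A</span>
"""
soup = BeautifulSoup(html, "html.parser")

# '.' selects by class; chaining requires every listed class to be present.
by_class = soup.select("span.A.B.C.D")

# '#' selects by id.
by_id = soup.select("span#A")

class_texts = [s.get_text() for s in by_class]
id_texts = [s.get_text() for s in by_id]
print(class_texts)  # ['all four', 'all four, different order']
print(id_texts)     # ['id A']
```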

  • 2020-12-05 08:03

    Use CSS selectors instead:

    soup.select('div.A.B')
    
  • 2020-12-05 08:07

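    With recent versions of BeautifulSoup, you can also match the class attribute with a regex. Note that the pattern used here only matches when A is directly followed by B in the attribute value, so class="B A" or class="A C B" would be missed.

    Code: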

    import re
    from bs4 import BeautifulSoup
    
    multipleClassHtml = """
    <div class="A B">only A and B</div>
    <div class="A     B">class contain space</div>
    <div class="A B C D">except A and B contain other class</div>
    <div class="A C D">only A</div>
    <div class="B D">only B</div>
    <div class=" D E F">no A B</div>
    """
    
    soup = BeautifulSoup(multipleClassHtml, 'html.parser')
    
    # Raw string avoids an invalid-escape warning for \s; re.I makes the match case-insensitive.
    bothABClassP = re.compile(r"A\s+B", re.I)
    foundAllAB = soup.find_all("div", attrs={"class": bothABClassP})
    print("foundAllAB=%s" % foundAllAB)
    

    output:

    foundAllAB=[<div class="A B">only A and B</div>, <div class="A    B">class contain space</div>, <div class="A B C D">except A and B contain other class</div>]
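The order sensitivity of this regex can be seen by comparing it with a CSS selector on a small made-up document. This is only a sketch: the sample markup is invented, and the case-insensitive flag is left out to keep the comparison simple.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this illustration.
html = """
<div class="A B">A then B</div>
<div class="B A">B then A</div>
<div class="A C B">A, C, then B</div>
"""
soup = BeautifulSoup(html, "html.parser")

# The regex needs "A", whitespace, then "B" next to each other in the attribute.
pattern = re.compile(r"A\s+B")
regex_hits = soup.find_all("div", attrs={"class": pattern})

# The CSS selector only asks that both classes are present, in any order.
select_hits = soup.select("div.A.B")

regex_texts = [d.get_text() for d in regex_hits]
select_texts = [d.get_text() for d in select_hits]
print(regex_texts)   # does not include 'B then A' or 'A, C, then B'
print(select_texts)  # ['A then B', 'B then A', 'A, C, then B']
```

If the classes can appear in any order or with other classes between them, the CSS selector is the safer choice.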
    


  • 2020-12-05 08:08

    You can use CSS selectors instead, which is probably the best solution here.

    soup.select("div.classname1.classname2")
    

    You could also use a function.

    def interesting_tags(tag):
        if tag.name == "div":
            classes = tag.get("class", [])
            return "A" in classes and "B" in classes
    
    soup.find_all(interesting_tags)
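The function approach can be generalised so the tag name and required classes are not hard-coded. The sketch below is one possible way to do that; has_all_classes is a hypothetical helper name, not part of BeautifulSoup, and the sample markup is made up.

```python
from bs4 import BeautifulSoup


def has_all_classes(name, *required):
    """Build a find_all filter: tags called `name` that carry every class in `required`."""
    wanted = set(required)

    def matcher(tag):
        # tag.get("class", []) is a list of class names, or [] when there is no class attribute.
        return tag.name == name and wanted.issubset(tag.get("class", []))

    return matcher


# Hypothetical sample markup, invented for this illustration.
html = """
<div class="A B">first</div>
<div class="B C A">second</div>
<div class="A">third</div>
<span class="A B">fourth</span>
"""
soup = BeautifulSoup(html, "html.parser")

matches = soup.find_all(has_all_classes("div", "A", "B"))
texts = [tag.get_text() for tag in matches]
print(texts)  # ['first', 'second']
```

A function filter is more verbose than a selector, but it can hold conditions that a CSS selector cannot express.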
    