I want to select all the divs which have BOTH A and B as class attributes.
The following selection matches divs that have EITHER A or B, not only those with both:
soup.findAll('div', class_=['A', 'B'])
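table = soup.find_all("tr", class_=["odd", "even"])
Try it this way, and make sure the quotes and brackets are structured correctly; that is what confused me. Note that passing a list matches rows that have either class, not only rows that have both.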
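A small sketch of the difference between the list form and a CSS selector. The table rows below are made up for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical rows, not taken from the question
html = """
<table>
<tr class="odd"><td>1</td></tr>
<tr class="even"><td>2</td></tr>
<tr class="odd even"><td>3</td></tr>
<tr class="other"><td>4</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# A list of classes matches rows carrying ANY of the listed classes
either = soup.find_all("tr", class_=["odd", "even"])
# A chained CSS selector matches rows carrying ALL of the classes
both = soup.select("tr.odd.even")

either_texts = [tr.get_text(strip=True) for tr in either]
both_texts = [tr.get_text(strip=True) for tr in both]
print(either_texts)
print(both_texts)
```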
1. Given a tag like:
<span class="A B C D">XXXX</span>
to get the tag with a CSS selector, write the class attribute as follows:
spans = beautifulsoup.select('span.A.B.C.D')
2. To match on the id attribute instead, given:
<span id="A">XXXX</span>
change the symbol used in the select call from . to #:
span = beautifulsoup.select('span#A')
The takeaway is that the selector grammar follows CSS3.
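The two cases above can be tried together in a short sketch. The sample markup and the variable names are assumptions for illustration; select returns a list even when only one tag matches:

```python
from bs4 import BeautifulSoup

# Hypothetical markup combining both cases
html = (
    '<span class="A B C D">first</span>'
    '<span id="A">second</span>'
    '<span class="A B">third</span>'
)
soup = BeautifulSoup(html, "html.parser")

# "." chains class names; every listed class must be present
by_class = [s.get_text() for s in soup.select("span.A.B.C.D")]
# "#" selects by id
by_id = [s.get_text() for s in soup.select("span#A")]
# The order of classes in the selector does not matter
any_order = [s.get_text() for s in soup.select("span.D.A")]

print(by_class, by_id, any_order)
```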
Use CSS selectors instead:
soup.select('div.A.B')
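With a recent BeautifulSoup, you can also use a regex to search the class attribute. Note that the pattern below only matches when A comes directly before B in the attribute value.
Code: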
import re
from bs4 import BeautifulSoup
multipleClassHtml = """
<div class="A B">only A and B</div>
<div class="A B">class contain space</div>
<div class="A B C D">except A and B contain other class</div>
<div class="A C D">only A</div>
<div class="B D">only B</div>
<div class=" D E F">no A B</div>
"""
soup = BeautifulSoup(multipleClassHtml, 'html.parser')
# Raw string so that \s is not treated as a string escape
bothABClassP = re.compile(r"A\s+B", re.I)
foundAllAB = soup.find_all("div", attrs={"class": bothABClassP})
print("foundAllAB=%s" % foundAllAB)
output:
foundAllAB=[<div class="A B">only A and B</div>, <div class="A B">class contain space</div>, <div class="A B C D">except A and B contain other class</div>]
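One caveat worth checking for the regex approach is class order. The sketch below uses made-up markup to compare the regex with a CSS selector; it assumes the regex is applied to each class and then to the space-joined class string, which is how BeautifulSoup appears to handle multi-valued attributes:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup with the classes in different positions
html = """
<div class="A B">in order</div>
<div class="B A">reversed</div>
<div class="A C B">separated</div>
<div class="A C">only A</div>
"""
soup = BeautifulSoup(html, "html.parser")

# The regex needs A immediately followed by whitespace and B
by_regex = soup.find_all("div", attrs={"class": re.compile(r"A\s+B")})
# The selector only needs both classes to be present
by_selector = soup.select("div.A.B")

regex_texts = [d.get_text() for d in by_regex]
selector_texts = [d.get_text() for d in by_selector]
print(regex_texts)
print(selector_texts)
```

If the classes can appear in any order, the selector is the safer choice.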
You can use CSS selectors instead, which is probably the best solution here.
soup.select("div.classname1.classname2")
You could also use a function.
def interesting_tags(tag):
    if tag.name == "div":
        classes = tag.get("class", [])
        return "A" in classes and "B" in classes
soup.find_all(interesting_tags)
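A possible generalization of the function approach, as a sketch: with_all_classes is a hypothetical helper (not part of BeautifulSoup) that builds the filter for any tag name and set of classes.

```python
from bs4 import BeautifulSoup

def with_all_classes(name, *required):
    """Hypothetical helper: build a find_all filter for `name` tags
    that carry every class listed in `required`."""
    def matcher(tag):
        classes = tag.get("class", [])
        return tag.name == name and all(c in classes for c in required)
    return matcher

# Made-up markup for illustration
html = """
<div class="A B">both</div>
<div class="B C A">both plus C</div>
<div class="A">only A</div>
<span class="A B">wrong tag</span>
"""
soup = BeautifulSoup(html, "html.parser")

found = soup.find_all(with_all_classes("div", "A", "B"))
found_texts = [d.get_text() for d in found]
print(found_texts)
```

Unlike the regex, this does not depend on the order of the classes in the attribute.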