How to find elements by class
I'm having trouble parsing HTML elements with a "class" attribute using BeautifulSoup.
Finding elements by a single class is easy, but finding elements that have two classes at once (the intersection) is a little more involved.
From the documentation (emphasis added):
If you want to search for tags that match two or more CSS classes, you should use a CSS selector:
css_soup.select("p.strikeout.body")
# [<p class="body strikeout"></p>]
To be clear, this selects only the p tags that have both the strikeout and body classes.
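As a rough illustration of the difference between a single-class search and the two-class selector, here is a minimal sketch. The markup is made up for the example and is not from the question; `select` also assumes a Beautiful Soup version with CSS selector support installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this illustration.
html = """
<p class="body strikeout">both classes</p>
<p class="body">body only</p>
<p class="strikeout">strikeout only</p>
"""
css_soup = BeautifulSoup(html, "html.parser")

# One class: matches every <p> whose class list contains "body".
body_tags = css_soup.find_all("p", class_="body")

# Two classes: the CSS selector requires both, in any order.
both = css_soup.select("p.strikeout.body")

print([p.get_text() for p in body_tags])  # ['both classes', 'body only']
print([p.get_text() for p in both])       # ['both classes']
```

The order of the class names in the selector does not matter, so "p.body.strikeout" should give the same result.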
To find tags that match any one of a set of classes (the union, not the intersection), you can pass a list to the class_
keyword argument (as of 4.1.2):
from bs4 import BeautifulSoup
soup = BeautifulSoup(sdata, "html.parser")  # naming the parser explicitly avoids a warning
class_list = ["stylelistrow"] # can add any other classes to this list.
# will find any divs with any names in class_list:
mydivs = soup.find_all('div', class_=class_list)
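For a self-contained version you can run, here is a small sketch with stand-in data. The markup and the second class name, "stylelistrowalt", are hypothetical and only there to show the "any of these classes" behaviour.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; "stylelistrowalt" is an invented class name.
sdata = """
<div class="stylelistrow">first</div>
<div class="stylelistrowalt">second</div>
<div class="other">third</div>
<span class="stylelistrow">not a div</span>
"""
soup = BeautifulSoup(sdata, "html.parser")

class_list = ["stylelistrow", "stylelistrowalt"]

# Matches divs carrying at least one of the listed classes (a union).
mydivs = soup.find_all("div", class_=class_list)
texts = [d.get_text() for d in mydivs]

# Roughly the same query written as a CSS selector group.
via_select = [d.get_text() for d in soup.select("div.stylelistrow, div.stylelistrowalt")]

print(texts)       # ['first', 'second']
print(via_select)  # ['first', 'second']
```

Class names are compared as whole tokens, so "stylelistrow" does not accidentally match "stylelistrowalt".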
Also note that findAll has been renamed from camelCase to the more Pythonic find_all.