How to find elements by class

有刺的猬 2020-11-22 08:33

I'm having trouble parsing HTML elements with a "class" attribute using BeautifulSoup. The code looks like this:

soup = BeautifulSoup(sdata)
mydivs = soup.fi         


        
17 answers
  • 2020-11-22 08:50

    This worked for me:

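    for div in mydivs:
        # Tags without a class attribute yield an empty list instead of a KeyError.
        classes = div.get("class", [])
        # In bs4, "class" is multi-valued, so test membership rather than equality.
        if "stylelistrow" in classes:
            print(div)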
    
  • 2020-11-22 08:51

    The following should work:

    soup.find('span', attrs={'class':'totalcount'})
    

    Replace 'totalcount' with your class name and 'span' with the tag you are looking for. If the class attribute contains several space-separated names, search for any one of them.

    P.S. This finds only the first element that meets the criteria. To find all matching elements, replace 'find' with 'find_all'.
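
As a rough illustration of the difference between find and find_all, here is a minimal sketch run against a made-up HTML snippet. The markup and the variable names are assumptions for the example, not part of the original question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this illustration.
html = """
<span class="totalcount big">10</span>
<span class="totalcount">20</span>
<span class="other">30</span>
"""
soup = BeautifulSoup(html, "html.parser")

# find returns the first matching tag (or None if nothing matches).
first = soup.find("span", attrs={"class": "totalcount"})

# find_all returns every matching tag, in document order.
all_matches = soup.find_all("span", attrs={"class": "totalcount"})

print(first.get_text())                     # 10
print([s.get_text() for s in all_matches])  # ['10', '20']
```

Note that the first span matches even though its class attribute also contains "big": searching for one of the space-separated names is enough.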

  • 2020-11-22 08:52

    You can refine your search to only find those divs with a given class using BS3:

    mydivs = soup.findAll("div", {"class": "stylelistrow"})
    
  • 2020-11-22 08:52

    CSS selectors

    single class first match

    soup.select_one('.stylelistrow')
    

    list of matches

    soup.select('.stylelistrow')
    

    compound class (i.e. AND another class)

    soup.select_one('.stylelistrow.otherclassname')
    soup.select('.stylelistrow.otherclassname')
    

    In the selector, the spaces that separate class names in the attribute (e.g. class="stylelistrow otherclassname") are replaced with ".". You can keep appending classes this way.

    list of classes (OR: match whichever is present)

    soup.select_one('.stylelistrow, .otherclassname')
    soup.select('.stylelistrow, .otherclassname')
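
To make the AND/OR distinction concrete, the following sketch applies the three selector forms to a small made-up document. The HTML and the texts helper are assumptions introduced only for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this illustration.
html = """
<div class="stylelistrow">A</div>
<div class="stylelistrow otherclassname">B</div>
<div class="otherclassname">C</div>
<div class="unrelated">D</div>
"""
soup = BeautifulSoup(html, "html.parser")


def texts(tags):
    """Return the text of each tag, to make the matches easy to compare."""
    return [t.get_text() for t in tags]


single = texts(soup.select(".stylelistrow"))                   # has this class
both = texts(soup.select(".stylelistrow.otherclassname"))      # has both classes (AND)
either = texts(soup.select(".stylelistrow, .otherclassname"))  # has at least one (OR)

print(single)  # ['A', 'B']
print(both)    # ['B']
print(either)  # ['A', 'B', 'C']
```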
    

    bs4 4.7.1+

    Specific class whose innerText contains a string

    soup.select_one('.stylelistrow:contains("some string")')
    soup.select('.stylelistrow:contains("some string")')
    

    Specific class which has a certain child element e.g. a tag

    soup.select_one('.stylelistrow:has(a)')
    soup.select('.stylelistrow:has(a)')
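
A small sketch of both pseudo-classes on made-up markup follows. One assumption to flag: newer Soup Sieve releases (2.1 and later, as far as I know) deprecate the :contains spelling in favour of :-soup-contains, so the example uses the newer name; on an older install, :contains as written above should be the one that works.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this illustration.
html = """
<div class="stylelistrow">plain text row</div>
<div class="stylelistrow">row with <a href="#">a link</a></div>
<div class="other">plain text, but the wrong class</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Rows of the given class that contain an <a> descendant.
with_link = soup.select(".stylelistrow:has(a)")

# Rows of the given class whose text contains the substring.
# Assumes Soup Sieve 2.1+, where :-soup-contains replaces :contains.
with_text = soup.select('.stylelistrow:-soup-contains("plain text")')

print(len(with_link), with_link[0].a.get_text())  # 1 a link
print(len(with_text), with_text[0].get_text())    # 1 plain text row
```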
    
  • 2020-11-22 08:53

    A straightforward way would be:

    soup = BeautifulSoup(sdata)
    for each_div in soup.findAll('div',{'class':'stylelist'}):
        print(each_div)
    

    Make sure you get the casing of findAll right; it's not findall.
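
The casing matters because, in bs4, an unknown attribute on a soup object is treated as a lookup for a tag with that name. The sketch below, which uses a made-up one-line document, shows what appears to happen when the method name is misspelled: soup.findall evaluates to None, and calling it raises a TypeError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this illustration.
soup = BeautifulSoup('<div class="stylelist">x</div>', "html.parser")

# Unknown attributes are treated as tag names, so this looks for a
# <findall> tag, finds none, and evaluates to None.
lookup = soup.findall

error = None
try:
    soup.findall("div", {"class": "stylelist"})  # calling None
except TypeError as exc:
    error = exc

# The correctly spelled method works as expected.
matches = soup.find_all("div", {"class": "stylelist"})

print(lookup)                 # None
print(type(error).__name__)   # TypeError
print(len(matches))           # 1
```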

  • 2020-11-22 08:53

    How to find elements by class

    I'm having trouble parsing html elements with "class" attribute using Beautifulsoup.

    You can easily find by one class, but if you want to find by the intersection of two classes, it's a little more difficult.

    From the documentation (emphasis added):

    If you want to search for tags that match two or more CSS classes, you should use a CSS selector:

    css_soup.select("p.strikeout.body")
    # [<p class="body strikeout"></p>]
    

    To be clear, this selects only the p tags that are both strikeout and body class.

    To find the union of a set of classes (elements that have any of them, rather than all of them), you can pass a list to the class_ keyword argument (as of 4.1.2):

    soup = BeautifulSoup(sdata)
    class_list = ["stylelistrow"] # can add any other classes to this list.
    # will find any divs with any names in class_list:
    mydivs = soup.find_all('div', class_=class_list) 
    

    Also note that findAll has been renamed from the camelCase to the more Pythonic find_all.
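
For a runnable version of the union search, here is a minimal sketch on made-up markup. The HTML and the second class name in class_list are assumptions for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this illustration.
html = """
<div class="stylelistrow">A</div>
<div class="otherclassname">B</div>
<div class="stylelistrow otherclassname">C</div>
<div class="unrelated">D</div>
<p class="stylelistrow">E</p>
"""
soup = BeautifulSoup(html, "html.parser")

class_list = ["stylelistrow", "otherclassname"]

# Matches divs carrying at least one of the listed classes (a union).
# The <p> is skipped because only div tags are requested.
mydivs = soup.find_all("div", class_=class_list)

print([d.get_text() for d in mydivs])  # ['A', 'B', 'C']
```

The div with both classes appears once, not twice, and results come back in document order.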
