BeautifulSoup multiple class selector


I want to select all the divs which have BOTH A and B as class attributes.

The following selection does not do this; it matches every div that has EITHER A or B:

soup.findAll('div', class_=['A', 'B'])


                      
5 answers

说谎, 2020-12-05 07:45

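table = soup.find_all("tr", class_=["odd", "even"])

Try it this way. Note that a list passed to class_ matches rows that have any of the listed classes, so this returns rows with either odd or even, not only rows with both. Also make sure the quotes and brackets are properly placed; that is what confused me.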
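A list passed to class_ matches a tag that carries any one of the listed classes. The following is a minimal sketch of that behaviour; the table markup and the variable names are invented for illustration and do not come from the original post.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this example.
html = """
<table>
  <tr class="odd"><td>row 1</td></tr>
  <tr class="even"><td>row 2</td></tr>
  <tr class="odd highlight"><td>row 3</td></tr>
  <tr class="header"><td>row 4</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# A list acts as an OR filter: a row matches if it has "odd" OR "even".
rows = soup.find_all("tr", class_=["odd", "even"])
matched = [row.td.get_text() for row in rows]
print(matched)
```

Rows 1 to 3 are returned even though none of them has both classes, so this form is not suitable when both classes are required.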
                                                                        
                                                        
            
            
              
                
执笔经年, 2020-12-05 08:02

1. Given a tag like:

<span class="A B C D">XXXX</span>

to select it with a CSS selector, chain the class names with dots:

spans = beautifulsoup.select('span.A.B.C.D')

2. To match on the id attribute instead, given a tag like:

<span id="A">XXXX</span>

use # in place of the dot in the selector:

span = beautifulsoup.select('span#A')

In short, select() takes CSS selector syntax. Note that it returns a list of matches even when only one tag matches.
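Both selector forms can be tried on a small document. This is only a sketch: the span contents and variable names are assumptions made for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this example.
html = """
<span class="A B C D">first</span>
<span class="D C B A">second</span>
<span class="A B">third</span>
<span id="A">fourth</span>
"""

soup = BeautifulSoup(html, "html.parser")

# Chained classes: the tag must carry all four, in any order.
all_four = [tag.get_text() for tag in soup.select("span.A.B.C.D")]

# select_one() returns the first match (or None) instead of a list.
by_id = soup.select_one("span#A")
by_id_text = by_id.get_text()

print(all_four, by_id_text)
```

The third span is skipped because it lacks C and D, and the id selector does not match any of the spans that merely have a class named A.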
                                                                        
                                                        
            
            
              
                
清歌不尽, 2020-12-05 08:03

Use CSS selectors instead:

soup.select('div.A.B')
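To see why the selector is the safer choice, it can be compared with passing the string "A B" to class_, which matches the class attribute as an exact string. A small sketch, with markup invented for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this example.
html = """
<div class="A B">one</div>
<div class="B A">two</div>
<div class="A B C">three</div>
<div class="A">four</div>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS selector: both classes required, order and extra classes ignored.
css_hits = [div.get_text() for div in soup.select("div.A.B")]

# Exact string: only a class attribute equal to "A B" matches.
string_hits = [div.get_text() for div in soup.find_all("div", class_="A B")]

print(css_hits)
print(string_hits)
```

The selector finds the first three divs, while the exact-string form finds only the first.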

                                                                        
                                                        
            
            
              
                
佛祖请我去吃肉, 2020-12-05 08:07

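With recent versions of BeautifulSoup you can also match the class attribute with a regex. Note that the pattern below only matches when A appears immediately before B in the attribute, so it misses class="B A" and class="A C B".

Code: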

import re
from bs4 import BeautifulSoup

multipleClassHtml = """
<div class="A B">only A and B</div>
<div class="A     B">class contain space</div>
<div class="A B C D">except A and B contain other class</div>
<div class="A C D">only A</div>
<div class="B D">only B</div>
<div class=" D E F">no A B</div>
"""

soup = BeautifulSoup(multipleClassHtml, 'html.parser')

bothABClassP = re.compile(r"A\s+B", re.I)
foundAllAB = soup.find_all("div", attrs={"class": bothABClassP})
print("foundAllAB=%s" % foundAllAB)


output:

foundAllAB=[<div class="A B">only A and B</div>, <div class="A B">class contain space</div>, <div class="A B C D">except A and B contain other class</div>]
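One caveat with the regex approach is that it depends on the order of the class names. The sketch below, using made-up markup, compares the same kind of pattern with a CSS selector on divs whose classes are reordered or separated by another class.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this example.
html = """
<div class="A B">adjacent, A first</div>
<div class="B A">adjacent, B first</div>
<div class="A C B">C in between</div>
"""

soup = BeautifulSoup(html, "html.parser")

# The regex needs "A", whitespace, then "B" in that order.
pattern = re.compile(r"A\s+B")
regex_hits = [d.get_text() for d in soup.find_all("div", attrs={"class": pattern})]

# The CSS selector only asks that both classes be present.
css_hits = [d.get_text() for d in soup.select("div.A.B")]

print(regex_hits)
print(css_hits)
```

Only the first div matches the regex, while the selector matches all three, so the regex is best kept for cases where the class order is known.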



                                                                        
                                                        
            
            
              
                
猫巷女王i, 2020-12-05 08:08

You can use CSS selectors instead, which is probably the best solution here.

soup.select("div.classname1.classname2")


You could also use a function.

def interesting_tags(tag):
    if tag.name == "div":
        classes = tag.get("class", [])
        return "A" in classes and "B" in classes

soup.find_all(interesting_tags)
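The function approach can be generalised to any tag name and any set of classes. The helper below, has_all_classes, is a hypothetical name introduced for this sketch and is not part of BeautifulSoup; the markup is also invented.

```python
from bs4 import BeautifulSoup


def has_all_classes(name, *required):
    """Build a find_all() filter (hypothetical helper, not a bs4 API).

    The returned function accepts tags called `name` that carry
    every class listed in `required`.
    """
    wanted = set(required)

    def matcher(tag):
        # tag.get("class", []) is a list of class names, or [] if absent.
        return tag.name == name and wanted.issubset(tag.get("class", []))

    return matcher


# Hypothetical markup, made up for this example.
html = """
<div class="A B">one</div>
<div class="B C A">two</div>
<div class="A">three</div>
<span class="A B">four</span>
<div>five</div>
"""

soup = BeautifulSoup(html, "html.parser")
hits = [tag.get_text() for tag in soup.find_all(has_all_classes("div", "A", "B"))]
print(hits)
```

A filter function is more verbose than a selector, but it is handy when the condition goes beyond what a CSS selector can express.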

                                                                        
                                                        
            
            
              
                