Beautiful Soup: 'ResultSet' object has no attribute 'find_all'?

I am trying to scrape a simple table using Beautiful Soup. Here is my code:

import requests
from bs4 import BeautifulSoup

url = 'https://gist.githubusercon


                      
3 Answers

北恋, 2020-11-22 06:04

Iterate over table and call row.find_all('td') on each element:

for row in table:
    col = row.find_all('td')
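
Iterating works because each member of the ResultSet is an ordinary Tag. A minimal sketch, assuming a made-up inline HTML snippet with two matching tables (the markup and the names HTML, tables and cells_per_table are illustrative, not taken from the question):

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page in the question.
HTML = """
<table class="dataframe"><tr><td>a</td><td>b</td></tr></table>
<table class="dataframe"><tr><td>c</td></tr></table>
"""

soup = BeautifulSoup(HTML, "html.parser")
tables = soup.find_all(class_="dataframe")  # list-like ResultSet of Tags

cells_per_table = []
for table in tables:
    # Each element is a Tag, so find_all is available here.
    cells = table.find_all("td")
    cells_per_table.append([td.get_text() for td in cells])

print(cells_per_table)
```

Looping like this also copes with pages that contain more than one matching table.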

                                                                        
                                                        
            
            
              
                
忘了有多久, 2020-11-22 06:10

The table variable holds a ResultSet, which behaves like a list of Tag objects. You need to call find_all on its members (even though you know it contains only one), not on the ResultSet itself.

>>> type(table)
<class 'bs4.element.ResultSet'>
>>> type(table[0])
<class 'bs4.element.Tag'>
>>> len(table[0].find_all('tr'))
6
>>>
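
The same check can be reproduced offline. A minimal sketch, assuming a small inline table in place of the asker's page (the markup and variable names are hypothetical) and the built-in html.parser:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page in the question.
HTML = """
<table class="dataframe">
  <tr><th>name</th><th>value</th></tr>
  <tr><td>a</td><td>1</td></tr>
  <tr><td>b</td><td>2</td></tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")
tables = soup.find_all(class_="dataframe")  # ResultSet, not a Tag

# Calling find_all on the ResultSet itself raises AttributeError.
try:
    tables.find_all("tr")
    resultset_error = None
except AttributeError as exc:
    resultset_error = exc

first_table = tables[0]  # a single Tag
row_count = len(first_table.find_all("tr"))
print(type(tables).__name__, type(first_table).__name__, row_count)
```

Indexing with tables[0] assumes at least one match; an empty ResultSet would raise IndexError instead.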

                                                                        
                                                        
            
            
              
                
攒了一身酷, 2020-11-22 06:10

table = soup.find_all(class_='dataframe')



This gives you a ResultSet: every element that matches the class. You can either iterate over it or, if you know there is only one element with class dataframe, use find instead. From your code it seems the latter is what you need to deal with the immediate problem:

table = soup.find(class_='dataframe')
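
One thing to watch with find is that it returns None when nothing matches, so a guard may be worth adding. A rough sketch, assuming a one-cell inline table (the markup and the class name no-such-class are made up for illustration):

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page is not reproduced here.
HTML = '<div><table class="dataframe"><tr><td>x</td></tr></table></div>'
soup = BeautifulSoup(HTML, "html.parser")

table = soup.find(class_="dataframe")        # first match: a Tag
missing = soup.find(class_="no-such-class")  # no match: None

# Guard against None before calling Tag methods.
if table is not None:
    cells = [td.get_text() for td in table.find_all("td")]
else:
    cells = []

print(table.name, missing, cells)
```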


However, that is not all:

for row in table.find_all('tr'):
    col = table.find_all('td')


You probably want to iterate over the tds in the row here, rather than the whole table. (Otherwise every iteration fetches the same list of all cells in the table, so you'll just see the first row over and over.)

for row in table.find_all('tr'):
    for col in row.find_all('td'):
        # col is a single cell of the current row
        print(col.get_text(strip=True))
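
Putting the two fixes together, one possible way to collect the table as a list of rows is sketched below. The helper name table_rows, its css_class parameter and the inline HTML are assumptions for illustration, not part of the original code:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page in the question.
HTML = """
<table class="dataframe">
  <tr><th>name</th><th>value</th></tr>
  <tr><td>a</td><td>1</td></tr>
  <tr><td>b</td><td>2</td></tr>
</table>
"""


def table_rows(html, css_class="dataframe"):
    """Return the td text of each row in the first matching element."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find(class_=css_class)  # single Tag, or None
    if table is None:
        return []
    rows = []
    for row in table.find_all("tr"):
        # Search within the current row, not the whole table.
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if cells:  # rows holding only th cells yield no td text
            rows.append(cells)
    return rows


rows = table_rows(HTML)
print(rows)
```

Skipping rows without td cells is a design choice here; header text could be gathered separately from the th elements if it is needed.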

                                                                        
                                                        
            
            
              
                