I have a pandas dataframe containing a record of lightning strikes with timestamps and global positions in the following format:
Index Date Time
This is one of those problems that seems easy at first, but the more you think about it the more your head melts! It is essentially a three-dimensional (Lat, Lon, Time) clustering problem, followed by filtering based on cluster size. There are a number of questions a little like this (though more abstract), and the responses tend to involve scipy. Check out this one. I would also look at fuzzy c-means clustering; here is the skfuzzy example.
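As a very rough sketch of what a scipy-based clustering could look like (the column names 'Lat', 'Lon', 'Seconds', the scaling factors, and the minimum cluster size are all assumptions here, not something from your data):

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster

# Toy stand-in for the real dataframe (column names are assumptions).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'Lat': rng.uniform(-10, 10, 200),
    'Lon': rng.uniform(-10, 10, 200),
    'Seconds': rng.uniform(0, 86400, 200),  # time as seconds since midnight
})

# Scale each dimension so that one "unit" is roughly the separation at which
# strikes should no longer be grouped together (the divisors are guesses).
features = np.column_stack([
    df['Lat'] / 0.5,         # ~0.5 degrees latitude
    df['Lon'] / 0.5,         # ~0.5 degrees longitude
    df['Seconds'] / 7200.0,  # ~2 hours
])

# Single-linkage hierarchical clustering, cut at distance 1.0,
# then keep only clusters above a minimum size.
Z = linkage(features, method='single')
df['cluster'] = fcluster(Z, t=1.0, criterion='distance')
counts = df['cluster'].value_counts()
big = df[df['cluster'].isin(counts[counts >= 5].index)]
print(big['cluster'].value_counts())
```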
In your case, though, the geodesic distance might be key, in which case you might not want to skip computing actual distances; the more mathematical examples tend to gloss over that.
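If you do need real distances, a haversine (great-circle) function on the raw coordinates is a reasonable starting point. A minimal sketch, assuming degrees in and kilometres out, and ignoring the ellipsoid:

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))  # mean Earth radius ~6371 km

print(haversine_km(51.5, -0.1, 48.9, 2.35))  # London to Paris, roughly 340 km
```

Because it is all numpy, the same function works on whole columns at once, so you can vectorise the pairwise checks rather than looping over rows.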
If accuracy is not critical, there may be simpler ways of doing it, such as creating arbitrary time 'bins' using pd.cut or similar. There is a trade-off between speed and accuracy when choosing the bin size. For instance, if you cut into bins of t/4 (1800 seconds) and treat a gap of 4 bins as being far apart in time, the actual time difference could be anywhere from 5401 to 8999 seconds. An example of cutting. Applying something similar to the lon and lat co-ordinates, and doing the calculations on the approximate values, will be faster; a sketch of this follows below.
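Here is what that binning idea might look like with pd.cut. The 1800-second time bins and 0.5-degree spatial cells are arbitrary choices, and the column names are again assumptions:

```python
import numpy as np
import pandas as pd

# Toy data again; real column names and ranges may differ.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    'Lat': rng.uniform(-10, 10, 200),
    'Lon': rng.uniform(-10, 10, 200),
    'Seconds': rng.uniform(0, 86400, 200),
})

# Bin time into 1800 s slices and space into 0.5-degree cells.
df['t_bin'] = pd.cut(df['Seconds'],
                     bins=np.arange(0, 86400 + 1800, 1800),
                     labels=False, include_lowest=True)
df['lat_bin'] = pd.cut(df['Lat'], bins=np.arange(-10.5, 11.0, 0.5), labels=False)
df['lon_bin'] = pd.cut(df['Lon'], bins=np.arange(-10.5, 11.0, 0.5), labels=False)

# Strikes sharing a (time, lat, lon) bin are treated as one event; time bins
# 4 or more apart count as "far apart" in time under this approximation.
counts = df.groupby(['t_bin', 'lat_bin', 'lon_bin']).size()
print(counts[counts > 1])
```

Grouping on integer bin codes like this is much cheaper than computing every pairwise distance, at the cost of the edge effects described above.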
Hope that helps.