Conditionally replace values in one list using another list of different length and ranges, based on percentage overlap, in Python


予麋鹿 2021-01-14 17:47

One text file 'Truth' contains the following values:

0.000000    3.810000    Three
3.810000    3.910923    NNNN
3.910923    5.429000    AAAA
5.429000


      
      
        
3 answers

伪装坚强ぢ (original poster) 2021-01-14 18:27

This assumes that the ranges never overlap, that they are ordered, and that the smaller ranges in test always fit fully inside the larger ranges of truth.

You can perform a merge similar to the merge step in merge sort. Here's a code snippet that should do what you want:

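def in_range(truth_item, test_item):
    # True when the test range lies entirely inside the truth range.
    return truth_item[0] <= test_item[0] and truth_item[1] >= test_item[1]


def update_test_items(truth_items, test_items):
    # Both lists must be sorted by start; each item is a mutable [start, end, value].
    current_truth_index = 0
    for test_item in test_items:
        # Advance through truth until a range contains this test item.
        while (current_truth_index < len(truth_items)
               and not in_range(truth_items[current_truth_index], test_item)):
            current_truth_index += 1
        if current_truth_index >= len(truth_items):
            return  # ran out of truth ranges (also covers an empty truth list)

        test_item[2] = truth_items[current_truth_index][2]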


update_test_items(truth, test)


Calling update_test_items modifies test in place, copying in the appropriate values from truth.
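
As a rough illustration of the call, here is a self-contained sketch. The truth rows are the ones shown in the question; the test rows and the "?" placeholder are made up for the example, and the functions are restated so the snippet runs on its own. Items have to be mutable lists (not tuples), because the value at index 2 is overwritten.

```python
def in_range(truth_item, test_item):
    return truth_item[0] <= test_item[0] and truth_item[1] >= test_item[1]


def update_test_items(truth_items, test_items):
    current_truth_index = 0
    for test_item in test_items:
        # Skip truth ranges that do not contain the current test range.
        while (current_truth_index < len(truth_items)
               and not in_range(truth_items[current_truth_index], test_item)):
            current_truth_index += 1
        if current_truth_index >= len(truth_items):
            return
        test_item[2] = truth_items[current_truth_index][2]


# Truth rows come from the question; the test rows are hypothetical.
truth = [
    [0.0, 3.81, "Three"],
    [3.81, 3.910923, "NNNN"],
    [3.910923, 5.429, "AAAA"],
]
test = [
    [0.0, 1.5, "?"],
    [1.5, 3.81, "?"],
    [3.81, 3.9, "?"],
    [4.0, 5.0, "?"],
]

update_test_items(truth, test)
labels = [item[2] for item in test]
print(labels)  # ['Three', 'Three', 'NNNN', 'AAAA']
```

The truth index only ever moves forward, which is why both lists need to be sorted by start value.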

You can also set a condition for the update if you like, say 80% coverage, and leave the value unchanged if it isn't met.

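def has_enough_coverage(truth_item, test_item):
    # Fraction of the truth range's length that the test range spans.
    truth_item_size = truth_item[1] - truth_item[0]
    test_item_size = test_item[1] - test_item[0]
    return test_item_size / truth_item_size >= .8


def in_range(truth_item, test_item):
    # True when the test range lies entirely inside the truth range.
    return truth_item[0] <= test_item[0] and truth_item[1] >= test_item[1]


def update_test_items(truth_items, test_items):
    # Both lists must be sorted by start; each item is a mutable [start, end, value].
    current_truth_index = 0
    for test_item in test_items:
        # Advance through truth until a range contains this test item.
        while (current_truth_index < len(truth_items)
               and not in_range(truth_items[current_truth_index], test_item)):
            current_truth_index += 1
        if current_truth_index >= len(truth_items):
            return  # ran out of truth ranges (also covers an empty truth list)

        # Only copy the value when the test range covers enough of the truth range.
        if has_enough_coverage(truth_items[current_truth_index], test_item):
            test_item[2] = truth_items[current_truth_index][2]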


update_test_items(truth, test)


This will only update the test item if it covers 80%+ of the truth range.
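
To make the 80% rule concrete, here is a small sketch of the coverage check on its own. The ranges are invented for illustration, and the threshold parameter is an addition that is not in the snippet above (it defaults to the same 0.8).

```python
def has_enough_coverage(truth_item, test_item, threshold=0.8):
    # threshold is a hypothetical parameter added for this example.
    truth_item_size = truth_item[1] - truth_item[0]
    test_item_size = test_item[1] - test_item[0]
    return test_item_size / truth_item_size >= threshold


truth_item = [0.0, 10.0, "A"]   # length 10.0
wide = [0.5, 9.5, "x"]          # length 9.0, ratio 0.9
narrow = [2.0, 4.0, "y"]        # length 2.0, ratio 0.2

print(has_enough_coverage(truth_item, wide))    # True
print(has_enough_coverage(truth_item, narrow))  # False
```

Because the test range is assumed to sit fully inside the truth range, the ratio of the two lengths is the same as the fraction of the truth range that is covered.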

Note that these will only work if the initial assumptions hold; otherwise you'll run into issues. This approach is also efficient: it runs in O(N) time in the combined length of the two lists, since each list is traversed only once.
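
If you are unsure whether your data meets those assumptions, a check along these lines could be run first. This is only a sketch: check_assumptions is a hypothetical helper, not part of the code above, and its containment test is a simple quadratic scan rather than a linear one.

```python
def check_assumptions(truth_items, test_items):
    """Return a list of problems found; an empty list means the assumptions hold."""
    problems = []
    for name, items in (("truth", truth_items), ("test", test_items)):
        for prev, cur in zip(items, items[1:]):
            # Ordered and non-overlapping: each range starts at or after the previous end.
            if cur[0] < prev[1]:
                problems.append(f"{name}: {cur[:2]} starts before {prev[:2]} ends")
    for test_item in test_items:
        # Containment: some truth range must fully enclose the test range.
        if not any(t[0] <= test_item[0] and t[1] >= test_item[1] for t in truth_items):
            problems.append(f"test: {test_item[:2]} is not inside any truth range")
    return problems


# Made-up data for illustration.
truth = [[0.0, 5.0, "A"], [5.0, 10.0, "B"]]
good_test = [[0.0, 2.0, "x"], [6.0, 7.0, "y"]]
straddling_test = [[4.0, 6.0, "x"]]  # crosses the boundary at 5.0

print(check_assumptions(truth, good_test))        # []
print(check_assumptions(truth, straddling_test))  # one containment problem
```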
    
             
                                                        
            
            
              
                