Find dictionary items whose key matches a substring


伪装坚强ぢ 2020-12-07 20:28

I have a large dictionary constructed like so:

programs['New York'] = 'some values...'
programs['Port Authority of New York'] = 'some values...'
pr


      
      
        
5 answers

囚心锁ツ (original poster) 2020-12-07 21:10
                        
You could generate all substrings ahead of time, and map them to their respective keys.

# Generates all substrings of s (a substring that occurs twice is yielded twice).
def genSubstrings(s):
    for start in range(len(s)):
        # yield every substring that begins at position start
        for end in range(start + 1, len(s) + 1):
            yield s[start:end]

keys = ["New York", "Port Authority of New York", "New York City"]
substrings = {}
for key in keys:
    # set() drops repeated substrings (e.g. "o"), so each key is recorded only once
    for substring in set(genSubstrings(key)):
        substrings.setdefault(substring, []).append(key)


Then you can query substrings to get the keys that contain that substring:

>>> substrings["New York"]
['New York', 'Port Authority of New York', 'New York City']
>>> substrings["of New York"]
['Port Authority of New York']
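
To get back to the original question (the items, not only the keys), the index can be combined with the programs dictionary. The following is a minimal sketch, not the answer's exact code: build_substring_index and find_items are hypothetical helper names, and the three sample entries are illustrative values.

```python
from collections import defaultdict

def build_substring_index(keys):
    """Map every substring of every key to the list of keys that contain it."""
    index = defaultdict(list)
    for key in keys:
        # A set removes repeated substrings such as "o", so each key is listed once.
        subs = {key[i:j] for i in range(len(key)) for j in range(i + 1, len(key) + 1)}
        for sub in subs:
            index[sub].append(key)
    return index

def find_items(programs, index, fragment):
    """Return the (key, value) pairs of programs whose key contains fragment."""
    # index.get avoids creating an empty entry when the fragment is unknown.
    return [(key, programs[key]) for key in index.get(fragment, [])]

# Illustrative data; the real dictionary is assumed to be much larger.
programs = {
    "New York": "some values...",
    "Port Authority of New York": "some values...",
    "New York City": "some values...",
}
index = build_substring_index(programs)
print(find_items(programs, index, "of New York"))
```

A fragment that occurs in no key simply returns an empty list. The index has to be rebuilt or updated whenever a key is added to or removed from programs.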


Pros:


Getting keys by substring is a single dictionary lookup, so it is as fast as accessing a dictionary.


Cons:


Building substrings is a one-time cost at the start of your program, taking time proportional to the number of keys in programs (and quadratic in the length of each key).
substrings grows approximately linearly with the number of keys in programs, which increases the memory usage of your script.
genSubstrings is O(n^2) in the length n of a key, because a key of n characters has n(n+1)/2 substrings. For example, "Port Authority of New York" (26 characters) generates 351 substrings.
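
To make the trade-off concrete, the sketch below checks the substring count for one key and shows the plain linear scan that the index replaces. The scan is a different, simpler technique, not part of the approach above. count_substrings and scan_items are hypothetical names, and the sample data is illustrative.

```python
def count_substrings(key):
    """Number of (start, end) slices of key, counting repeats: n(n+1)/2."""
    n = len(key)
    return n * (n + 1) // 2

def scan_items(programs, fragment):
    """Baseline without an index: test every key, O(number of keys) per query."""
    return {key: value for key, value in programs.items() if fragment in key}

# Illustrative data only.
programs = {
    "New York": "some values...",
    "Port Authority of New York": "some values...",
    "New York City": "some values...",
}

print(count_substrings("Port Authority of New York"))  # 351
print(sorted(scan_items(programs, "New York")))
```

The scan needs no extra memory and no setup, but it visits every key on each query. The precomputed index is likely to pay off only when there are many queries against a dictionary that rarely changes.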

    
             
                                                        
            
            
              
                