How can I optimize this Python code to generate all words with word-distance 1?

前端未结

关注

 12  858

予麋鹿 2021-01-30 22:11

Profiling shows this is the slowest segment of my code for a little word game I wrote:

def distance(word1, word2):
    difference = 0
    for i in range(len(word


      
      
        
          12条回答        

        
                    
            
            
                         
                
              
              
                
                   时光取名叫无心
                                             
                
                
                (楼主)
            
              
              
                2021-01-30 22:33
              

            
            
                        
How often is the distance function called with the same arguments? A simple to implement optimization would be to use memoization. 

You could probably also create some sort of dictionary with frozensets of letters and lists of words that differ by one and look up values in that. This datastructure could either be stored and loaded through pickle or generated from scratch at startup.

Short circuiting the evaluation will only give you gains if the words you are using are very long, since the hamming distance algorithm you're using is basically O(n) where n is the word length. 

I did some experiments with timeit for some alternative approaches that may be illustrative.

Timeit Results

Your Solution

d = """\
def distance(word1, word2):
    difference = 0
    for i in range(len(word1)):
        if word1[i] != word2[i]:
            difference += 1
    return difference
"""
t1 = timeit.Timer('distance("hello", "belko")', d)
print t1.timeit() # prints 6.502113536776391


One Liner

d = """\
from itertools import izip
def hamdist(s1, s2):
    return sum(ch1 != ch2 for ch1, ch2 in izip(s1,s2))
"""
t2 = timeit.Timer('hamdist("hello", "belko")', d)
print t2.timeit() # prints 10.985101179


Shortcut Evaluation

d = """\
def distance_is_one(word1, word2):
    diff = 0
    for i in xrange(len(word1)):
        if word1[i] != word2[i]:
            diff += 1
        if diff > 1:
            return False
    return diff == 1
"""
t3 = timeit.Timer('hamdist("hello", "belko")', d)
print t2.timeit() # prints 6.63337

    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它12个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复