Profiling shows this is the slowest segment of my code for a little word game I wrote:
def distance(word1, word2):
difference = 0
for i in range(len(word
First thing to occur to me:
from operator import ne
def distance(word1, word2):
return sum(map(ne, word1, word2))
which has a decent chance of going faster than other functions people have posted, because it has no interpreted loops, just calls to Python primitives. And it's short enough that you could reasonably inline it into the caller.
For your higher-level problem, I'd look into the data structures developed for similarity search in metric spaces, e.g. this paper or this book, neither of which I've read (they came up in a search for a paper I have read but can't remember).