Efficiently build a graph of words with given Hamming distance

情话喂你 2021-02-07 04:08

I want to build a graph from a list of words with Hamming distance of (say) 1; put differently, two words are connected if they differ in only one letter (lo

4 answers
  •  暗喜 (original poster)
    2021-02-07 04:35

    There's no need to take a dependency on the alphabet size. Given a word bot, for example, insert it into a dictionary of word lists under the keys ?ot, b?t, bo?. Then, for each word list, connect all pairs.

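    import collections

    # Map each wildcard pattern (one letter replaced by '?') to the words that match it.
    d = collections.defaultdict(list)
    with open('/usr/share/dict/words') as f:
        for line in f:
            for word in line.split():
                if len(word) == 6:  # this example only looks at six-letter words
                    for i in range(len(word)):
                        d[word[:i] + '?' + word[i + 1:]].append(word)
    # Words that share a pattern differ in at most that one position.
    # word1 < word2 keeps each unordered pair once and skips identical words.
    pairs = [(word1, word2) for s in d.values() for word1 in s for word2 in s if word1 < word2]
    print(len(pairs))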
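
If you want the graph itself rather than a list of pairs, the same bucketing idea can be turned into an adjacency map. The following is only a minimal sketch: the function name `build_hamming_graph` and the sample word list are made up for illustration, and it assumes `?` never occurs inside a word.

```python
from collections import defaultdict
from itertools import combinations


def build_hamming_graph(words):
    """Return {word: set of words at Hamming distance 1} (hypothetical helper)."""
    # Bucket every word under each of its one-letter-masked patterns.
    # Assumption: '?' does not appear in any input word.
    buckets = defaultdict(set)
    for word in words:
        for i in range(len(word)):
            buckets[word[:i] + "?" + word[i + 1:]].add(word)

    # Every word gets a node, even if it ends up with no neighbours.
    graph = {word: set() for word in words}
    for bucket in buckets.values():
        # Distinct words in the same bucket differ in exactly one position.
        for a, b in combinations(bucket, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


graph = build_hamming_graph(["bot", "bat", "cat", "cot", "dog"])
```

Because a pattern keeps the word's length, words of different lengths never land in the same bucket, so the length filter is optional here. The cost is driven by the bucket sizes, not by the alphabet.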
    
