I'm working through Allen Downey's How To Think Like A Computer Scientist, and I've written what I believe to be a functionally correct solution to Exercise 10.1.
An important thing is your index function: it runs more often than any other function in the program. If you don't need the index of the found word, why define a function to find that index?
    if word1word2 in lst:
is enough instead of
    if index(lst, word1word2):
and the same goes for
    if index(lst, word2word1):
OK, bisection really does work faster than the in operator, because in on a list is a linear scan. To improve the speed a bit more, I suggest using the bisect_left function directly in your interlockings function.
For example, instead of:
    if index(lst, word1word2):  # check to see if word1word2 is actually a word
        total += 1
        print("Word 1: %s, Word 2: %s, Interlock: %s" % (word1, word2, word1word2))
Use:
    q = bisect_left(lst, word1word2)
    if q != len(lst) and lst[q] == word1word2:
        total += 1
        print("Word 1: %s, Word 2: %s, Interlock: %s" % (word1, word2, word1word2))
This gives a very slight improvement in speed.
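If the same membership test is needed in more than one place, the bisect_left check can be wrapped in a small helper. The following is only a minimal sketch in Python 3: the name in_sorted and the tiny word list are hypothetical, not part of the original program, and it assumes the list is already sorted.

```python
from bisect import bisect_left

def in_sorted(lst, word):
    """Return True if word occurs in the sorted list lst (binary search)."""
    q = bisect_left(lst, word)  # leftmost position where word could be inserted
    return q != len(lst) and lst[q] == word

# Tiny made-up word list, sorted as bisect requires.
words = sorted(["cold", "schooled", "shoe"])
print(in_sorted(words, "schooled"))  # True
print(in_sorted(words, "school"))    # False
```

Because the helper returns a plain boolean rather than an index, it can be used directly in an if statement.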
An alternate version:
    with open('words.txt') as inf:
        words = set(wd.strip() for wd in inf)
    word_gen = ((word, word[::2], word[1::2]) for word in words)
    interlocked = [word for word, a, b in word_gen if a in words and b in words]
On my machine this runs in 0.16 seconds and returns 1254 words.
Edit: as pointed out by @John Machin at "Why is this program faster in Python than Objective-C?", this can be further improved by lazy evaluation (the second slice is only performed if the first results in a valid word):
    with open('words.txt') as inf:
        words = set(wd.strip() for wd in inf)
    interlocked = [word for word in words if word[::2] in words and word[1::2] in words]
This drops execution time by a third, to 0.104 seconds.
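The short-circuit version can be tried without words.txt on a small in-memory set. This is a sketch in Python 3 with a made-up word sample; only the filtering condition comes from the answer above.

```python
# Small made-up sample standing in for the contents of words.txt.
words = {"shoe", "cold", "schooled", "school", "old"}

# `and` short-circuits: word[1::2] is only sliced and looked up
# when word[::2] is already known to be a word.
interlocked = sorted(
    word for word in words
    if word[::2] in words and word[1::2] in words
)
print(interlocked)  # ['schooled']
```

Here "schooled" splits into "shoe" (even positions) and "cold" (odd positions), both of which are in the set.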
Alternative definition for interlock:
    import itertools

    def interlock(str1, str2):
        "Takes two strings of equal length and 'interlocks' them."
        return ''.join(itertools.chain(*zip(str1, str2)))
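As a quick check of this zip-based definition, here is a small Python 3 sketch that repeats the function and calls it on made-up inputs. It also shows why the docstring asks for equal lengths: zip stops at the shorter string.

```python
import itertools

def interlock(str1, str2):
    "Takes two strings of equal length and 'interlocks' them."
    return ''.join(itertools.chain(*zip(str1, str2)))

print(interlock("shoe", "cold"))  # schooled
# zip stops at the shorter input, so the trailing "c" is dropped here:
print(interlock("abc", "de"))     # adbe
```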
Do it the other way around: iterate through all words and split each one into two words by taking the odd and even letters. Then look up those two words in the dictionary.
As a side note, the two words that interlock need not have the same length: the lengths may also differ by 1.
Some (untested) code:
    words = set(line.strip() for line in open("words"))
    for w in words:
        even, odd = w[::2], w[1::2]
        if even in words and odd in words:
            print("%s %s" % (even, odd))
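To illustrate the note about lengths differing by one: slicing an odd-length word yields halves of unequal length, and interleaving them back needs a version that does not truncate. The following is a sketch in Python 3 using itertools.zip_longest; the helper name interlock_uneven and the example word are made up for illustration.

```python
from itertools import chain, zip_longest

def interlock_uneven(first, second):
    """Interleave two strings; first may be one letter longer than second."""
    # fillvalue='' pads the shorter string so no letter is dropped
    return ''.join(chain.from_iterable(zip_longest(first, second, fillvalue='')))

word = "waist"
even, odd = word[::2], word[1::2]
print(even, odd)                            # wit as
print(interlock_uneven(even, odd) == word)  # True
```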