I want to know how I could build some kind of index on the keys of a Python dictionary. The dictionary holds approximately 400,000 items, so I am trying to avoid a linear search.
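You could join all the keys into one long string with a suitable separator character and use the find method of the string. That is pretty fast: it is still a scan over the text, but in CPython it runs in C rather than in a Python-level loop over the keys.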
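To make the idea concrete, here is a minimal sketch of a single lookup. The function name find_first_key and the sample keys are made up for illustration, and the sketch assumes that all keys are strings and that none of them contains the separator.

```python
def find_first_key(keys, needle, sep="\n"):
    """Return the first key containing needle, or None if there is none.

    Hypothetical helper for illustration; assumes string keys that do not
    contain sep, and a needle that does not contain sep either.
    """
    # Join all keys into one string, terminating each key with the separator.
    txt = sep.join(keys) + sep
    i = txt.find(needle)
    if i < 0:
        return None
    # Widen the hit to the whole key: from just after the previous
    # separator (or the start of the string) up to the next separator.
    left = txt.rfind(sep, 0, i) + 1
    right = txt.find(sep, i)
    return txt[left:right]


print(find_first_key(["apple", "pineapple", "banana"], "nan"))  # banana
```

In real use the joined string would be built once and reused for every search, since building it is itself a pass over all keys.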
Perhaps the following class is helpful to you. The search method returns a list of the dictionary values whose keys contain the substring key.
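class DictLookupBySubstr(object):
    def __init__(self, dictionary, separator='\n'):
        # Keys must be strings that do not contain the separator.
        self.dic = dictionary
        self.sep = separator
        # One long string of all keys, each terminated by the separator.
        self.txt = separator.join(dictionary.keys()) + separator

    def search(self, key):
        # The empty string is a substring of every key.
        if not key:
            return list(self.dic.values())
        res = []
        # A key containing the separator would match across key boundaries.
        if self.sep in key:
            return res
        i = self.txt.find(key)
        while i >= 0:
            # Widen the hit to the enclosing key: from just after the
            # previous separator up to the next one.
            left = self.txt.rfind(self.sep, 0, i) + 1
            right = self.txt.find(self.sep, i)
            dic_key = self.txt[left:right]
            res.append(self.dic[dic_key])
            # Resume after this key so each key is reported at most once.
            i = self.txt.find(key, right + 1)
        return res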
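
A plain linear scan is a handy reference for checking the results of the class above, or for timing it against the simple approach. This is only a sketch; the name linear_search and the sample dictionary are made up for illustration.

```python
def linear_search(dictionary, key):
    """Reference implementation: test every key for the substring."""
    return [value for k, value in dictionary.items() if key in k]


# Hypothetical sample data for illustration only.
sample = {"apple": 1, "pineapple": 2, "banana": 3}
print(linear_search(sample, "apple"))  # [1, 2]
```

For the same dictionary, DictLookupBySubstr(sample).search("apple") should return the same values. If the order of the results matters in a comparison, sorting both lists first avoids relying on dictionary ordering.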