Find dictionary items whose key matches a substring

伪装坚强ぢ 2020-12-07 20:28

I have a large dictionary constructed like so:

programs['New York'] = 'some values...'
programs['Port Authority of New York'] = 'some values...'
pr         


        
5 Answers
  •  囚心锁ツ
    2020-12-07 21:10

    You could generate all substrings ahead of time, and map them to their respective keys.

    # Generate every contiguous, non-empty substring of s.
    def genSubstrings(s):
        for start in range(len(s)):
            # Yield all substrings that begin at index `start`.
            for end in range(start + 1, len(s) + 1):
                yield s[start:end]

    keys = ["New York", "Port Authority of New York", "New York City"]
    # Map each substring to the list of keys that contain it.
    substrings = {}
    for key in keys:
        for substring in genSubstrings(key):
            matches = substrings.setdefault(substring, [])
            # A substring can occur several times in one key (e.g. "o");
            # record the key only once.
            if not matches or matches[-1] != key:
                matches.append(key)
    

    Then you can query substrings to get the keys that contain that substring:

    >>> substrings["New York"]
    ['New York', 'Port Authority of New York', 'New York City']
    >>> substrings["of New York"]
    ['Port Authority of New York']
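
    To get from matching keys back to the stored values, the substring index can be combined with the original dictionary. The following is only a sketch: the sample values and the helper name find_items are hypothetical, and it stores matches in a set rather than a list so that a key is recorded once even when a substring occurs in it several times.

```python
# Hypothetical sample data; the values stand in for 'some values...'.
programs = {
    "New York": "value A",
    "Port Authority of New York": "value B",
    "New York City": "value C",
}

def gen_substrings(s):
    """Yield every contiguous, non-empty substring of s."""
    for start in range(len(s)):
        for end in range(start + 1, len(s) + 1):
            yield s[start:end]

# Map each substring to the set of keys that contain it.
substrings = {}
for key in programs:
    for sub in gen_substrings(key):
        substrings.setdefault(sub, set()).add(key)

def find_items(query):
    """Return {key: value} for every key of programs that contains query."""
    return {key: programs[key] for key in sorted(substrings.get(query, ()))}

print(find_items("of New York"))  # {'Port Authority of New York': 'value B'}
print(find_items("Boston"))       # {}
```

    Using substrings.get with an empty default means a query that matches nothing returns an empty result instead of raising KeyError.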
    

    Pros:

    • Getting keys by substring is as fast as a dictionary lookup.

    Cons:

    • Generating substrings incurs a one-time cost at the beginning of your program, taking time proportional to the number of keys in programs (and growing with the length of each key).
    • substrings will grow approximately linearly with the number of keys in programs, increasing the memory usage of your script.
    • genSubstrings yields n(n+1)/2 substrings for a key of length n, i.e. O(n^2) in the size of your key. For example, the 26-character key "Port Authority of New York" generates 351 substrings.
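
    The quadratic growth can be checked with a few lines. This is a sketch with hypothetical helper names; it counts the (start, end) slices of a key, and separately the distinct substrings, which is how many dictionary entries that key can touch.

```python
def count_substrings(s):
    """Number of (start, end) slices of s, duplicates included: n(n+1)/2."""
    n = len(s)
    return n * (n + 1) // 2

def count_distinct_substrings(s):
    """Number of distinct non-empty substrings of s."""
    return len({s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)})

key = "Port Authority of New York"
print(len(key))               # 26
print(count_substrings(key))  # 351
# Repeated characters such as "o" mean the distinct count is smaller.
print(count_distinct_substrings(key) < count_substrings(key))  # True
```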
