Finding multiple substrings in a string without iterating over it multiple times

忘了有多久 2021-01-04 21:18

I need to find if items from a list appear in a string, and then add the items to a different list. This code works:

data = []
line = 'akhgvfalfhda.dhgfa.lidh


        
3 Answers
  •  迷失自我
    2021-01-04 21:49

    One way I can think of to improve this:

    • Collect the unique lengths of the words in _legal.
    • Build a set of all substrings of line that have one of those lengths, using a sliding window. This takes roughly O(len(line) * num_of_unique_lengths) slicing steps (each slice also costs time proportional to its length), which should beat brute force when _legal has many words but few distinct lengths.
    • Look up each word of _legal in that set, which is O(1) on average per word.
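
    To make the sliding-window step concrete, here is a minimal sketch. The helper name `windows` is hypothetical and only for illustration; it lists every substring of one fixed length.

```python
def windows(text, size):
    """Return every contiguous substring of `text` with exactly `size` characters."""
    # The last valid start index is len(text) - size, hence the +1 in range().
    return [text[i:i + size] for i in range(len(text) - size + 1)]


print(windows("abcde", 2))  # ['ab', 'bc', 'cd', 'de']
print(windows("abcde", 4))  # ['abcd', 'bcde']
```

    Putting these substrings into a set, once per unique length, is what makes the later lookups cheap.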

    Code:

    line = 'thing1 thing2 456 xxualt542l lthin. dfjladjfj lauthina '
    _legal = ['thing1', 'thing2', 'thing3', 'thing4', 't5', '5', 'fj la']
    # Unique lengths of the candidate words
    ul = {len(i) for i in _legal}
    s = set()
    for l in ul:
        # Every substring of length l; the +1 includes the final window
        s = s.union({line[i:i+l] for i in range(len(line) - l + 1)})
    print(s.intersection(_legal))
    

    Output (the order of a printed set may vary):

    {'thing1', 'fj la', 'thing2', 't5', '5'}
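
    Since the question asks for the matches to end up in a list, the same idea could be wrapped in a function along these lines. This is only a sketch: the name `find_present` is hypothetical, and returning the matches in the order they appear in _legal is an assumption about what is wanted.

```python
def find_present(line, candidates):
    """Return the candidates that occur in `line`, in their original order."""
    # Unique non-zero lengths of the candidate words.
    lengths = {len(c) for c in candidates if c}
    seen = set()
    for size in lengths:
        # Every substring of this length; the +1 includes the final window.
        seen.update(line[i:i + size] for i in range(len(line) - size + 1))
    # One average O(1) set lookup per candidate.
    return [c for c in candidates if c in seen]


line = 'thing1 thing2 456 xxualt542l lthin. dfjladjfj lauthina '
_legal = ['thing1', 'thing2', 'thing3', 'thing4', 't5', '5', 'fj la']
data = find_present(line, _legal)
print(data)  # ['thing1', 'thing2', 't5', '5', 'fj la']
```

    Filtering the original list rather than intersecting two sets keeps the result order deterministic.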
    
