Overlapping count of substring in a string in Python

北恋 2021-01-07 03:42

I want to find all the counts (overlapping and non-overlapping) of a sub-string in a string. I found two answers one of which is using regex which is not my intention and t

8 answers
  •  囚心锁ツ
    2021-01-07 04:22

    Another approach is to use the Counter container from collections. The accepted answer is faster for short strings, but when you search for relatively short substrings within long strings, the Counter approach starts to pull ahead. Also, if you need to run multiple substring count queries against the same main string, the Counter approach becomes much more attractive, because the counts for a given substring length can be built once and reused.
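To make the multiple-query case concrete, here is a minimal sketch that builds the counts once and then looks up several substrings. The helper name `build_window_counts` and the sample strings are illustrative assumptions, not part of the original answer. Note that one Counter only serves queries of a single length.

```python
from collections import Counter

def build_window_counts(main_string, window_len):
    """Count every window of length window_len in main_string, overlaps included.

    Hypothetical helper: it factors the Counter construction out so that it
    can be reused across queries of the same length.
    """
    return Counter(
        main_string[i:i + window_len]
        for i in range(len(main_string) - window_len + 1)
    )

main_string = "abababcabc"
counts = build_window_counts(main_string, 3)  # built once

# Each lookup is now a dictionary access; absent substrings count as 0.
for query in ("aba", "bab", "abc", "xyz"):
    print(query, counts[query])
```

Substrings of a different length need their own Counter, so this pays off mainly when many queries share one length.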

    For example, searching for a substring of length 3 gave me the following results using timeit:

    Main string length | Accepted answer | Counter approach
    6 characters       | 4.1 µs          | 7.4 µs
    50 characters      | 24.4 µs         | 25 µs
    150 characters     | 70.7 µs         | 64.9 µs
    1500 characters    | 723 µs          | 614 µs
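A comparison like this could be reproduced with a small timeit harness along the following lines. The accepted answer is not shown here, so `count_with_find` is only an assumed stand-in for it (a repeated `str.find` loop). The test string and repetition count are also arbitrary choices, and timings will differ by machine.

```python
import timeit
from collections import Counter

def count_with_counter(search_string, main_string):
    """Counter approach: tally every window, then look up the target."""
    n = len(search_string)
    windows = (main_string[i:i + n] for i in range(len(main_string) - n + 1))
    return Counter(windows)[search_string]

def count_with_find(search_string, main_string):
    """Assumed baseline: repeated str.find, advancing one character per hit."""
    count = 0
    start = main_string.find(search_string)
    while start != -1:
        count += 1
        start = main_string.find(search_string, start + 1)
    return count

def time_both(main_len, number=1000):
    """Return the average time per call in microseconds for both functions."""
    # Arbitrary repeating test string, trimmed to the requested length.
    main_string = ("abcab" * (main_len // 5 + 1))[:main_len]
    results = {}
    for func in (count_with_counter, count_with_find):
        total = timeit.timeit(lambda: func("abc", main_string), number=number)
        results[func.__name__] = total / number * 1e6
    return results

for main_len in (6, 50, 150, 1500):
    print(main_len, time_both(main_len))
```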

    from collections import Counter

    def count_w_overlap(search_string, main_string):
        """Return the number of (possibly overlapping) occurrences of search_string."""
        # Slice main_string into every window with the same length as search_string
        search_len = len(search_string)
        candidates = [main_string[i:i + search_len]
                      for i in range(len(main_string) - search_len + 1)]
        # Tally the windows; a key that never occurs yields 0
        freq_count = Counter(candidates)
        return freq_count[search_string]
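Since the question asks for both overlapping and non-overlapping counts, a quick contrast may help: the built-in `str.count` already gives the non-overlapping count. The function is repeated in compact form below only so that the snippet runs on its own, and the sample strings are illustrative.

```python
from collections import Counter

def count_w_overlap(search_string, main_string):
    """Overlapping count via a Counter over all same-length windows."""
    n = len(search_string)
    return Counter(
        main_string[i:i + n] for i in range(len(main_string) - n + 1)
    )[search_string]

# Overlapping matches start at indices 0, 1 and 2.
overlapping = count_w_overlap("aa", "aaaa")

# str.count resumes scanning after each match, so matches cannot overlap.
non_overlapping = "aaaa".count("aa")

print(overlapping, non_overlapping)  # prints "3 2"
```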
    
