How can I count the number of times a given substring is present within a string in Python?
For example:
>>> \'foo bar foo\'.numberOfOccurre
s = 'arunununghhjj'
sb = 'nun'
results = 0
sub_len = len(sb)
for i in range(len(s)):
if s[i:i+sub_len] == sb:
results += 1
print results
One way is to use re.subn. For example, to count the number of
occurrences of 'hello'
in any mix of cases you can do:
import re
_, count = re.subn(r'hello', '', astring, flags=re.I)
print('Found', count, 'occurrences of "hello"')
Below logic will work for all string & special characters
def cnt_substr(inp_str, sub_str):
inp_join_str = ''.join(inp_str.split())
sub_join_str = ''.join(sub_str.split())
return inp_join_str.count(sub_join_str)
print(cnt_substr("the sky is $blue and not greenthe sky is $blue and not green", "the sky"))
I will keep my accepted answer as the "simple and obvious way to do it" - however, that does not cover overlapping occurrences. Finding out those can be done naively, with multiple checking of the slices - as in: sum("GCAAAAAGH"[i:].startswith("AAA") for i in range(len("GCAAAAAGH")))
(which yields 3) - it can be done by trick use of regular expressions, as can be seen at Python regex find all overlapping matches? - and it can also make for fine code golfing - This is my "hand made" count for overlappingocurrences of patterns in a string which tries not to be extremely naive (at least it does not create new string objects at each interaction):
def find_matches_overlapping(text, pattern):
lpat = len(pattern) - 1
matches = []
text = array("u", text)
pattern = array("u", pattern)
indexes = {}
for i in range(len(text) - lpat):
if text[i] == pattern[0]:
indexes[i] = -1
for index, counter in list(indexes.items()):
counter += 1
if text[i] == pattern[counter]:
if counter == lpat:
matches.append(index)
del indexes[index]
else:
indexes[index] = counter
else:
del indexes[index]
return matches
def count_matches(text, pattern):
return len(find_matches_overlapping(text, pattern))
j = 0
while i < len(string):
sub_string_out = string[i:len(sub_string)+j]
if sub_string == sub_string_out:
count += 1
i += 1
j += 1
return count
To find overlapping occurences of a substring in a string in Python 3, this algorithm will do:
def count_substring(string,sub_string):
l=len(sub_string)
count=0
for i in range(len(string)-len(sub_string)+1):
if(string[i:i+len(sub_string)] == sub_string ):
count+=1
return count
I myself checked this algorithm and it worked.