I'd probably just use itertools.groupby:
>>> import itertools as it
>>> s = 'AAATGG'
>>> for k, g in it.groupby(s):
...     print(k, list(g))
...
('A', ['A', 'A', 'A'])
('T', ['T'])
('G', ['G', 'G'])
>>>
>>> # Multiple non-consecutive occurrences of a given value.
>>> s = 'AAATTGGAAA'
>>> for k, g in it.groupby(s):
...     print(k, list(g))
...
('A', ['A', 'A', 'A'])
('T', ['T', 'T'])
('G', ['G', 'G'])
('A', ['A', 'A', 'A'])
As you can see, g becomes an iterable that yields all consecutive occurrences of the given character (k). I used list(g) to consume the iterable, but you could do anything you like with it (including ''.join(g) to get a string, or sum(1 for _ in g) to get the count).
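
For example, the two alternatives mentioned above could be wrapped up roughly like this. It's only a sketch: the helper names run_lengths and run_strings are made up for illustration, not part of itertools.

```python
import itertools as it


def run_lengths(s):
    """Return a (character, count) pair for each run of equal characters."""
    # sum(1 for _ in g) counts the items in the group without building a list.
    return [(k, sum(1 for _ in g)) for k, g in it.groupby(s)]


def run_strings(s):
    """Return each run of equal characters as its own string."""
    # ''.join(g) glues the characters of one group back together.
    return [''.join(g) for _, g in it.groupby(s)]


print(run_lengths('AAATTGGAAA'))  # [('A', 3), ('T', 2), ('G', 2), ('A', 3)]
print(run_strings('AAATTGGAAA'))  # ['AAA', 'TT', 'GG', 'AAA']
```

One thing to keep in mind: each g is a one-shot iterator that shares its source with groupby, so consume it (or copy it with list(g)) before moving on to the next group.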