I am looking for a pythonic way to split a sentence into words and also store the index information of every word in the sentence, e.g.
a = "This is a sentenc
I think it's more natural to return the start and end of the corresponding slices, e.g. (0, 4) instead of (0, 3), so that each pair can be used directly as a[start:end].
>>> from itertools import groupby
>>> def splitWithIndices(s, c=' '):
...     p = 0
...     for k, g in groupby(s, lambda x: x == c):
...         q = p + sum(1 for i in g)
...         if not k:
...             yield p, q  # or p, q-1 if you are really sure you want that
...         p = q
...
>>> a = "This is a sentence"
>>> list(splitWithIndices(a))
[(0, 4), (5, 7), (8, 9), (10, 18)]
>>> a[0:4]
'This'
>>> a[5:7]
'is'
>>> a[8:9]
'a'
>>> a[10:18]
'sentence'
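
If you also want the words themselves next to their indices, a regular expression is another option. This is only a sketch, not a drop-in replacement for the groupby version: the name split_with_spans is made up for illustration, and it splits on any run of whitespace rather than on a single separator character c.

```python
import re

def split_with_spans(s):
    """Yield (word, start, end) for each run of non-whitespace in s.

    Hypothetical helper: end is exclusive, so s[start:end] == word.
    """
    for m in re.finditer(r"\S+", s):
        yield m.group(), m.start(), m.end()

a = "This is a sentence"
for word, start, end in split_with_spans(a):
    # Each span is half-open, so slicing with it gives the word back.
    assert a[start:end] == word
    print(word, start, end)
```

For the sentence above this gives the same (start, end) pairs as splitWithIndices, with the matching word in front of each pair.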