I think what I want to do is a fairly common task, but I've found no reference for it on the web. I have text with punctuation, and I want a list of the words.
    def get_words(s):
        l = []
        w = ''
        for c in s.lower():
            if c in '-!?,. ':
                if w != '':
                    l.append(w)
                w = ''
            else:
                w = w + c
        if w != '':
            l.append(w)
        return l
Here is the usage:
    >>> s = "Hey, you - what are you doing here!?"
    >>> print(get_words(s))
    ['hey', 'you', 'what', 'are', 'you', 'doing', 'here']
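For comparison, here is a minimal sketch of the same behavior using the standard `re` module. The name `get_words_re` is hypothetical, and the sketch assumes the same delimiter set as the loop above (hyphen, `!`, `?`, comma, period, and space); any other character, such as an apostrophe or a tab, stays part of a word.

```python
import re

# Hypothetical helper: same delimiters as get_words, expressed as a
# negated character class. Each match is a maximal run of characters
# that are NOT delimiters, so empty strings never appear in the result.
_WORD = re.compile(r"[^\-!?,. ]+")

def get_words_re(s):
    """Return the lowercased words of s, split on '-!?,. ' characters."""
    return _WORD.findall(s.lower())

if __name__ == "__main__":
    print(get_words_re("Hey, you - what are you doing here!?"))
```

Matching runs of non-delimiters with `findall` avoids the empty strings that `re.split` would return when the text starts or ends with punctuation.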