I think what I want to do is a fairly common task, but I've found no reference on the web. I have text with punctuation, and I want a list of the words.
"H
I think the following is the best answer to suit your needs:
\W+ may be suitable for this case, but it may not be suitable for other cases.
\W+
import re
list(filter(None, re.compile(r'[ ,\-!?]').split("Hey, you - what are you doing here!?")))
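To make the difference between the two approaches concrete, here is a minimal sketch that runs both on the same input. The helper names split_nonword and split_explicit, and the default delimiter set, are my own choices for illustration and not part of any library. The contraction "don't" is just one example of a case where \W+ may not do what you want.

```python
import re

def split_nonword(s):
    # Split on runs of non-word characters and drop the empty
    # strings that appear when the text starts or ends with one.
    return [w for w in re.split(r"\W+", s) if w]

def split_explicit(s, delimiters=" ,-!?"):
    # Build a character class from an explicit delimiter set
    # (the default set here is an assumption for this example).
    pattern = "[" + re.escape(delimiters) + "]+"
    return [w for w in re.split(pattern, s) if w]

text = "Hey, you - what are you doing here!?"
print(split_nonword(text))
print(split_explicit(text))

# An apostrophe is a non-word character, so \W+ breaks contractions,
# while the explicit set leaves them alone.
print(split_nonword("don't stop"))
print(split_explicit("don't stop"))
```

Both functions give the same seven words for the sample sentence. They differ on "don't stop": the \W+ version splits at the apostrophe, while the explicit version keeps "don't" whole because the apostrophe is not in its delimiter set.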