I have a huge corpus of text (line by line) and I want to remove special characters but preserve the spaces and structure of each string.
hello? there A-Z-R_T(,**)
I think nfn neil's answer is great, but I would just add a simple regex to remove all non-word characters. Note, however, that it treats the underscore as part of a word:
print(re.sub(r'\W+', ' ', string))  # requires `import re`
# Output: hello there A Z R_T world welcome to python
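Because `\W` counts the underscore as a word character, `R_T` survives the substitution. If underscores should go too, one option is the character class `[\W_]`. The following is only a minimal sketch: the helper name `strip_special` and its `keep_underscore` flag are made up for illustration, and the trailing `.strip()` is an assumption that you do not want a leftover space where punctuation ended the line.

```python
import re

def strip_special(text, keep_underscore=True):
    """Replace each run of special characters with a single space.

    Hypothetical helper for illustration, not part of any library.
    """
    # \W matches anything that is not a letter, digit, or underscore;
    # [\W_] additionally matches the underscore itself.
    pattern = r'\W+' if keep_underscore else r'[\W_]+'
    # .strip() drops the space left behind by leading/trailing punctuation
    # (an assumption about the desired output).
    return re.sub(pattern, ' ', text).strip()

sample = "hello? there A-Z-R_T(,**)"
print(strip_special(sample))                         # hello there A Z R_T
print(strip_special(sample, keep_underscore=False))  # hello there A Z R T
```

In Python 3, `\W` is Unicode-aware for `str` patterns, so accented letters are kept as word characters; passing `flags=re.ASCII` to `re.sub` would restrict the match to ASCII letters and digits if that is what you need.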