How to split a line at a non-printing ASCII character in Python
How can I split a line in Python at a non-printing ASCII character (such as the long minus sign, hex 0x97, octal 227)? I don't need the character itself. The text after it will be saved in a variable.

miku

You can use re.split:

>>> import re
>>> re.split(r'\W+', 'Words, words, words.')
['Words', 'words', 'words', '']

Adjust the pattern so that it matches only the characters you want to split on (for example, a negated class listing the characters you want to keep).

See also: stripping-non-printable-characters-from-a-string-in-python

Example (with the long minus):

>>> # \xe2\x80\x93 is the UTF-8 encoding of a long dash (en dash, U+2013)
>>> s = 'hello – world'
>>> s
'hello \xe2\x80\x93
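For the single byte 0x97 mentioned in the question, a minimal Python 3 sketch might look like the following. It assumes the line is read as bytes and that the file is encoded as Windows-1252, where 0x97 is the em dash (the value lies outside 7-bit ASCII). The sample line and the variable names are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical sample line; assumes Windows-1252, where byte 0x97 is the em dash.
raw = b"label \x97 value to keep"

# Option 1: split the raw bytes at the delimiter byte, then decode what follows it.
before, sep, after = raw.partition(b"\x97")
saved = after.decode("cp1252").strip()

# Option 2: decode first, then split once on the decoded character (U+2014).
text = raw.decode("cp1252")
saved_text = text.split("\u2014", 1)[1].strip()

# Option 3: split on any run of characters outside printable ASCII (0x20 to 0x7E).
parts = [p.strip() for p in re.split(r"[^\x20-\x7E]+", text)]
```

Option 1 avoids guessing the encoding of the delimiter itself, since it works on the byte value directly. Option 3 is the more general choice when the exact non-printing character is not known in advance.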