apostrophe turning into \x92
Question

mycorpus.txt

    Human where's machine interface for lab abc computer applications
    A where's survey of user opinion of computer system response time

stopwords.txt

    let's
    ain't
    there's

The following code

    corpus = set()
    for line in open("path\\to\\mycorpus.txt"):
        corpus.update(set(line.lower().split()))
    print corpus

    stoplist = set()
    for line in open("C:\\Users\\Pankaj\\Desktop\\BTP\\stopwords_new.txt"):
        stoplist.add(line.lower().strip())
    print stoplist

gives the following output

    set(['a', "where's",