I'm having problems dealing with Unicode characters from text fetched from different web pages (on different sites). I am using BeautifulSoup.
The problem is that
Many answers here (@agf and @Andbdrew, for example) have already addressed the most immediate aspects of the OP's question.
However, I think there is one subtle but important aspect that has been largely ignored, and it matters a great deal to everyone who, like me, ended up here while trying to make sense of encodings in Python: Python 2 and Python 3 handle character representation very differently. I suspect that a big chunk of the confusion out there comes from people reading about encodings in Python without being aware of which version the material targets.
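To make the difference concrete, here is a minimal Python 3 sketch (the sample string and variable names are mine, not from the question). In Python 3, `str` holds Unicode text and `bytes` holds encoded data, and the two never mix implicitly. In Python 2, by contrast, `str` is a byte string, `unicode` is the text type, and mixing them triggers an implicit ASCII conversion, which is a common source of `UnicodeEncodeError` and `UnicodeDecodeError`.

```python
# Python 3 only: text (str) and encoded data (bytes) are distinct types.
text = "caf\u00e9"               # str: 4 Unicode code points ("café")
data = text.encode("utf-8")      # bytes: the UTF-8 encoded form of the text

print(type(text).__name__, len(text))   # str 4
print(type(data).__name__, len(data))   # bytes 5 (U+00E9 takes two bytes in UTF-8)

# Decoding with the same codec recovers the original text.
assert data.decode("utf-8") == text

# Python 3 refuses to combine the two types implicitly.
mixing_error = None
try:
    text + data
except TypeError as exc:
    mixing_error = exc

print(type(mixing_error).__name__)      # TypeError
```

The practical consequence is that advice written for one version (for example, calling `.encode()` or `.decode()` at a particular spot) can be wrong or meaningless in the other, so it pays to check which version a given answer assumes.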
I suggest that anyone interested in understanding the root cause of the OP's problem begin by reading Spolsky's introduction to character representations and Unicode, and then move on to Batchelder on Unicode in Python 2 and Python 3.