A very common source of encoding errors is that python 2 will silently coerce strings to unicode
when you add them together with unicode
. This can
I did a little more research after asking this question and hit on the perfect answer. Armin Ronacher created a wonderful little tool called unicode-nazi. Just install it and run your program like this:
python -Werror -municodenazi myprog.py
and you get a traceback right where the coercion happened:
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "SITE-PACKAGES/unicodenazi.py", line 128, in
main()
File "SITE-PACKAGES/unicodenazi.py", line 119, in main
execfile(sys.argv[0], main_mod.__dict__)
File "myprog.py", line 4, in
print foo()
File "myprog.py", line 2, in foo
return 'bar' + u'baz'
File "SITE-PACKAGES/unicodenazi.py", line 34, in warning_decode
stacklevel=2)
UnicodeWarning: Implicit conversion of str to unicode
If you're dealing with python libraries that trigger implicit coercions themselves and you can't catch the exceptions or otherwise work around them, you can leave out the -Werror
:
python -municodenazi myprog.py
and at least see a warning printed out on stderr when it happens:
/SITE-PACKAGES/unicodenazi.py:119: UnicodeWarning: Implicit conversion of str to unicode
execfile(sys.argv[0], main_mod.__dict__)
barbaz