Python 2 assumes different source code encodings

Submitted by 拜拜、爱过 on 2019-11-29 16:51:58

The -c and -m switches ultimately(*) run the supplied code with the exec statement or the compile() function, both of which take Latin-1 source code:

The first expression should evaluate to either a Unicode string, a Latin-1 encoded string, an open file object, a code object, or a tuple.

This is not documented; it is an implementation detail that may or may not be considered a bug.
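To make the effect concrete, here is a minimal sketch of what happens when bytes typed at a UTF-8 terminal are read as Latin-1 instead. It only simulates the decoding step and is not the interpreter's actual code path; the helper name decode_as is made up for illustration.

```python
# Simulate how the same source bytes are read under two assumed encodings.
# This mimics only the decoding step, not what the interpreter really runs.

def decode_as(source_bytes, encoding):
    """Decode raw source bytes the way a parser assuming `encoding` would."""
    return source_bytes.decode(encoding)

# The bytes a UTF-8 terminal sends for the single character U+00E9.
utf8_bytes = u"\u00e9".encode("utf-8")  # two bytes: 0xC3 0xA9

as_utf8 = decode_as(utf8_bytes, "utf-8")      # one character
as_latin1 = decode_as(utf8_bytes, "latin-1")  # two characters (mojibake)

# Print lengths only, so the output does not depend on the terminal encoding.
print(len(as_utf8), len(as_latin1))
```

Each byte maps to exactly one Latin-1 character, so decoding never fails. Non-ASCII text is silently misread rather than rejected.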

I don't think it is worth fixing, however, and Latin-1 is a superset of ASCII, so little is lost. The handling of code from -c and -m has been cleaned up in Python 3 and is much more consistent there: code passed in with -c is decoded using the current locale, and modules loaded with the -m switch default to UTF-8, as usual.
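As a rough illustration of the UTF-8 default in Python 3, the sketch below passes byte source to compile() with and without a PEP 263 coding declaration. It assumes Python 3 and shows only the default source encoding, not the -c or -m code paths themselves; run_source is a hypothetical helper.

```python
# Python 3 only: byte source is decoded as UTF-8 unless a PEP 263
# coding declaration says otherwise.

def run_source(source_bytes):
    """Compile and execute byte source, returning the resulting namespace."""
    namespace = {}
    code = compile(source_bytes, "<example>", "exec")
    exec(code, namespace)
    return namespace

# No coding declaration: the bytes 0xC3 0xA9 are read as one UTF-8 character.
default = run_source(b"value = '\xc3\xa9'\n")

# Explicit Latin-1 declaration: the same two bytes become two characters.
declared = run_source(b"# -*- coding: latin-1 -*-\nvalue = '\xc3\xa9'\n")

# ascii() keeps the output independent of the terminal encoding.
print(ascii(default["value"]), ascii(declared["value"]))
```

The second call reproduces, by explicit request, the Latin-1 reading that Python 2 applies implicitly to code from -c.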


(*) If you want to know the exact implementations used, start at the Py_Main() function in Modules/main.c, which handles both -c and -m as:

if (command) {
    sts = PyRun_SimpleStringFlags(command, &cf) != 0;
    free(command);
} else if (module) {
    sts = RunModule(module, 1);
    free(module);
}