How to find out if Python is compiled with UCS-2 or UCS-4?

Submitted by 家住魔仙堡 on 2019-11-26 12:57:15

When built with --enable-unicode=ucs4:

>>> import sys
>>> print sys.maxunicode
1114111

When built with --enable-unicode=ucs2:

>>> import sys
>>> print sys.maxunicode
65535

sys.maxunicode is 0xFFFF (65535) on a UCS-2 build and 0x10FFFF (1114111) on a UCS-4 build. This is the value returned by PyUnicode_GetMax in the CPython source:

Py_UNICODE
PyUnicode_GetMax(void)
{
#ifdef Py_UNICODE_WIDE
    return 0x10FFFF;
#else
    /* This is actually an illegal character, so it should
       not be passed to unichr. */
    return 0xFFFF;
#endif
}

The maximum code point in UCS-4 mode (0x10FFFF) is defined by the maximum value representable in UTF-16.
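The UTF-16 limit mentioned above can be checked with a little arithmetic. The sketch below decodes the largest possible surrogate pair; the constant and function names are illustrative and do not come from CPython.

```python
# Surrogate ranges defined by UTF-16.
HIGH_MIN, HIGH_MAX = 0xD800, 0xDBFF
LOW_MIN, LOW_MAX = 0xDC00, 0xDFFF


def decode_surrogate_pair(high, low):
    """Combine a high and a low surrogate into a single code point."""
    return 0x10000 + ((high - HIGH_MIN) << 10) + (low - LOW_MIN)


# The largest high and low surrogates give the largest code point.
max_code_point = decode_surrogate_pair(HIGH_MAX, LOW_MAX)
print(hex(max_code_point))  # 0x10ffff
```

The result equals 1114111, which is the value sys.maxunicode reports on a wide build.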

I had this same issue once. I documented it for myself on my wiki at

http://arcoleo.org/dsawiki/Wiki.jsp?page=Python%20UTF%20-%20UCS2%20or%20UCS4

I wrote -

import sys
# 65535 (0xFFFF) is the largest value on a narrow (UCS-2) build
'UCS4' if sys.maxunicode > 65535 else 'UCS2'

sysconfig reports the Unicode size from Python's build configuration variables: 2 for a UCS-2 build, 4 for a UCS-4 build.

The build flags can be queried like this.

Python 2.7:

import sysconfig
sysconfig.get_config_var('Py_UNICODE_SIZE')

Python 2.6:

import distutils.sysconfig  # a bare "import distutils" does not load the submodule
distutils.sysconfig.get_config_var('Py_UNICODE_SIZE')
Boris Feld

I had the same issue and found a semi-official piece of code that does exactly that and may be interesting for people with the same issue: https://bitbucket.org/pypa/wheel/src/cf4e2d98ecb1f168c50a6de496959b4a10c6b122/wheel/pep425tags.py?at=default&fileviewer=file-view-default#pep425tags.py-83:89.

It comes from the wheel project, which needs to know whether Python was compiled with UCS-2 or UCS-4 because that changes the name of the binary file it generates.
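For illustration, here is a minimal sketch of the same idea: ask the build configuration first and fall back to sys.maxunicode when the variable is not available. This is not the wheel project's actual code, and the function name unicode_size_in_bytes is made up for this example.

```python
import sys
import sysconfig  # standard library from Python 2.7 / 3.2 onward


def unicode_size_in_bytes():
    """Return 2 for a narrow (UCS-2) build and 4 for a wide (UCS-4) build."""
    size = sysconfig.get_config_var('Py_UNICODE_SIZE')
    if size is None:
        # The variable may be missing (for example on some platforms or
        # newer interpreters), so fall back to sys.maxunicode.
        size = 4 if sys.maxunicode == 0x10FFFF else 2
    return int(size)


print(unicode_size_in_bytes())
```

The fallback keeps the function usable where get_config_var returns None instead of a number.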

Another way is to create a Unicode array and look at the itemsize:

import array
bytes_per_char = array.array('u').itemsize

Quote from the array docs:

The 'u' typecode corresponds to Python’s unicode character. On narrow Unicode builds this is 2-bytes, on wide builds this is 4-bytes.

Note that the distinction between narrow and wide Unicode builds was dropped in Python 3.3 (see PEP 393); from that version on, sys.maxunicode is always 1114111. The 'u' typecode for array has been deprecated since 3.3 and, at the time of writing, was scheduled for removal in Python 4.0.
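Because of that change, a check meant to run on both old and new interpreters probably wants a version guard. A small sketch, where the function name and the 'PEP393' label are arbitrary choices for this example:

```python
import sys


def unicode_build():
    """Return 'UCS2', 'UCS4', or 'PEP393' for the running interpreter."""
    if sys.version_info >= (3, 3):
        # PEP 393: the narrow/wide distinction no longer exists.
        return 'PEP393'
    # Older interpreters: 0xFFFF means narrow, 0x10FFFF means wide.
    return 'UCS4' if sys.maxunicode > 0xFFFF else 'UCS2'


print(unicode_build())
```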
