Question
sys.getsizeof is returning different sizes for a unicode string on different versions of Python.
sys.getsizeof(u'Hello World') returns 96 on Python 2.7.3 and 72 on Python 2.7.11.
Answer 1:
sys.getsizeof reports implementation details by definition, and none of those details are guaranteed to remain stable between versions or even builds.
It's unlikely that anything significant changed between 2.7.3 and 2.7.11, though; YOU's comment on character width likely explains the discrepancy. Including the internally stored NUL terminator, there are 12 characters in Hello World, and a UCS4 encoding needs 24 more bytes to store them than a UCS2 encoding (12 characters times 2 extra bytes each), which matches 96 - 72. In exchange, UCS4 can handle non-BMP characters.
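The arithmetic above can be checked with a small sketch. The helper names here are hypothetical, not part of any library. sys.maxunicode is a real attribute that distinguishes narrow (UCS2) from wide (UCS4) builds on Python 2 and Python 3.2; on Python 3.3+ it is always 0x10FFFF, so the first helper only tells you something useful on the older versions.

```python
import sys

def fixed_unicode_width():
    """Bytes per code unit on Python 2 / 3.2 style fixed-width builds.

    On Python 3.3+ sys.maxunicode is always 0x10FFFF, so this returns 4
    even though strings are actually stored with a flexible width.
    """
    return 2 if sys.maxunicode == 0xFFFF else 4

def ucs4_minus_ucs2(text):
    """Extra bytes a UCS4 build needs over a UCS2 build for `text`,
    counting the internally stored NUL terminator."""
    return (len(text) + 1) * (4 - 2)

print(fixed_unicode_width())
print(ucs4_minus_ucs2(u'Hello World'))  # 24, the same as 96 - 72
```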
Other things that could change the size (in other circumstances):
32 vs. 64 bit builds: all pointers and ssize_ts double in size on 64 bit builds, as do longs on non-Windows machines.
Python 2 vs. Python 3: Python 3 removed a single pointer width field from the common object header.
For str, Python 3.2 vs. Python 3.3+: Python 3.2 uses a fixed width UCS2 or UCS4 str chosen by a build option, same as the Py2 unicode type. Python 3.3+ uses one of three different fixed widths depending on the largest ordinal in the str, so an ASCII/latin-1 str uses one byte per character, a BMP str uses two, and a non-BMP str uses four. It can also cache alternate representations, so the same str can grow or shrink in "real" size based on usage.
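The per-character widths described for Python 3.3+ can be observed directly. This is a minimal sketch that assumes CPython 3.3 or later; bytes_per_char is a hypothetical helper name, and other implementations may report different numbers or not support sys.getsizeof at all. It compares two strings of the same character that differ in length by one, so the fixed header cancels out.

```python
import struct
import sys

def bytes_per_char(ch, n=100):
    """Estimate storage per character by growing a string by one character.

    Both strings consist only of `ch`, so they share the same internal
    width and header; the difference is the cost of one more character.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

# Pointer size in bits: 32 or 64 depending on the build.
print("pointer bits:", struct.calcsize("P") * 8)

samples = [
    ("ASCII", "a"),
    ("Latin-1", "\xe9"),
    ("BMP", "\u20ac"),
    ("non-BMP", "\U0001F600"),
]
for label, ch in samples:
    # On CPython 3.3+ this is expected to print 1, 1, 2 and 4 respectively.
    print(label, bytes_per_char(ch))
```

Measuring a difference rather than an absolute size keeps the result independent of the header size, which itself varies between builds and versions.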
Answer 2:
sys.getsizeof can differ between machines. However, I think this can solve your issue: take the size of the string and subtract the size of an empty string.
import sys

def get_size_of_string(s):
    # Subtract the overhead of an empty string, leaving the bytes used by the characters.
    return sys.getsizeof(s) - sys.getsizeof("")

a = get_size_of_string("abc")
print(a)
Source: https://stackoverflow.com/questions/36152435/python-sys-getsizeof-method-returning-different-sizes-on-different-versions-of-p