Question
Suppose I have a plain list in Python:
>>> import pickle
>>> x = [1.0, 2.0, 3.0, 4.0]
When pickled, it is a reasonably small size:
>>> len(pickle.dumps(x))
44
How come the size is so much larger if I use a numpy array?
>>> import numpy as np
>>> xn = np.array(x)
>>> len(pickle.dumps(xn))
187
Converting it to a less precise data type only helps a little bit...
>>> x16 = xn.astype('float16')
>>> len(pickle.dumps(x16))
163
Other numpy/scipy data structures like sparse matrices also don't pickle well. Why?
Answer 1:
Checking it in a debugger, a numpy array carries metadata that a plain Python list does not: its dtype, shape, strides/memory layout, flags, and so on.
A complete list of these ndarray attributes can be found at http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html
When you pickle the array, this metadata is serialized along with the raw data buffer, and the pickle also records the import path of the helper function that rebuilds the array on load. That fixed overhead is why a four-element array pickles to 187 bytes, and why switching to float16 only shrinks the 32-byte data portion to 8 bytes (187 - 24 = 163) while the metadata stays the same.
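You can see this metadata directly by disassembling the pickle stream with the standard-library pickletools module. A quick inspection sketch (the exact opcodes and module paths depend on your Python, NumPy, and pickle protocol versions):
>>> import pickle, pickletools
>>> import numpy as np
>>> xn = np.array([1.0, 2.0, 3.0, 4.0])
>>> pickletools.dis(pickle.dumps(xn))  # prints one line per pickle opcode
Among the opcodes you should find global references along the lines of numpy.core.multiarray._reconstruct and numpy.dtype, the type string 'f8', and the shape tuple (4,), followed by the 32 bytes of actual data; those references and metadata account for most of the 187 bytes.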
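Note also that the overhead is a fixed per-array cost, not a per-element one, so it amortizes away as arrays grow. A minimal sketch comparing the two (exact byte counts vary by version and protocol):
>>> import pickle
>>> import numpy as np
>>> for n in (4, 100, 10000):
...     lst = [float(i) for i in range(n)]
...     print(n, len(pickle.dumps(lst)), len(pickle.dumps(np.array(lst))))
A list pickles every float separately, roughly 9 bytes per value (a one-byte opcode plus the 8-byte IEEE double), while the array pickles its whole buffer as one block of 8 bytes per element plus the constant metadata, so somewhere past a hundred-odd elements the numpy pickle becomes the smaller of the two.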
Source: https://stackoverflow.com/questions/31304006/why-is-there-a-large-overhead-in-pickling-numpy-arrays