> I want to save a dict or arrays. I try both with np.save and with pickle and see that the former always takes much less time.
I think you need better timings. I also disagree with the accepted answer.
b is a dictionary with 9 keys; the values are lists of arrays. That means pickle.dump and np.save will end up using each other: pickle uses save to pickle the arrays, and save uses pickle to save the dictionary and the lists.

save writes arrays, so it has to wrap your dictionary in an object dtype array in order to save it at all.
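For reference, here is a rough stand-in for b (the real contents come from the question's code, which isn't shown here, so the shapes and values below are made up):

import numpy as np

# A b-like dictionary: 9 keys, each mapping to a list of small integer arrays.
b = {k: [np.arange(4) + k, np.zeros(4, dtype=int)] for k in range(9)}

With a dictionary like that, saving and loading through np.save looks like this: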
In [6]: np.save('test1',b)
In [7]: d=np.load('test1.npy')
In [8]: d
Out[8]:
array({0: [array([0, 0, 0, 0])], 1: [array([1, 0, 0, 0]), array([0, 1, 0, 0]), .... array([ 1, -1, 0, 0]), array([ 1, 0, -1, 0]), array([ 1, 0, 0, -1])]},
dtype=object)
In [9]: d.shape
Out[9]: ()
In [11]: list(d[()].keys())
Out[11]: [0, 1, 2, 3, 4, 5, 6, 7, 8]
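One caveat: on recent NumPy versions (1.16.3 and later, if I recall correctly) np.load refuses to unpickle object arrays by default, so loading this file needs an explicit opt-in:

d = np.load('test1.npy', allow_pickle=True)  # required for object dtype arrays
b2 = d[()]   # unwrap the 0-d object array to get the dictionary back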
Some timings:
In [12]: timeit np.save('test1',b)
850 µs ± 36.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [13]: timeit d=np.load('test1.npy')
566 µs ± 6.44 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [20]: %%timeit
    ...: with open('testpickle', 'wb') as myfile:
    ...:     pickle.dump(b, myfile)
    ...:
505 µs ± 9.24 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [21]: %%timeit
    ...: with open('testpickle', 'rb') as myfile:
    ...:     g1 = pickle.load(myfile)
    ...:
152 µs ± 4.83 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
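If you want to reproduce these measurements outside IPython, something along these lines should work (a sketch; it assumes b is already defined and uses the same file names as the session above):

import pickle
import timeit
import numpy as np

def save_np():
    np.save('test1', b)

def save_pickle():
    with open('testpickle', 'wb') as myfile:
        pickle.dump(b, myfile)

# Mirror %timeit: average the per-call cost over many runs.
print(timeit.timeit(save_np, number=1000) / 1000)
print(timeit.timeit(save_pickle, number=1000) / 1000)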
In my timings pickle is faster.
The pickle file is slightly smaller:
In [23]: ll test1.npy testpickle
-rw-rw-r-- 1 paul 5740 Aug 14 08:40 test1.npy
-rw-rw-r-- 1 paul 4204 Aug 14 08:43 testpickle
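The same check, scripted (assuming both files from above exist):

import os

for name in ('test1.npy', 'testpickle'):
    print(name, os.path.getsize(name), 'bytes')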