To understand the differences between numpy and array, I ran a few more quantitative tests.
What I found is that, on my system (Ubuntu 18.04, Python 3), array is about twice as fast as numpy at building a large array from range, but roughly 2.6 times slower than list. (numpy's dedicated np.arange() is faster still, by about two orders of magnitude. That looked suspiciously fast to me at first, but it is plausible given that np.arange() never has to iterate over Python int objects.)
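For anyone who wants to repeat the timing comparison outside IPython, here is a minimal sketch using the standard-library timeit module. The function name time_constructors and its default arguments are made up for this illustration, and the numpy rows are only included if numpy can be imported. It uses the 'q' typecode rather than 'l' so that the item size does not depend on the platform's C long.

```python
import array
import timeit

try:
    import numpy as np  # optional: only needed for the NumPy rows
except ImportError:
    np = None


def time_constructors(num=100_000, repeat=5, number=10):
    """Return the best per-call time (in seconds) for each constructor.

    Hypothetical helper for illustration; not part of the original tests.
    """
    candidates = {
        "list": lambda: list(range(num)),
        # 'q' is a signed long long (8 bytes); 'l' is a C long (4 or 8 bytes)
        "array": lambda: array.array("q", range(num)),
    }
    if np is not None:
        candidates["np.array"] = lambda: np.array(range(num), dtype=np.int64)
        candidates["np.arange"] = lambda: np.arange(num, dtype=np.int64)

    results = {}
    for name, func in candidates.items():
        # Keep the fastest of several repeats to reduce noise from the system
        best = min(timeit.repeat(func, repeat=repeat, number=number))
        results[name] = best / number
    return results


if __name__ == "__main__":
    for name, seconds in time_constructors().items():
        print(f"{name:>10}: {seconds * 1e3:.3f} ms per call")
```

The absolute numbers will differ from machine to machine, so only the ratios between the rows are worth comparing.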
However, quite surprisingly, array objects seem to be larger than their numpy counterparts (as measured by sys.getsizeof()).
The list objects, in turn, are roughly 8-13% larger than array objects (this will vary with the size of the individual items, obviously).
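Two details may help when reading these size figures. First, sys.getsizeof() only accounts for the container itself, so for a list it counts the pointers but not the int objects they refer to. Second, as far as I can tell on CPython, an array built by consuming an iterable such as range grows by appending and can end up with spare capacity, while one built from a list is sized exactly. The sketch below (the helper name container_sizes is hypothetical) makes both effects visible.

```python
import array
import sys


def container_sizes(num=1000):
    """Compare what sys.getsizeof reports for equivalent containers.

    Hypothetical helper for illustration; assumes CPython.
    """
    values = list(range(num))
    from_list = array.array("q", values)      # built from a sequence of known length
    from_iter = array.array("q", range(num))  # built by consuming an iterable
    return {
        "list (pointers only)": sys.getsizeof(values),
        # Adding the int objects gives a rough upper bound for the list's
        # footprint (CPython shares small ints, so this overestimates)
        "list + int objects": sys.getsizeof(values)
        + sum(sys.getsizeof(v) for v in values),
        "array from list": sys.getsizeof(from_list),
        "array from range": sys.getsizeof(from_iter),
        "raw payload": num * from_list.itemsize,
    }


if __name__ == "__main__":
    for label, size in container_sizes().items():
        print(f"{label:>22}: {size} bytes")
```

If the over-allocation explanation holds, the "array from range" row should be at least as large as the "array from list" row, which would account for array looking larger than numpy in the plot.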
Compared to list, array offers a way to control the size of the number objects.
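To illustrate what controlling the size means in practice, the sketch below lists the bytes per item for a few signed integer typecodes and checks whether a value fits a given typecode. The helper names bytes_per_item and fits are made up for this example; the exact sizes of 'h', 'i' and 'l' depend on the platform's C types.

```python
import array


def bytes_per_item(typecodes="bhilq"):
    """Map each array typecode to the number of bytes it uses per item."""
    return {code: array.array(code).itemsize for code in typecodes}


def fits(typecode, value):
    """Return True if `value` can be stored in an array of this typecode."""
    try:
        array.array(typecode, [value])
    except OverflowError:
        # array refuses values outside the range of the underlying C type
        return False
    return True


if __name__ == "__main__":
    print(bytes_per_item())
    print(fits("b", 127), fits("b", 128))
```

The trade-off is that a smaller typecode saves memory but narrows the range of values the array will accept.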
So, perhaps, the only sensible use case for array is when numpy is not available.
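One way to act on that conclusion is an optional import with a standard-library fallback. This is only a sketch under the assumption that the calling code needs nothing beyond indexing, len() and iteration; the function name int_sequence is hypothetical.

```python
import array

try:
    import numpy as np
except ImportError:  # NumPy not installed: fall back to the standard library
    np = None


def int_sequence(num):
    """Return a compact sequence holding 0..num-1 as 64-bit integers."""
    if np is not None:
        return np.arange(num, dtype=np.int64)
    # 'q' is a signed long long, matching the int64 used on the NumPy path
    return array.array("q", range(num))


if __name__ == "__main__":
    seq = int_sequence(10)
    print(type(seq).__name__, len(seq), int(seq[3]))
```

Anything beyond this common subset (vectorized arithmetic, slicing semantics, reshaping) behaves differently between the two types, so the fallback is only a drop-in replacement for simple storage.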
For completeness, here is the code that I used for the tests:
import numpy as np
import array
import sys
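num = int(1e6)  # number of items used in the timing tests
num_i = 100  # number of container sizes to sample
# container sizes, logarithmically spaced from 10 to num
x = np.logspace(1, int(np.log10(num)), num_i).astype(int)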
%timeit list(range(num))
# 10 loops, best of 3: 32.8 ms per loop
%timeit array.array('l', range(num))
# 10 loops, best of 3: 86.3 ms per loop
%timeit np.array(range(num), dtype=np.int64)
# 10 loops, best of 3: 180 ms per loop
%timeit np.arange(num, dtype=np.int64)
# 1000 loops, best of 3: 809 µs per loop
# size in bytes of each container itself, as reported by sys.getsizeof
y_list = np.array([sys.getsizeof(list(range(x_i))) for x_i in x])
y_array = np.array([sys.getsizeof(array.array('l', range(x_i))) for x_i in x])
y_np = np.array([sys.getsizeof(np.array(range(x_i), dtype=np.int64)) for x_i in x])
import matplotlib.pyplot as plt
plt.figure(figsize=(12, 6))
plt.plot(x, y_list, label='list')
plt.plot(x, y_array, label='array')
plt.plot(x, y_np, label='numpy')
plt.legend()
plt.show()