I have the following code:
r = numpy.zeros(shape = (width, height, 9))
It creates a width x height x 9
matrix filled with zero
I compared the suggested alternatives for speed and found that, for large enough vectors/matrices to fill, all alternatives except val * ones
and array(n * [val])
are equally fast.
Code to reproduce the plot:
import numpy
import perfplot
val = 42.0
def fill(n):
a = numpy.empty(n)
a.fill(val)
return a
def colon(n):
a = numpy.empty(n)
a[:] = val
return a
def full(n):
return numpy.full(n, val)
def ones_times(n):
return val * numpy.ones(n)
def list(n):
return numpy.array(n * [val])
perfplot.show(
setup=lambda n: n,
kernels=[fill, colon, full, ones_times, list],
n_range=[2 ** k for k in range(20)],
logx=True,
logy=True,
xlabel="len(a)",
)
As said, numpy.empty() is the way to go. However, for objects, fill() might not do exactly what you think it does:
In[36]: a = numpy.empty(5,dtype=object)
In[37]: a.fill([])
In[38]: a
Out[38]: array([[], [], [], [], []], dtype=object)
In[39]: a[0].append(4)
In[40]: a
Out[40]: array([[4], [4], [4], [4], [4]], dtype=object)
One way around can be e.g.:
In[41]: a = numpy.empty(5,dtype=object)
In[42]: a[:]= [ [] for x in range(5)]
In[43]: a[0].append(4)
In[44]: a
Out[44]: array([[4], [], [], [], []], dtype=object)
Another option is to use numpy.full, an option available in NumPy 1.8+
a = np.full([height, width, 9], np.nan)
This is pretty flexible and you can fill it with any other number that you want.
You can always use multiplication if you don't immediately recall the .empty
or .full
methods:
>>> np.nan * np.ones(shape=(3,2))
array([[ nan, nan],
[ nan, nan],
[ nan, nan]])
Of course it works with any other numerical value as well:
>>> 42 * np.ones(shape=(3,2))
array([[ 42, 42],
[ 42, 42],
[ 42, 42]])
But the @u0b34a0f6ae's accepted answer is 3x faster (CPU cycles, not brain cycles to remember numpy syntax ;):
$ python -mtimeit "import numpy as np; X = np.empty((100,100));" "X[:] = np.nan;"
100000 loops, best of 3: 8.9 usec per loop
(predict)laneh@predict:~/src/predict/predict/webapp$ master
$ python -mtimeit "import numpy as np; X = np.ones((100,100));" "X *= np.nan;"
10000 loops, best of 3: 24.9 usec per loop
You rarely need loops for vector operations in numpy. You can create an uninitialized array and assign to all entries at once:
>>> a = numpy.empty((3,3,))
>>> a[:] = numpy.nan
>>> a
array([[ NaN, NaN, NaN],
[ NaN, NaN, NaN],
[ NaN, NaN, NaN]])
I have timed the alternatives a[:] = numpy.nan
here and a.fill(numpy.nan)
as posted by Blaenk:
$ python -mtimeit "import numpy as np; a = np.empty((100,100));" "a.fill(np.nan)"
10000 loops, best of 3: 54.3 usec per loop
$ python -mtimeit "import numpy as np; a = np.empty((100,100));" "a[:] = np.nan"
10000 loops, best of 3: 88.8 usec per loop
The timings show a preference for ndarray.fill(..)
as the faster alternative. OTOH, I like numpy's convenience implementation where you can assign values to whole slices at the time, the code's intention is very clear.
Note that ndarray.fill
performs its operation in-place, so numpy.empty((3,3,)).fill(numpy.nan)
will instead return None
.
Are you familiar with numpy.nan
?
You can create your own method such as:
def nans(shape, dtype=float):
a = numpy.empty(shape, dtype)
a.fill(numpy.nan)
return a
Then
nans([3,4])
would output
array([[ NaN, NaN, NaN, NaN],
[ NaN, NaN, NaN, NaN],
[ NaN, NaN, NaN, NaN]])
I found this code in a mailing list thread.