What is the difference between the three "all" methods in Python/NumPy? What is the reason for the performance difference? Is it true that ndarray.all() is always the fast
The difference between `np.all(a)` and `a.all()` is simple:

- If `a` is a `numpy.array`, then `np.all()` will simply call `a.all()`.
- If `a` is not a `numpy.array`, the `np.all()` call will convert it to a `numpy.array` and then call `a.all()`. `a.all()`, on the other hand, will fail because `a` wasn't a `numpy.array` and therefore probably has no `all` method.

The difference between `np.all` and `all` is more complicated.
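Before getting to the built-in, here is a minimal sketch of the `np.all(a)` versus `a.all()` behaviour. The variable names and sample values are made up for illustration; only `np.all`, `ndarray.all` and `hasattr` are real API.

```python
import numpy as np

a = np.array([1, 2, 0, 4])   # an ndarray containing one zero
lst = [1, 2, 0, 4]           # the same values as a plain Python list

# For an ndarray, the function form and the method form give the same answer.
same_result = bool(np.all(a)) == bool(a.all())

# np.all also accepts a list, because it converts the list to an array first.
from_list = bool(np.all(lst))

# A plain list has no .all attribute, so lst.all() would raise AttributeError.
list_has_all = hasattr(lst, "all")

print(same_result, from_list, list_has_all)
```

So the function form is the safer choice when the input might not be an array yet.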
- The `all` function works on any iterable (including `list`s, `set`s, generators, ...). `np.all` works only for `numpy.array`s (including everything that can be converted to a numpy array, e.g. `list`s and `tuple`s).
- `np.all` processes an `array` with a specified data type, which makes it pretty efficient when comparing for `!= 0`. `all`, however, needs to evaluate `bool` for each item, which is much slower. `np.all` doesn't need to do that conversion.

Note that the timings also depend on the type of your `a`. If you process a Python list, `all` can be faster for relatively short lists. If you process an array, `np.all` and `a.all()` will be faster in almost all cases (except maybe for `object` arrays, but I won't go down that path, that way lies madness).
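If you want to check this on your own machine, a rough timing sketch could look like the following. The array size, the repeat count and the `bench` helper are arbitrary choices of mine, not anything from the question, and the numbers will vary with hardware and NumPy version. The input is all-True so that the built-in `all` cannot stop early.

```python
import timeit
import numpy as np

arr = np.ones(10_000, dtype=bool)  # all-True, so all() has to visit every item
lst = arr.tolist()                 # the same values as a list of Python bools

def bench(func, arg, number=100):
    """Return the average time in seconds for one call of func(arg)."""
    return timeit.timeit(lambda: func(arg), number=number) / number

# Every combination of "which all" and "which container".
timings = {
    "all(list)": bench(all, lst),
    "np.all(list)": bench(np.all, lst),
    "all(array)": bench(all, arr),
    "np.all(array)": bench(np.all, arr),
    "array.all()": bench(np.ndarray.all, arr),
}

for name, seconds in timings.items():
    print(f"{name:15s} {seconds * 1e6:10.2f} us per call")
```

Repeating the run with a much shorter `arr` is the easiest way to see where the fixed overhead of the NumPy calls starts to matter.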
I'll take a swing at this:

- `np.all` is a generic function which will work with different data types. Under the hood this probably looks for `ndarray.all`, which is why it's slightly slower.
- `all` is a Python built-in function; see https://docs.python.org/2/library/functions.html#all.
- `ndarray.all` is a method of the `numpy.ndarray` object; calling this directly may be faster.
I suspect that NumPy's functions do more to evaluate an array element as a boolean, likely in some generic numeric-first way, while the builtin `all()` does nothing, since the elements are already booleans.
I wonder how different the results would be with integers or floats.
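One way to find out would be to repeat the measurement per dtype. This is only a sketch: the size `n`, the repeat count and the `per_call` helper are assumptions for illustration, and no particular outcome is implied.

```python
import timeit
import numpy as np

n = 10_000  # arbitrary array length, chosen only for illustration

def per_call(func, number=50):
    """Return the average time in seconds for one call of func()."""
    return timeit.timeit(func, number=number) / number

results = {}
for dtype in (bool, np.int64, np.float64):
    arr = np.ones(n, dtype=dtype)  # all elements are truthy in every dtype
    # Both approaches must agree on the answer before comparing speed.
    assert all(arr) == bool(arr.all())
    results[np.dtype(dtype).name] = (
        per_call(lambda: all(arr)),   # built-in all over the array
        per_call(lambda: arr.all()),  # ndarray method
    )

for name, (t_builtin, t_method) in results.items():
    print(f"{name:8s} all(): {t_builtin * 1e6:9.2f} us   .all(): {t_method * 1e6:9.2f} us")
```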