I need to write a function which will detect if the input contains at least one value which is non-numeric. If a non-numeric value is found I will raise an error (because the ca
Pfft! Microseconds! Never solve a problem in microseconds that can be solved in nanoseconds.
Note that the accepted answer:
A better solution is to return True immediately when NAN is found:
import numba
import numpy as np
NAN = float("nan")
@numba.njit(nogil=True)
def _any_nans(a):
for x in a:
if np.isnan(x): return True
return False
@numba.jit
def any_nans(a):
if not a.dtype.kind=='f': return False
return _any_nans(a.flat)
array1M = np.random.rand(1000000)
assert any_nans(array1M)==False
%timeit any_nans(array1M) # 573us
array1M[0] = NAN
assert any_nans(array1M)==True
%timeit any_nans(array1M) # 774ns (!nanoseconds)
and works for n-dimensions:
array1M_nd = array1M.reshape((len(array1M)/2, 2))
assert any_nans(array1M_nd)==True
%timeit any_nans(array1M_nd) # 774ns
Compare this to the numpy native solution:
def any_nans(a):
if not a.dtype.kind=='f': return False
return np.isnan(a).any()
array1M = np.random.rand(1000000)
assert any_nans(array1M)==False
%timeit any_nans(array1M) # 456us
array1M[0] = NAN
assert any_nans(array1M)==True
%timeit any_nans(array1M) # 470us
%timeit np.isnan(array1M).any() # 532us
The early-exit method is 3 orders or magnitude speedup (in some cases). Not too shabby for a simple annotation.