I am using numpy.log10 to calculate the log of an array of probability values. There are some zeros in the array, and I am trying to get around it using
resu
I solved this by finding the lowest non-zero number in the array and replacing all zeroes with that number :p
The resulting code looks like:
import numpy as np

def replaceZeroes(data):
    # Replace every zero with the smallest non-zero value (modifies data in place)
    min_nonzero = np.min(data[np.nonzero(data)])
    data[data == 0] = min_nonzero
    return data
...
prob = replaceZeroes(prob)
result = np.where(prob > 0.0000000001, np.log10(prob), -10)
Note that this changes the array in place: every zero becomes the smallest non-zero value, while the other numbers are left untouched.
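If the input should stay unmodified, or the replacement should sit strictly below the smallest non-zero value, a variant along these lines might work. This is only a sketch: the name replace_zeroes_copy and the factor parameter are made up for illustration, and it assumes the data is non-negative (as probabilities are).

```python
import numpy as np

def replace_zeroes_copy(data, factor=0.1):
    """Return a copy with zeros replaced by factor * (smallest non-zero value).

    Hypothetical helper; assumes non-negative input such as probabilities.
    """
    data = np.asarray(data, dtype=float)
    nonzero = data[data != 0]
    if nonzero.size == 0:
        raise ValueError("array contains no non-zero values")
    floor = nonzero.min() * factor
    # np.where builds a new array, so the caller's data is not modified
    return np.where(data == 0, floor, data)

prob = np.array([1.0, 0.0, 0.75, 0.75])  # sample data for illustration
safe = replace_zeroes_copy(prob)
result = np.log10(safe)  # no zeros left, so no divide-by-zero warning
print(result)
```

Whether a floor derived from the data is appropriate depends on what the log values are used for; a fixed floor such as 1e-10 is the other common choice.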
You can turn the divide-by-zero warning off with seterr
numpy.seterr(divide = 'ignore')
and back on with
numpy.seterr(divide = 'warn')
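As an alternative to switching the setting off and on by hand, numpy also provides the errstate context manager, which restores the previous settings when the block exits. A minimal sketch, with made-up sample data:

```python
import numpy as np

prob = np.array([1.0, 0.0, 0.75, 0.75])  # sample data for illustration

# The divide-by-zero warning is suppressed only inside this block;
# the previous error settings are restored when the block exits.
with np.errstate(divide="ignore"):
    result = np.where(prob > 0, np.log10(prob), -10.0)

print(result)
```

This keeps the suppression local, so a genuine divide-by-zero elsewhere in the program would still warn.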
This solution worked for me: use np.seterr to turn the warnings off, followed by np.where
np.seterr(divide = 'ignore')
df_train['feature_log'] = np.where(df_train['feature']>0, np.log(df_train['feature']), 0)
Just use the where argument in np.log10
import numpy as np
np.random.seed(0)
prob = np.random.randint(5, size=4) /4
print(prob)
result = np.where(prob > 0.0000000001, prob, -10)
# print(result)
np.log10(result, out=result, where=result > 0)
print(result)
Output
[1. 0. 0.75 0.75]
[ 0. -10. -0.12493874 -0.12493874]
numpy.log10(prob) calculates the base 10 logarithm for all elements of prob, even the ones that aren't selected by the where. If you want, you can fill the zeros of prob with 10**-10 or some dummy value before taking the logarithm to get rid of the problem. (Make sure you don't compute prob > 0.0000000001 with dummy values, though.)
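A rough sketch of that ordering: compute the mask on the original values first, then fill in the dummy value, then take the logarithm. The names threshold, mask and filled are chosen here for illustration only.

```python
import numpy as np

prob = np.array([1.0, 0.0, 0.75, 0.75])  # sample data for illustration

threshold = 1e-10
# Build the mask from the ORIGINAL values, before any dummy value is written
mask = prob > threshold
# Fill the non-selected entries with a dummy value so log10 never sees a zero
filled = np.where(mask, prob, threshold)
# Take the log of the filled array, then overwrite the dummy positions with -10
result = np.where(mask, np.log10(filled), -10.0)

print(result)
```

Because filled contains no zeros, np.log10 does not emit a warning here, and the mask still reflects which entries were real probabilities.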