When I do:
summing += yval * np.log(sigmoid(np.dot(w.transpose(), xi.transpose()))) + (1-yval)* np.log(1-sigmoid(np.dot(w.transpose(), xi.transpose())))
Even though it's late, this answer might help someone else.
In this part of your code:
... + (1-yval)* np.log(1-sigmoid(np.dot(w.transpose(), xi.transpose())))
it may be that np.dot(w.transpose(), xi.transpose()) is producing large values (above 40 or so), so that sigmoid() outputs exactly 1. You are then basically taking np.log of 1-1, that is, of 0, and as DevShark has mentioned above, that causes the RuntimeWarning: Divide by zero... error.
How I came up with the number 40, you might ask: it's just that for inputs above 40 or so, the sigmoid function in Python (NumPy) returns exactly 1.0.
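Here is a quick REPL check of that saturation (a minimal sketch; sigmoid is defined here as the usual 1/(1+exp(-z)), and the exact warning text may vary between NumPy versions):
>>> import numpy as np
>>> sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
>>> sigmoid(40.0)
1.0
>>> np.log(1 - sigmoid(40.0))
__main__:1: RuntimeWarning: divide by zero encountered in log
-inf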
Looking at your implementation, it seems you're dealing with the logistic regression algorithm, in which case (I'm under the impression that) feature scaling is very important, as sketched below.
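A minimal sketch of feature scaling (standardization), assuming X is your (n_samples, n_features) feature matrix; the toy values here are made up:
import numpy as np

# hypothetical feature matrix, one row per training example
X = np.array([[1.0, 2000.0],
              [2.0, 3000.0],
              [3.0, 4000.0]])

# standardize each column to zero mean and unit variance, so that
# np.dot(w.transpose(), xi.transpose()) stays in a moderate range
# and sigmoid() doesn't saturate at exactly 0 or 1
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)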
Since this is my first answer, it's possible I've violated some rules or conventions; if that is the case, I'd like to apologise.
That's the warning you get when you try to evaluate log with 0:
>>> import numpy as np
>>> np.log(0)
__main__:1: RuntimeWarning: divide by zero encountered in log
I agree it's not very clear.
So in your case, I would check why your input to log is 0.
PS: this is on numpy 1.10.4
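One way to track down where the 0 comes from (a sketch; preds is a hypothetical array standing in for your sigmoid outputs):
>>> import numpy as np
>>> preds = np.array([0.2, 1.0, 0.7])   # hypothetical sigmoid outputs
>>> np.where(1 - preds == 0)[0]         # indices where log(1 - preds) blows up
array([1])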
Try adding a very small value, e.g. 1e-7, to the input of the log. For example, the sklearn library has an eps parameter for its log_loss function.
https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/discussion/48701
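The same idea in plain NumPy (a minimal sketch; the eps value and preds array are just examples):
import numpy as np

eps = 1e-7
preds = np.array([0.0, 0.5, 1.0])       # hypothetical sigmoid outputs
clipped = np.clip(preds, eps, 1 - eps)  # keep log's input away from 0 and 1
safe_log = np.log(clipped)              # no divide-by-zero warning now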
I had this same problem. It looks like you're trying to do logistic regression. I was doing MULTI-CLASS classification with logistic regression, but you need to solve this problem using the ONE VS ALL approach (google for details).
If you don't recode your yval variable so that it has only '1' and '0' (instead of yval = [1,2,3,4,...] etc.), you will get negative costs, which lead to a runaway theta and eventually to taking log(y) with y close to zero.
The fix is to pre-treat your yval variable so that it contains only '1' for positive examples and '0' for negative ones, as in the sketch below.
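A minimal sketch of that pre-treatment (positive_class is a stand-in for whichever class the current one-vs-all pass treats as positive):
import numpy as np

yval = np.array([1, 2, 3, 4, 2, 1])   # original multi-class labels
positive_class = 2                    # class treated as '1' in this pass
# recode: 1 for the positive class, 0 for everything else
yval_binary = (yval == positive_class).astype(int)
print(yval_binary)                    # -> [0 1 0 0 1 0]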