I'm working on doing a logistic regression using MATLAB for a simple classification problem. My covariate is one continuous variable ranging between 0 and 1, while my categorical response is binary (0 or 1).
It sounds like your data may be linearly separable. In short, since your input data are one-dimensional, that means there is some value $x_{div}$ such that all values of $x < x_{div}$ belong to one class (say $y = 0$) and all values of $x > x_{div}$ belong to the other class ($y = 1$).

If your data were two-dimensional, this would mean you could draw a line through your two-dimensional space $X$ such that all instances of a particular class fall on one side of the line.
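If you want to check this quickly, here is a minimal sketch (assuming `x` is your covariate vector and `y` your 0/1 response vector, with both classes present):

```matlab
% The data are linearly separable in 1-D if every x with y == 0
% sits entirely below every x with y == 1, or vice versa.
isSeparable = max(x(y == 0)) < min(x(y == 1)) || ...
              max(x(y == 1)) < min(x(y == 0));
```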
This is bad news for logistic regression (LR) as LR isn't really meant to deal with problems where the data are linearly separable.
Logistic regression is trying to fit a function of the following form:

$$y = \frac{1}{1 + e^{-(w_0 + w_1 x)}}$$

This only returns values of $y = 0$ or $y = 1$ in the limit, as the expression inside the exponential in the denominator goes to negative infinity or infinity respectively.
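For concreteness, that function in MATLAB for a single covariate looks like the following sketch (`w0` and `w1` are my names for the intercept and slope weights, not anything MATLAB defines):

```matlab
% Logistic function of a single covariate x.
% The output approaches 0 as (w0 + w1*x) -> -Inf and 1 as it -> +Inf,
% but never reaches either value for finite weights.
sigmoid = @(x, w0, w1) 1 ./ (1 + exp(-(w0 + w1 .* x)));
```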
Now, because your data are linearly separable and MATLAB's logistic regression routine attempts to find a maximum-likelihood fit, you will get extreme weight values: on separable data the likelihood keeps increasing as the weight magnitudes grow, so the optimizer drives the weights toward infinity rather than converging to a finite fit.
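You can reproduce this with `glmfit` from the Statistics and Machine Learning Toolbox; the data below are made up just to be separable, and the 0.5 split point is arbitrary:

```matlab
% Toy separable data: every x below 0.5 is class 0, everything above is class 1.
x = [0.1; 0.2; 0.3; 0.7; 0.8; 0.9];
y = [0; 0; 0; 1; 1; 1];

% Maximum-likelihood logistic fit; with separable data the weights grow
% until glmfit hits its iteration limit, so expect huge coefficient
% magnitudes (and typically a warning).
b = glmfit(x, y, 'binomial', 'Link', 'logit');
disp(b)   % b(1) is the intercept, b(2) the slope
```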
This isn't necessarily a solution, but try flipping the label on just one of your data points (i.e., for some index `t` where `y(t) == 0`, set `y(t) = 1`). This makes your data no longer linearly separable, and the learned weight values will be dragged dramatically closer to zero.
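Continuing the sketch above, the flip-and-refit looks like this (which point you flip is arbitrary; I just take the first class-0 index):

```matlab
% Flip the label on one class-0 observation so the classes overlap.
t = find(y == 0, 1);    % index of the first y == 0 point
y(t) = 1;

% Refit: the data are no longer separable, so the maximum-likelihood
% weights are finite and much smaller in magnitude.
b2 = glmfit(x, y, 'binomial', 'Link', 'logit');
disp(b2)
```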