Statsmodels logistic regression convergence problems

Submitted by 大憨熊 on 2020-01-02 13:37:12

Question


I'm trying to fit a logistic regression in statsmodels on a large design matrix (~200 columns). The features include a number of interactions, categorical features, and semi-sparse (70%) integer features. Although my design matrix is not actually ill-conditioned, it seems to be somewhat close: according to numpy.linalg.matrix_rank, it is full-rank with tol=1e-3 but not with tol=1e-2. As a result, I'm struggling to get the logistic regression to converge with any of the methods in statsmodels. Here's what I've tried so far:
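The kind of rank check described above can be sketched with a small synthetic matrix (the real ~200-column design matrix isn't shown, so the data below is purely illustrative): a column that is almost a linear combination of two others makes the matrix full-rank at a strict tolerance but rank-deficient at a looser one.

```python
import numpy as np

# Hypothetical stand-in for the design matrix described in the question:
# a matrix whose last column is nearly a linear combination of the first two.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
X[:, 4] = X[:, 0] + X[:, 1] + 3e-4 * rng.normal(size=500)

print(np.linalg.matrix_rank(X, tol=1e-3))  # 5: full rank at this tolerance
print(np.linalg.matrix_rank(X, tol=1e-2))  # 4: loses rank at a looser tolerance
print(np.linalg.cond(X))                   # large condition number
```

Since matrix_rank counts singular values above tol, a matrix like this sits right in the gray zone the question describes: numerically full-rank, but with one singular value small enough to destabilize downstream linear algebra.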

  • method='newton': Did not converge after 1000 iterations, then raised a singular-matrix LinAlgError while trying to invert the Hessian.

  • method='bfgs': Warned of possible precision loss, then claimed convergence after 0 iterations; it clearly had not actually converged.

  • method='nm': Claimed to have converged, but the model had a negative pseudo-R-squared, and many coefficients were still zero (and very different from the values they converged to on better-conditioned submodels). I tried cranking xtol down to 1e-8, to no avail.

  • fit_regularized(method='l1'): Reported "Inequality constraints incompatible" (Exit mode 4), then raised a singular-matrix LinAlgError while trying to compute the restricted Hessian inverse.
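The Newton failure in the first bullet is consistent with an ill-conditioned Hessian. A numpy-only sketch (not the asker's actual model; the data and column setup are made up) shows how the logistic-regression Hessian X' W X inherits the design matrix's near rank-deficiency, so each Newton step has to invert a nearly singular matrix:

```python
import numpy as np

# Illustrative design: an intercept, three random features, and one
# near-duplicate column mimicking an almost-collinear design matrix.
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
X = np.column_stack([X, X[:, 1] + 1e-6 * rng.normal(size=n)])

beta = np.zeros(X.shape[1])
p = 1.0 / (1.0 + np.exp(-X @ beta))   # fitted probabilities (all 0.5 at beta=0)
W = p * (1.0 - p)                     # per-observation weights in the Hessian
H = X.T @ (W[:, None] * X)            # negative Hessian of the log-likelihood

print(np.linalg.cond(H))  # enormous -> Newton updates are numerically unstable
```

When the condition number gets this large, inverting H (as method='newton' does internally) can raise exactly the singular-matrix LinAlgError described above.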

Source: https://stackoverflow.com/questions/27413873/statsmodels-logistic-regression-convergence-problems
