I have been racking my brains over this problem for the past 2-3 weeks. I have a multi-label (not multi-class) problem where each sample can belong to several labels.
The partial_fit() method was recently added to sklearn, so hopefully it should be available in the upcoming release (it's in the master branch already).
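
As a rough sketch of what incremental training can look like for a multi-label problem, the snippet below trains one binary linear classifier per label and feeds it mini-batches through partial_fit(). It isn't stated above which estimator gained partial_fit(), so the use of SGDClassifier (which already exposes partial_fit()), the toy data, and the batch size are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Toy multi-label data (hypothetical): 200 samples, 5 features, 3 labels.
rng = np.random.RandomState(0)
X = rng.randn(200, 5)
# Label j is on when feature j is positive, so a sample can carry several labels.
Y = (X[:, :3] > 0).astype(int)

n_labels = Y.shape[1]
# One independent binary classifier per label (assumption: SGDClassifier).
models = [SGDClassifier(random_state=0) for _ in range(n_labels)]

# Feed the data in mini-batches, as if it did not fit in memory at once.
batch_size = 50
for epoch in range(5):
    for start in range(0, X.shape[0], batch_size):
        X_batch = X[start:start + batch_size]
        Y_batch = Y[start:start + batch_size]
        for j, model in enumerate(models):
            # `classes` has to list every class value on the first call,
            # because a single batch may not contain both 0 and 1.
            model.partial_fit(X_batch, Y_batch[:, j], classes=np.array([0, 1]))

# Stack the per-label predictions back into an indicator matrix.
Y_pred = np.column_stack([m.predict(X) for m in models])
print(Y_pred.shape)
```

Training the labels independently ignores any correlation between them, but it keeps memory use bounded by the batch size, which is the point of partial_fit().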
The size of your problem makes it attractive to tackle with neural networks. Have a look at magpie; it should give much better results than linear classifiers.