I'm trying to do binary text classification, but I don't think having millions of vectors of length >1k is a good idea. So, which alternatives are there for the bas