I have a lot of documents (~100k) that I need to cluster into different categories (receipts, invoices, welcome letters, etc.), and I have been trying different methods to do so.