I am working on a project on automatic document classification. Can someone please help me understand the steps involved in document classification? Should we classify the d