I have a project where the structured data is stored in a database, and a significant amount of unstructured data is generated in the form of PDFs, images and text fi