I am trying to extract text only from a number of different coduments (rtf doc pdf). I naturally turned to Apache Tika because it can autodetect the document and extract text ac
So I fudged a workaround and just called System.gc(); everytime the file had finished being processed which works a treat but doesn't really answer the question.