Remove PDFont caching with Apache Tika

囚心锁ツ 2021-01-28 01:57

I am trying to extract only the text from a number of different documents (RTF, DOC, PDF). I naturally turned to Apache Tika because it can autodetect the document and extract text ac

1 Answer

清酒与你 (original poster)

2021-01-28 02:21

So I fudged a workaround and just called System.gc() every time a file had finished being processed, which works a treat but doesn't really answer the question.
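A minimal sketch of that workaround, for anyone who wants to try it. The class name GcAfterEachFile, the method extractAll, and the extractor parameter are all hypothetical names made up for illustration; the extractor stands in for whatever Tika call actually pulls the text out (for example a wrapper around the Tika facade's parseToString), which is not shown here. Note that System.gc() is only a request to the JVM, so this may behave differently depending on the JVM and its settings.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class GcAfterEachFile {
    /**
     * Applies the extractor to each file in order and requests a garbage
     * collection after every file. The extractor is a hypothetical stand-in
     * for the real text-extraction call.
     */
    static List<String> extractAll(List<Path> files, Function<Path, String> extractor) {
        List<String> texts = new ArrayList<>();
        for (Path file : files) {
            texts.add(extractor.apply(file));
            // The workaround from the answer: ask for a collection once the
            // file is done. This is a hint only; the JVM may ignore it.
            System.gc();
        }
        return texts;
    }
}
```

It could be called as GcAfterEachFile.extractAll(files, myExtractor), where myExtractor is any function from a Path to the extracted text. This only mirrors the workaround described above; it does not turn off the font caching itself.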
