How do we create a simple search engine using Lucene, Solr or Nutch?

孤城傲影 2021-02-15 11:49

Our company has thousands of PDF documents. How do we create a simple search engine using Lucene, Solr or Nutch? We'll provide a basic Java/JSP web page where people can type

10 answers

执念已碎 (OP)

2021-02-15 12:44

None of the projects in the Lucene family can process PDFs natively, but there are utilities you can drop in, as well as well-written examples of how to roll your own.

Lucene will do pretty much whatever you need it to do, but it costs you time to set up and maintain, as Tony said above. A few thousand documents really isn't that many, so you might be able to get away with a lighter-weight alternative.
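To give a rough idea of what "lighter-weight" could mean at this scale, here is a minimal sketch of an in-memory inverted index in plain Java. It is not Lucene and not a replacement for it: it assumes the text has already been extracted from each PDF by some other tool, it splits on ASCII non-word characters only, and it does no ranking or stemming. The class name `TinyIndex` and the file names are made up for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: a toy inverted index, assuming PDF text was extracted elsewhere.
final class TinyIndex {
    // term -> ids of the documents that contain it
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Lower-case the text, split on non-word characters, record each term.
    void add(String docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the ids of documents containing every term of the query (AND semantics).
    Set<String> search(String query) {
        Set<String> result = null;
        for (String term : query.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) {
                continue;
            }
            Set<String> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);   // copy so the index is not modified
            } else {
                result.retainAll(docs);         // keep only docs that match all terms so far
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("report-2020.pdf", "Annual report for the year 2020");
        index.add("memo.pdf", "Internal memo about the annual party");
        System.out.println(index.search("annual report")); // prints [report-2020.pdf]
    }
}
```

Something like this is fine for exact-word lookups over a small collection, but once you want relevance ranking, phrase queries or highlighting you are back to needing Lucene or Solr.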

That said, I would still recommend looking at Solr - it's much, much easier to set up than Lucene, has support for backups, replication, etc., as well as a nifty JSON interface which would fit your use case very well: http://wiki.apache.org/solr/SolJSON
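As a rough sketch of how a Java/JSP page might talk to that JSON interface, the snippet below only builds the request URL for Solr's `/select` handler with `wt=json`. The host, the core name `pdfs` and the helper class `SolrJsonQueryUrl` are assumptions for illustration; the URL layout with a core name follows recent Solr versions, so adjust it to your installation.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustrative only: builds the URL of a Solr /select request that asks for JSON.
final class SolrJsonQueryUrl {

    // baseUrl and core are placeholders for wherever your Solr instance lives.
    static String build(String baseUrl, String core, String userQuery, int rows) {
        // URL-encode the user's input so spaces and special characters are safe in the URL.
        String q = URLEncoder.encode(userQuery, StandardCharsets.UTF_8);
        return baseUrl + "/" + core + "/select?q=" + q + "&rows=" + rows + "&wt=json";
    }

    public static void main(String[] args) {
        // prints http://localhost:8983/solr/pdfs/select?q=annual+report&rows=10&wt=json
        System.out.println(build("http://localhost:8983/solr", "pdfs", "annual report", 10));
    }
}
```

The page would then fetch that URL with any HTTP client and render the documents from the JSON response. Note that URL-encoding does not escape Solr's own query syntax, so raw user input can still be interpreted as query operators.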
