Our company has thousands of PDF documents. How do we create a simple search engine using Lucene, Solr or Nutch? We'll provide a basic Java/JSP web page where people can type
Nutch + Lucene, with the PDF plugin enabled in Nutch, is your solution. Enabling the PDF plugin lets Nutch parse PDFs during the crawl.
Lucene then indexes the crawled and parsed data, and Nutch includes a servlet that gives you a search interface.
We use the same setup on our internal LANs.
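
If it helps to see what "indexing the parsed data" means, here is a toy sketch in plain Java. It is not Lucene's or Nutch's API. The class name `TinyIndex` and the sample file names are made up for illustration, and it assumes the text has already been extracted from the PDFs. It only shows the inverted-index idea that a real Lucene index builds on: map each term to the documents containing it, then intersect those sets at query time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: NOT Lucene's API, only an illustration of the idea.
class TinyIndex {
    // term -> names of the documents that contain it (kept sorted)
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index text that is assumed to be already extracted from a PDF.
    void add(String docName, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (term.isEmpty()) continue;
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docName);
        }
    }

    // Return the documents that contain every term of the query (AND semantics).
    List<String> search(String query) {
        Set<String> result = null;
        for (String term : query.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (term.isEmpty()) continue;
            Set<String> docs = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);   // first term: start from its documents
            } else {
                result.retainAll(docs);         // later terms: keep only common documents
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        // Hypothetical documents and contents, for demonstration only.
        index.add("handbook.pdf", "Employee handbook: vacation policy");
        index.add("report.pdf", "Annual report and travel policy");
        System.out.println(index.search("policy"));          // [handbook.pdf, report.pdf]
        System.out.println(index.search("vacation policy")); // [handbook.pdf]
    }
}
```

A real setup would leave tokenizing, ranking and on-disk storage to Lucene. The sketch is only meant to make the crawl, parse, index, search pipeline less abstract.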