Our company has thousands of PDF documents. How do we create a simple search engine using Lucene, Solr or Nutch? We'll provide a basic Java/JSP web page where people can type
A great free search technology you might look at is the IBM Yahoo! free search (IBM OmniFind Yahoo! Edition). I'm not sure whether they followed through on plans to use Lucene under the covers, but it remains one of the really great, easy-to-use free search technologies. It handles up to 500K documents, I believe, and it supports PDF and other non-text formats as well. It has a graphical user interface, easily customizable search results, and basic search analytics. There is also a basic thesaurus and a powerful API, so you can do pretty much whatever you want if the out-of-the-box results are not to your liking. We've suggested this to a number of clients who had fewer than half a million documents, and they love it.