Indexing PDF with Solr

一向 2020-12-31 05:46

Can anyone point me to a tutorial?

My main experience with Solr is indexing CSV files. But I cannot find any simple instructions/tutorial to tell me what I need to d

6 answers
  •  野趣味
    野趣味 (original poster)
    2020-12-31 05:55

    Use Solr's ExtractingRequestHandler. It uses Apache Tika to parse the PDF file. I believe it can also pull out the document's metadata, and you can pass in your own metadata as well. See the Solr documentation for the Extracting Request Handler.
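As a rough illustration of the answer above, here is a minimal Python sketch that posts a PDF to the ExtractingRequestHandler over HTTP. It assumes the handler is enabled on your core at its usual `/update/extract` path and that Solr is reachable at a base URL such as `http://localhost:8983/solr`; the core name, document id, and the `author` field are hypothetical placeholders, and the helper names `build_extract_url` and `index_pdf` are made up for this example. The `literal.<field>` parameters are how your own metadata is passed alongside what Tika extracts.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_url(solr_base, core, doc_id, literals=None, commit=True):
    """Build the /update/extract URL for one document.

    `literals` maps field names to values; each becomes a `literal.<field>`
    parameter, which Solr stores as-is next to the extracted content.
    """
    params = {"literal.id": doc_id}
    for field, value in (literals or {}).items():
        params["literal." + field] = value
    if commit:
        # Commit right away so the document is searchable immediately.
        params["commit"] = "true"
    return f"{solr_base.rstrip('/')}/{core}/update/extract?{urlencode(params)}"


def index_pdf(pdf_path, solr_base, core, doc_id, literals=None):
    """Send the raw PDF bytes to Solr and return the HTTP status code."""
    with open(pdf_path, "rb") as fh:
        body = fh.read()
    request = Request(
        build_extract_url(solr_base, core, doc_id, literals),
        data=body,
        headers={"Content-Type": "application/pdf"},
        method="POST",
    )
    with urlopen(request) as response:
        return response.status
```

A call might look like `index_pdf("report.pdf", "http://localhost:8983/solr", "mycore", "doc1", {"author": "Jane Doe"})`, where the file, core, and field names are again placeholders for your own setup. Committing on every request is convenient for a first test but is probably not what you want for bulk loads.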
