Question
All of my XML files are stored on one server, and I have installed and configured Solr on a different server. How can I index those XML files into Solr? I looked at Nutch, but its main purpose is to crawl HTML pages and index them, and I don't need to crawl: all the files sit under a specific path on the other server. I just need to index those XML files in Solr. I have installed and configured Solr 4.
If anyone has done something like this, please let me know how to do it. Thank you.
Answer 1:
Why not mount the other server's drive on your Solr server and do something like:
java -jar post.jar "Z:\home\data\delivery\textarticles.xml"
post.jar is in the exampledocs folder. You could also use it as an example application and build your own application to post those XML files from the other server.
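If you go the "build your own application" route, a minimal sketch in Python could look like the following. It assumes the files are already in Solr's XML update format (an `<add>` element wrapping `<doc>` elements), which is also what post.jar expects, and that the update handler is reachable at a URL such as http://localhost:8983/solr/update. The function names, the mount path, and the URL are placeholders for illustration, not part of Solr.

```python
import urllib.parse
import urllib.request
from pathlib import Path


def find_xml_files(root):
    """Return every *.xml path under root (searched recursively), sorted."""
    return sorted(Path(root).rglob("*.xml"))


def build_update_request(update_url, xml_bytes, commit=True):
    """Build (but do not send) a POST request for Solr's XML update handler."""
    if commit:
        # commit=true asks Solr to make the posted documents searchable.
        update_url += "?" + urllib.parse.urlencode({"commit": "true"})
    return urllib.request.Request(
        update_url,
        data=xml_bytes,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def post_directory(root, update_url):
    """Send each XML file under root to Solr, committing only with the last one."""
    files = find_xml_files(root)
    for index, path in enumerate(files):
        is_last = index == len(files) - 1
        request = build_update_request(update_url, path.read_bytes(), commit=is_last)
        with urllib.request.urlopen(request) as response:
            print(path, response.status)


# Example call (placeholder path and URL, adjust both to your setup):
# post_directory("/mnt/delivery", "http://localhost:8983/solr/update")
```

Committing once at the end rather than per file keeps the number of commits low when there are many files; whether that matters depends on how many files you have.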
Answer 2:
Take a look at the DataImportHandler. I think you should be able to access a network file if it has the proper permissions set up.
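Once a DataImportHandler is configured, an import is started with a plain HTTP request. The sketch below only shows that trigger step, not the handler configuration itself. It assumes the handler is registered under the path /dataimport in solrconfig.xml, as in the Solr example configuration; the function names and the base URL are made up for illustration.

```python
import urllib.parse
import urllib.request


def dataimport_url(core_url, command, **params):
    """Build the URL for a DataImportHandler command such as full-import or status."""
    query = {"command": command}
    query.update(params)
    return core_url.rstrip("/") + "/dataimport?" + urllib.parse.urlencode(query)


def run_command(core_url, command, **params):
    """Send the command to Solr and return the raw response body as text."""
    url = dataimport_url(core_url, command, **params)
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


# Example calls (placeholder URL):
# run_command("http://localhost:8983/solr", "full-import", clean="false")
# run_command("http://localhost:8983/solr", "status")
```

The import runs asynchronously on the Solr side, so the status command is the way to check whether it has finished.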
Answer 3:
Based on your comment on Shane Alexander's answer, you will need to use the URLDataSource option of the DataImportHandler to retrieve the file via a URL. Additionally, you will need to incorporate the patch from SOLR-1490 to allow for authentication support.
Source: https://stackoverflow.com/questions/14489450/how-can-i-do-indexing-xml-files-stored-on-other-server-in-solr4