I am new to Solr and MongoDB.
I am trying to index data from MongoDB into Solr using DataImportHandler, but I could not find the exact steps to follow.
Could you please help me with the exact steps to index MongoDB data into Solr using DataImportHandler?
Solr version: solr-4.6.0
MongoDB version: 2.2.7
Late to answer, but I thought people might find it useful.
Below are the steps for importing data from MongoDB into Solr 4.7.0 using DataImportHandler.
Step 1:
Assume that your MongoDB has the following database and collection:
Database Name: Test
Collection Name: sample
The sample collection has the following documents:
db.sample.find()
{ "_id" : ObjectId("54c0c6666ee638a21198793b"), "Name" : "Rahul", "EmpNumber" : 452123 }
{ "_id" : ObjectId("54c0c7486ee638a21198793c"), "Name" : "Manohar", "EmpNumber" : 784521 }
Step 2:
Create a lib folder in your solrhome folder (which has the bin and collection1 folders) and add the jar files below to the lib folder. You can download solr-mongo-importer from here!
- solr-dataimporthandler-4.7.0.jar
- solr-mongo-importer-1.0.0.jar
- mongo-java-driver-2.10.1.jar (this is the mongo java driver)
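Before restarting Solr, it can help to confirm that all three jars really are in the lib folder. The following is a minimal sketch of such a check; the function name `missing_jars` and the example path are hypothetical, and the jar names are simply the ones listed above.

```python
from pathlib import Path

# Jar names taken from the list in Step 2; adjust the versions to match yours.
REQUIRED_JARS = [
    "solr-dataimporthandler-4.7.0.jar",
    "solr-mongo-importer-1.0.0.jar",
    "mongo-java-driver-2.10.1.jar",
]

def missing_jars(lib_dir, required=REQUIRED_JARS):
    """Return the required jar names that are not present in lib_dir."""
    lib = Path(lib_dir)
    return [name for name in required if not (lib / name).is_file()]
```

For example, `missing_jars("/path/to/solrhome/lib")` (a placeholder path) returns an empty list when every jar is in place.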
Step 3:
Declare the Solr fields in schema.xml (assuming id is already defined by default).
Add the fields below inside the <fields> </fields> tag of schema.xml.
<field name="Name" type="text_general" indexed="true" stored="true"/>
<field name="EmployeeNumber" type="int" indexed="true" stored="true"/>
Step 4:
Declare the data-config file in solrconfig.xml by adding the code below inside the <config> </config> tag.
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</requestHandler>
Step 5:
Create a data-config.xml file in the path collection1\conf\ (which by default holds solrconfig.xml and schema.xml)
data-config.xml
<?xml version="1.0"?>
<dataConfig>
<dataSource name="MyMongo" type="MongoDataSource" database="Test" />
<document name="import">
<!-- if query="" then it imports everything -->
<entity processor="MongoEntityProcessor"
query="{Name:'Rahul'}"
collection="sample"
datasource="MyMongo"
transformer="MongoMapperTransformer" name="sample_entity">
<!-- If the mongoField name and the field declared in schema.xml are the same, there is no need to declare it below.
If they differ, you have to map the mongoField to the field in schema.xml
(e.g. mongoField="EmpNumber" to name="EmployeeNumber"). -->
<field column="_id" name="id"/>
<field column="EmpNumber" name="EmployeeNumber" mongoField="EmpNumber"/>
</entity>
</document>
</dataConfig>
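To make the effect of this configuration concrete, here is a small plain-Python model of what the entity does with the sample documents: filter by the query, then rename the mapped fields. It is an illustration only, not the importer's code. The helper names (`matches`, `to_solr_doc`, `import_docs`) are hypothetical, the `_id` values are shown as plain strings, and only simple equality queries are modelled.

```python
# Illustration only: a plain-Python model of the mapping, not the importer's code.
SAMPLE = [
    {"_id": "54c0c6666ee638a21198793b", "Name": "Rahul", "EmpNumber": 452123},
    {"_id": "54c0c7486ee638a21198793c", "Name": "Manohar", "EmpNumber": 784521},
]

# Mongo field name -> Solr field name, as declared by the <field> elements.
FIELD_MAP = {"_id": "id", "EmpNumber": "EmployeeNumber"}

def matches(doc, query):
    """Equality-only match; an empty query matches every document."""
    return all(doc.get(key) == value for key, value in query.items())

def to_solr_doc(doc, field_map=FIELD_MAP):
    """Rename mapped fields; unmapped fields (e.g. Name) keep their name."""
    return {field_map.get(key, key): value for key, value in doc.items()}

def import_docs(docs, query):
    """Return the Solr-shaped documents for every Mongo document matching query."""
    return [to_solr_doc(doc) for doc in docs if matches(doc, query)]

print(import_docs(SAMPLE, {"Name": "Rahul"}))
```

With the query `{"Name": "Rahul"}` only the first sample document is selected; with an empty query both are.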
Step 6:
Assuming Solr (I have used port 8080) and MongoDB are running, open the following link in your browser to import data from MongoDB into Solr: http://localhost:8080/solr/dataimport?command=full-import
The fields imported are _id, Name and EmpNumber (MongoDB) as id, Name and EmployeeNumber (Solr). Note that with the query used in data-config.xml, only the document whose Name is 'Rahul' is imported.
You can see the result at http://localhost:8080/solr/query?q=*
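If you would rather trigger the import from a script than from the browser, the same URL can be built and requested programmatically. This is a minimal sketch that assumes the host, port and handler path used in Step 6; the helper names `dataimport_url` and `fetch` are hypothetical, and `fetch` only works when Solr is actually running.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Solr listens on localhost:8080 as in Step 6; adjust to your setup.
SOLR_BASE = "http://localhost:8080/solr"

def dataimport_url(command, base=SOLR_BASE, **params):
    """Build a DataImportHandler URL such as .../dataimport?command=full-import."""
    query = {"command": command}
    query.update(params)
    return f"{base}/dataimport?{urlencode(query)}"

def fetch(url):
    """Send a GET request and return the response body (needs a running Solr)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")

# Building the URLs does not contact Solr; pass them to fetch() to run them.
print(dataimport_url("full-import"))
print(dataimport_url("status"))
```

Calling `fetch(dataimport_url("full-import"))` starts the import, and `fetch(dataimport_url("status"))` can then be used to check on its progress.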
You can try using SolrMongoImporter. It asks you to import two libraries into your Solr project and create a data-config.xml.
You will probably need to import the following libraries in your solrconfig.xml if you don't have them already:
<lib dir="../../../contrib/dataimporthandler/lib" regex=".*\.jar" />
<lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
Source: https://stackoverflow.com/questions/21450555/steps-to-connect-mongodb-and-solr-using-dataimporthandler