Question
I am looking for a way to sync collections in MongoDB with Elasticsearch (ES). The goal is to keep MongoDB as the primary data source and use ES as a full-text search engine. (The business logic of my project is written in Python.)
Several approaches are available online:
- mongo-connector
- River plugin
- logstash-input-mongodb (Logstash plugin); see similar question
- Transporter
However, most of these suggestions are several years old, and I could not find any solution that supports the current version of ES (7.4.0). Is anyone using such a setup? Do you have any suggestions?
I have thought about dropping MongoDB as the primary data source and just using ES for both storing and searching. However, I have read that ES should not be used as a primary data store.
Edit
Thank you @gurdeep.sabarwal. I followed your approach. However, I still do not manage to sync MongoDB to ES. My configuration looks like this:
input {
  jdbc {
    # jdbc_driver_library => "/usr/share/logstash/mongodb-driver-3.11.0-source.jar"
    jdbc_driver_library => "/usr/share/logstash/mongojdbc1.5.jar"
    # jdbc_driver_library => "/usr/share/logstash/mongodb-driver-3.11.1.jar"
    # jdbc_driver_class => "mongodb.jdbc.MongoDriver"
    # jdbc_driver_class => "Java::com.mongodb.MongoClient"
    jdbc_driver_class => "Java::com.dbschema.MongoJdbcDriver"
    # jdbc_driver_class => "com.dbschema.MongoJdbcDriver"
    # jdbc_driver_class => ""
    jdbc_connection_string => "jdbc:mongodb://<myserver>:27017/<mydb>"
    jdbc_user => "user"
    jdbc_password => "pw"
    statement => "db.getCollection('mycollection').find({})"
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200/"]
    index => "myindex"
  }
}
This brings me a bit closer to my goal. However, I get the following error:
Error: Java::com.dbschema.MongoJdbcDriver not loaded. Are you sure you've included the correct jdbc driver in :jdbc_driver_library?
Exception: LogStash::ConfigurationError
Since that did not work, I also tried the commented-out variants, but did not succeed either.
Answer 1:
- Download https://dbschema.com/jdbc-drivers/MongoDbJdbcDriver.zip
- Unzip it and copy all the files into ~/logstash-7.4.2/logstash-core/lib/jars/ (see the shell sketch after the config below)
- Modify the config file (mongo-logstash.conf) as below:
- Run: ~/logstash-7.4.2/bin/logstash -f mongo-logstash.conf
- Success, please try it!
PS: this is my first answer on Stack Overflow :-)
input {
  jdbc {
    # NOT THIS: jdbc_driver_class => "Java::mongodb.jdbc.MongoDriver"
    jdbc_driver_class => "com.dbschema.MongoJdbcDriver"
    jdbc_driver_library => "mongojdbc1.5.jar"
    jdbc_user => ""       # no user and password
    jdbc_password => ""
    jdbc_connection_string => "jdbc:mongodb://127.0.0.1:27017/db1"
    statement => "db.t1.find()"
  }
}
output {
  #stdout { codec => dots }
  stdout { }
}
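For reference, the download-and-copy steps above might look like this in a shell session. This is only a sketch: it assumes Logstash 7.4.2 unpacked in your home directory and that the zip contains the driver jar and its dependencies at the top level.

curl -LO https://dbschema.com/jdbc-drivers/MongoDbJdbcDriver.zip
unzip MongoDbJdbcDriver.zip -d MongoDbJdbcDriver
# put the driver and its bundled dependencies where Logstash loads jars from
cp MongoDbJdbcDriver/*.jar ~/logstash-7.4.2/logstash-core/lib/jars/
~/logstash-7.4.2/bin/logstash -f mongo-logstash.conf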
Answer 2:
For the ELK stack, I have implemented this using the 1st and 2nd approaches. While doing research I came across multiple approaches, so you could pick any one of them, but my personal choice is the 1st or 2nd because they give you lots of options for customization.
If you need code, let me know and I can share a snippet of it. I don't want to make the answer too long!
1. Use the DbSchema JDBC jar (https://dbschema.com) to stream data from MongoDB to Elasticsearch.
a. The DbSchema JDBC jar is open source.
b. You can write native MongoDB queries or aggregation queries directly in Logstash.
Your pipeline may look like the following:
input {
  jdbc {
    jdbc_user => "user"
    jdbc_password => "pass"
    jdbc_driver_class => "Java::com.dbschema.MongoJdbcDriver"
    jdbc_driver_library => "mongojdbc1.2.jar"
    jdbc_connection_string => "jdbc:mongodb://user:pass@host1:27060/cdcsmb"
    statement => "db.product.find()"
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => "localhost:9200"
    index => "target_index"
    document_type => "document_type"
    document_id => "%{id}"
  }
}
2. Use the UnityJDBC jar (http://unityjdbc.com) to stream data from MongoDB to Elasticsearch.
a. You have to pay for the UnityJDBC jar.
b. You can write SQL-format queries in Logstash to get data from MongoDB.
Your pipeline may look like the following:
input {
  jdbc {
    jdbc_user => "user"
    jdbc_password => "pass"
    jdbc_driver_class => "Java::mongodb.jdbc.MongoDriver"
    jdbc_driver_library => "mongodb_unityjdbc_full.jar"
    jdbc_connection_string => "jdbc:mongodb://user:pass@host1:27060/cdcsmb"
    statement => "SELECT * FROM employee WHERE status = 'active'"
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => "localhost:9200"
    index => "target_index"
    document_type => "document_type"
    document_id => "%{id}"
  }
}
3. Use the logstash-input-mongodb plugin (https://github.com/phutchins/logstash-input-mongodb) to stream data from MongoDB to Elasticsearch; see the first sketch after this list.
a. Open source, kind of.
b. You get very few options for customization: it dumps the entire collection, and you cannot write queries or aggregation queries.
4. You can write your own program in Python or Java that connects to MongoDB and indexes the data into Elasticsearch, then use cron to schedule it; see the second sketch after this list.
5. You can use the Node.js Mongoosastic npm package (https://www.npmjs.com/package/mongoosastic); the only overhead is that it commits every change to both MongoDB and ES to keep them in sync.
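For option 3, a minimal pipeline might look like the following sketch, based on the plugin's documented options (untested here); the URI, collection name, and placeholder-database paths are placeholders:

input {
  mongodb {
    uri => "mongodb://127.0.0.1:27017/db1"           # placeholder connection URI
    placeholder_db_dir => "/opt/logstash-mongodb/"   # local state the plugin keeps
    placeholder_db_name => "logstash_sqlite.db"
    collection => "t1"                               # placeholder collection name
    batch_size => 5000
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "target_index"
  }
}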
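For option 4, a minimal one-shot sync in Python might look like the sketch below, assuming the official pymongo and elasticsearch client packages; the connection strings and the db1/t1/target_index names are placeholders.

# One-shot MongoDB -> Elasticsearch sync, meant to be run from cron.
# Placeholders: connection strings and database/collection/index names.
from pymongo import MongoClient
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

MONGO_URL = "mongodb://localhost:27017"
ES_URL = "http://localhost:9200"
DB, COLLECTION, INDEX = "db1", "t1", "target_index"

def actions(collection):
    """Yield one bulk-index action per MongoDB document."""
    for doc in collection.find({}):
        # Reuse Mongo's _id so repeated runs overwrite instead of duplicating.
        doc_id = str(doc.pop("_id"))
        # Note: nested BSON types (ObjectId, Decimal128, ...) may need converting first.
        yield {"_index": INDEX, "_id": doc_id, "_source": doc}

def main():
    mongo = MongoClient(MONGO_URL)
    es = Elasticsearch(ES_URL)
    ok, errors = bulk(es, actions(mongo[DB][COLLECTION]), raise_on_error=False)
    print(f"indexed {ok} documents, {len(errors)} errors")

if __name__ == "__main__":
    main()

A crontab entry such as */5 * * * * /usr/bin/python3 /path/to/mongo_es_sync.py would then re-run the sync every five minutes; the script path and interval are hypothetical and up to you.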
Answer 3:
Monstache seems like a good option too, as it supports the latest versions of both Elasticsearch and MongoDB: https://github.com/rwynn/monstache. A minimal config sketch follows.
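A minimal Monstache config.toml might look like the following sketch (untested; the URLs and the mydb.mycollection namespace are placeholders, and following a change stream requires MongoDB to run as a replica set):

# config.toml -- minimal Monstache sketch; URLs and namespace are placeholders
mongo-url = "mongodb://localhost:27017"
elasticsearch-urls = ["http://localhost:9200"]

# copy existing documents once, then tail the change stream
direct-read-namespaces = ["mydb.mycollection"]
change-stream-namespaces = ["mydb.mycollection"]

# remember progress so a restart resumes where it left off
resume = true

You would then run it with: monstache -f config.toml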
Source: https://stackoverflow.com/questions/58342818/sync-mongodb-to-elasticsearch