I'm creating a sort of background job queue system with MongoDB as the data store. How can I "listen" for inserts to a MongoDB collection before spawning workers to proce
Many of these answers will only give you new records and not updates, and/or are extremely inefficient.
The only reliable, performant way to do this is to create a tailable cursor on the oplog.rs collection in the local database, which gives you ALL changes to MongoDB to do with what you will. (MongoDB does more or less this internally to support replication!)
Explanation of what the oplog contains: https://www.compose.com/articles/the-mongodb-oplog-and-node-js/
Example of a Node.js library that provides an API around what is available to be done with the oplog: https://github.com/cayasso/mongo-oplog
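For a sense of what tailing the oplog involves, here is a minimal Python sketch. It assumes PyMongo 3 or later and a replica-set member at the hypothetical URI mongodb://localhost:27017; the helper names `is_insert_for` and `tail_oplog` and the `my_db.jobs` namespace are illustrative, not part of any library.

```python
import time


def is_insert_for(entry, namespace):
    """Return True if an oplog entry is an insert into `namespace`.

    Oplog entries carry the operation type in `op` ('i' insert, 'u' update,
    'd' delete, 'c' command, 'n' no-op) and "<db>.<collection>" in `ns`.
    """
    return entry.get("op") == "i" and entry.get("ns") == namespace


def tail_oplog(uri="mongodb://localhost:27017", namespace="my_db.jobs"):
    """Yield documents inserted into `namespace`, forever (needs a replica set)."""
    # Imported here so the pure helper above is usable without PyMongo installed.
    import pymongo

    oplog = pymongo.MongoClient(uri).local["oplog.rs"]
    # Start from the newest entry so only changes made from now on are seen.
    last = oplog.find_one(sort=[("$natural", pymongo.DESCENDING)])
    if last is None:
        raise RuntimeError("oplog is empty; is this a replica-set member?")
    ts = last["ts"]
    while True:
        cursor = oplog.find(
            {"ts": {"$gt": ts}},
            cursor_type=pymongo.CursorType.TAILABLE_AWAIT,
        )
        while cursor.alive:
            for entry in cursor:
                ts = entry["ts"]
                if is_insert_for(entry, namespace):
                    yield entry["o"]  # `o` holds the inserted document
            time.sleep(1)  # nothing new arrived; back off briefly
```

A worker loop would then be something like `for job in tail_oplog(): process(job)`, where `process` is whatever your queue does with a job.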
MongoDB has what are called capped collections and tailable cursors, which allow MongoDB to push data to listeners.
A capped collection is essentially a collection that is a fixed size and only allows insertions. Here's what it would look like to create one:
db.createCollection("messages", { capped: true, size: 100000000 })
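The same collection can be created from a driver. Below is a rough Python sketch that assumes PyMongo's `create_collection` and `list_collection_names` methods; the helper name `ensure_capped` and its default size are illustrative only.

```python
def ensure_capped(db, name, size_bytes=100_000_000):
    """Create `name` as a capped collection unless it already exists.

    `db` is expected to be a pymongo Database (or any object exposing the
    same `list_collection_names` / `create_collection` methods).
    Returns True if the collection was created, False if it was already there.
    """
    if name in db.list_collection_names():
        return False
    # capped=True fixes the collection's size in bytes; once it is full,
    # new inserts overwrite the oldest documents.
    db.create_collection(name, capped=True, size=size_bytes)
    return True
```

With a real connection this would be called as `ensure_capped(MongoClient().my_db, "messages")`.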
Ruby
coll = db.collection('my_collection')
cursor = Mongo::Cursor.new(coll, :tailable => true)
loop do
  if doc = cursor.next_document
    puts doc
  else
    sleep 1
  end
end
PHP
$mongo = new Mongo();
$db = $mongo->selectDB('my_db');
$coll = $db->selectCollection('my_collection');
$cursor = $coll->find()->tailable(true);
while (true) {
    if ($cursor->hasNext()) {
        $doc = $cursor->getNext();
        print_r($doc);
    } else {
        sleep(1);
    }
}
Python (by Robert Stewart)
from pymongo import Connection
import time

db = Connection().my_db
coll = db.my_collection
cursor = coll.find(tailable=True)
while cursor.alive:
    try:
        doc = cursor.next()
        print doc
    except StopIteration:
        time.sleep(1)
Perl (by Max)
use 5.010;
use strict;
use warnings;
use MongoDB;

my $db = MongoDB::Connection->new;
my $coll = $db->my_db->my_collection;
my $cursor = $coll->find->tailable(1);
for (;;)
{
    if (defined(my $doc = $cursor->next))
    {
        say $doc;
    }
    else
    {
        sleep 1;
    }
}
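The Python example above uses the legacy `Connection` class and Python 2 `print`. As a hedged update, here is roughly how the same loop might look with PyMongo 3 or later, where `MongoClient` and `CursorType.TAILABLE_AWAIT` replace the older calls. The `follow` helper and its `idle_sleep` parameter are made-up names for illustration.

```python
import time


def follow(cursor, handle, idle_sleep=1.0):
    """Pass every document from a tailable cursor to `handle` until it dies."""
    while cursor.alive:
        for doc in cursor:
            handle(doc)
        # No new documents arrived within the await window; pause, then retry.
        time.sleep(idle_sleep)


def main():
    # Imported here so `follow` can be exercised without PyMongo installed.
    from pymongo import CursorType, MongoClient

    coll = MongoClient().my_db.my_collection  # must be a capped collection
    cursor = coll.find(cursor_type=CursorType.TAILABLE_AWAIT)
    follow(cursor, print)
```

Calling `main()` against a running server prints each document as it is inserted; swapping `print` for a function that spawns a worker gives the job-queue behaviour asked about.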
Ruby/Node.js Tutorial which walks you through creating an application that listens to inserts in a MongoDB capped collection.
An article talking about tailable cursors in more detail.
PHP, Ruby, Python, and Perl examples of using tailable cursors.
As of MongoDB 3.6 there is a new notifications API called Change Streams which you can use for this. See this blog post for an example. Example from it:
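# Released PyMongo exposes this as watch() and fullDocument
# (the snippet originally used changes() and newDocument).
cursor = client.my_db.my_collection.watch([
    {'$match': {
        'operationType': {'$in': ['insert', 'replace']}
    }},
    {'$match': {
        'fullDocument.n': {'$gte': 1}
    }}
])

# Loops forever.
for change in cursor:
    print(change['fullDocument'])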
MongoDB 3.6 and later include change streams, which are essentially an API on top of the oplog that allows for trigger/notification-like use cases.
Here is a link to a Java example: http://mongodb.github.io/mongo-java-driver/3.6/driver/tutorials/change-streams/
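A Node.js example might look something like:
var MongoClient = require('mongodb').MongoClient;
// readConcernLevel is the connection-string option that sets the read concern.
MongoClient.connect("mongodb://localhost:22000/MyStore?readConcernLevel=majority")
    .then(function(client){
        let db = client.db('MyStore');
        let change_streams = db.collection('products').watch();
        // Log every change event on the products collection.
        change_streams.on('change', function(change){
            console.log(JSON.stringify(change));
        });
    })
    .catch(function(err){
        console.error(err);
    });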
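For the job-queue use case in the question, a change-stream consumer mostly needs to pick out inserts and hand each new document to a worker. Here is a minimal Python sketch, assuming PyMongo 3.6 or later against a replica set. `dispatch_inserts`, `spawn_worker`, `save_token`, and the `my_db.jobs` collection are hypothetical names; `watch`, `resume_after`, `operationType`, and `fullDocument` are the driver's own.

```python
def dispatch_inserts(changes, spawn_worker, save_token=lambda token: None):
    """Call `spawn_worker(doc)` for each insert event in `changes`.

    `changes` is any iterable of change-stream events. After each event its
    `_id` (the resume token) is passed to `save_token`, so a restarted
    consumer can continue from where it left off.
    """
    for change in changes:
        if change["operationType"] == "insert":
            spawn_worker(change["fullDocument"])
        save_token(change["_id"])


def main(resume_token=None):
    # Imported here so `dispatch_inserts` can be tested without PyMongo.
    from pymongo import MongoClient

    jobs = MongoClient("mongodb://localhost:27017").my_db.jobs
    pipeline = [{"$match": {"operationType": "insert"}}]
    # resume_after continues the stream after a previously saved token.
    with jobs.watch(pipeline, resume_after=resume_token) as stream:
        dispatch_inserts(stream, print)
```

Keeping the dispatch logic separate from the connection code means it can be tried out with a plain list of event dictionaries before pointing it at a live stream.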
What you are thinking of sounds a lot like triggers. MongoDB does not have any support for triggers; however, some people have "rolled their own" using some tricks. The key here is the oplog.
When you run MongoDB in a replica set, all of the MongoDB actions are logged to an operations log (known as the oplog). The oplog is basically just a running list of the modifications made to the data. Replica sets function by listening to changes on this oplog and then applying the changes locally.
Does this sound familiar?
I cannot detail the whole process here, as it is several pages of documentation, but the tools you need are available.
First, some write-ups on the oplog:
- Brief description
- Layout of the local collection (which contains the oplog)
You will also want to leverage tailable cursors. These will provide you with a way to listen for changes instead of polling for them. Note that replication uses tailable cursors, so this is a supported feature.
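As a rough illustration of how a "roll your own" trigger layer might sit on top of tailed oplog entries, here is a small Python sketch. The `Triggers` class and its `on` / `fire` methods are hypothetical; only the `op` and `ns` fields are taken from the oplog entry format.

```python
from collections import defaultdict

# Oplog operation codes mapped to readable event names.
OP_NAMES = {"i": "insert", "u": "update", "d": "delete"}


class Triggers:
    """Registry of callbacks keyed by (namespace, event name)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, namespace, event, callback):
        """Register `callback` for `event` on "<db>.<collection>"."""
        self._handlers[(namespace, event)].append(callback)

    def fire(self, entry):
        """Run the callbacks matching one oplog entry; return how many ran."""
        event = OP_NAMES.get(entry.get("op"))
        callbacks = self._handlers.get((entry.get("ns"), event), [])
        for callback in callbacks:
            callback(entry)  # each callback receives the whole oplog entry
        return len(callbacks)
```

Each entry read from a tailable cursor on the oplog would be passed to `fire`, so registering `triggers.on("my_db.jobs", "insert", start_worker)` behaves like an insert trigger on that collection (`start_worker` being your own function).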