Question
I have a MongoDB collection (used as a job queue) from which multiple processes read records using findAndModify. The findAndModify call searches for records where the active field is false and sets it to true, so that other processes do not read the same record.
The problem is that, looking at the logs, I see that different processes still read the same records. This seems to occur when two processes read from the queue at the same time. Is there any way to make sure that only one process reads from the collection at a time?
I am using Mongo 2.2.3 and pymongo 2.2.
Many thanks!
EDIT: The logs in question are:
worker.3 2013-03-18 23:57:45,434 default-worker-3 project_name INFO Queue job: job id: 5147a90f68e8fe0097002bdf
worker.3 2013-03-18 23:57:47,608 default-worker-3 project_name INFO Input: 14497 docs
worker.2 2013-03-18 23:57:45,440 default-worker-2 project_name INFO Queue job: job id: 5147a90f68e8fe0097002bdf
worker.2 2013-03-18 23:57:47,658 default-worker-2 project_name INFO Input: 14497 docs
As you can see, worker.3 and worker.2 read the same job from the queue (it has the same mongodb id for both workers).
The find_and_modify command:
query = {"active": False}
try:
return self.collection.find_and_modify(
query=query,
update={"$set": {"active": True}},
upsert=False,
sort={"added_on": 1},
limit=1
)
except Exception, exc:
LOGGER.exception(exc)
Answer 1:
First, make a straightforward query to MongoDB to get a single job record:
job = db.coll.find(query).limit(1)
Next, update the record, matching on the job's _id and requiring that active is still false:
update_response = db.coll.update(
    {_id: job._id, active: false},
    {$set: {active: true}},
    false,   // upsert
    false    // multi
)
If the job has already been updated by another process, the update will not succeed because of the active: false constraint in the query. Check in update_response that a record was actually updated:
if update_response['n'] > 0 and update_response['updatedExisting'] == True:
    return job
If your update didn't succeed then fetch another job and try again.
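Put together, a minimal pymongo 2.x-style sketch of this two-step claim could look like the following; the claim_job helper and the coll handle are illustrative assumptions, not code from the original post.
def claim_job(coll):
    while True:
        # Step 1: plain read of one candidate job that is not yet active.
        job = coll.find_one({"active": False}, sort=[("added_on", 1)])
        if job is None:
            return None  # queue is empty

        # Step 2: try to claim it; the "active": False condition makes the
        # update a no-op if another worker claimed the job first.
        response = coll.update(
            {"_id": job["_id"], "active": False},
            {"$set": {"active": True}},
            upsert=False,
            multi=False,
            safe=True,  # ask the server to report the write result
        )
        if response.get("n", 0) > 0 and response.get("updatedExisting"):
            return job
        # Another worker won the race; loop and try the next job.
The loop retries because a competing worker can win the race between step 1 and step 2; only the conditional update decides who actually owns the job.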
Answer 2:
Let me make something very clear - it is NOT possible for two different findAndModify commands to return the same document in your scenario.
Not possible. Here are the first few lines of the method that performs the work:
Lock::DBWrite lk( ns );
Client::Context cx( ns );
BSONObj doc;
bool found = Helpers::findOne( ns.c_str() , queryOriginal , doc );
Note line 122 where a WRITE lock is acquired BEFORE the find.
https://github.com/mongodb/mongo/blob/master/src/mongo/db/commands/find_and_modify.cpp#L122
It is not possible for two processes to hold the write lock at the same time. It seems more likely that something different is going on: multiple documents sharing the same _id value, the same document being handed to two threads by the code that calls find_and_modify, or something else we don't know enough about to speculate on.
FindAndModify is an atomic command which holds an exclusive write lock for the duration of its execution. My suggestion would be to get to the bottom of what the logs are really showing you rather than making changes to your code based on incorrect/unwarranted assumptions about what must be going on.
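If you want to convince yourself of that, a quick sanity check along these lines should never report a duplicate _id. This is only a sketch: it assumes coll is a pymongo Collection for the queue, and the document count and thread count are arbitrary.
import threading

coll.remove({})                                   # start from an empty queue
coll.insert([{"i": i, "active": False} for i in range(100)], safe=True)

claimed, lock = [], threading.Lock()

def worker():
    while True:
        # Each call atomically claims one inactive job (or returns nothing).
        job = coll.find_and_modify(
            query={"active": False},
            update={"$set": {"active": True}},
        )
        if not job:
            return
        with lock:
            claimed.append(job["_id"])

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# If find_and_modify were not atomic we would see duplicate _ids here.
assert len(claimed) == len(set(claimed)) == 100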
Answer 3:
Split your "locking" in two stages instead. First update record querying first object that don't have lock timestamp or have expired timestamp and setting new lock. Then just find same object using lock data you've just established.
Source: https://stackoverflow.com/questions/15505176/a-way-to-ensure-exclusive-reads-in-mongodbs-findandmodify