Question
Updated for clarity: I need advice on insert/append performance for a capped collection. I have two Python scripts running:
(1) Tailing the cursor:

while WSHandler.cursor.alive:
    try:
        doc = WSHandler.cursor.next()
        self.render(doc)
    except StopIteration:
        time.sleep(1)  # no new documents yet; back off before retrying
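For context, a minimal sketch of how such a tailable cursor can be created with pymongo; the database/collection names and the cap size are illustrative assumptions, not taken from the question:

import pymongo
from pymongo import CursorType

client = pymongo.MongoClient()  # assumes a local mongod

# Tailable cursors only work on capped collections.
db = client.test
if "tweets" not in db.list_collection_names():
    db.create_collection("tweets", capped=True, size=100 * 2**20)  # 100 MB cap

# TAILABLE_AWAIT keeps the cursor open and blocks briefly waiting for
# new documents; the while cursor.alive loop above then consumes them.
cursor = db.tweets.find(cursor_type=CursorType.TAILABLE_AWAIT)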
(2) Inserting like so:

def on_data(self, data):  # Tweepy stream callback
    if len(data) > 5:
        data = json.loads(data)
        coll.insert(data)  # insert into MongoDB
        #print(coll.count())
        #print(data)
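For completeness, a sketch of what the whole listener might look like, assuming the pre-4.0 tweepy StreamListener API (which matches the on_data signature above) and a local mongod; the collection name is an illustrative assumption:

import json
import pymongo
import tweepy

coll = pymongo.MongoClient().test.tweets  # assumed database/collection names

class TweetListener(tweepy.StreamListener):
    def on_data(self, data):
        if len(data) > 5:  # skip the keep-alive newlines the stream sends
            doc = json.loads(data)
            coll.insert_one(doc)  # insert_one is the modern replacement for insert()
        return True  # returning True keeps the stream connected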
This runs fine for a while (at 50 inserts/second). Then, after 20-60 seconds, it stumbles, hits the CPU ceiling (though it was running at 20% before), and never recovers. My mongostat numbers take a dive (the dive is shown below).
Mongostat output:
The CPU is now choked by the processes doing the insertion (at least according to htop).
When I run the Tweepy lines above with print(data) instead of inserting into the db (coll.insert(data)), everything runs along fine at 15% CPU use.
What I see in mongostat:

- res keeps climbing. (Though the clog may happen at 40m, and it can also run fine at 100m.)
- flushes do not seem to interfere.
- locked % is stable at 0.1%. Would this lead to clogging eventually?
(I'm running on an AWS micro instance, with pymongo.)
Answer 1:
I would suggest using mongostat while running your tests. There are many things that could be wrong, but mongostat will give you a good indication.
http://docs.mongodb.org/manual/reference/mongostat/
The first two things I would look at are the lock percentage and the data throughput. With reasonable throughput on dedicated machines, I typically reach 1000-2000 updates/inserts per second before suffering any degradation. This has been the case for several large production deployments I have worked with.
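If you prefer to capture the same counters programmatically rather than watching the mongostat console, one option is to poll the serverStatus command from pymongo. A sketch, assuming a local mongod; the two fields shown correspond roughly to mongostat's res and lock columns:

import time
import pymongo

client = pymongo.MongoClient()  # assumes a local mongod

while True:
    status = client.admin.command("serverStatus")
    resident_mb = status["mem"]["resident"]       # resident memory in MB, like mongostat's res
    queue = status["globalLock"]["currentQueue"]  # operations waiting on the global lock
    print(resident_mb, queue["readers"], queue["writers"])
    time.sleep(1)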
Source: https://stackoverflow.com/questions/12698949/efficiency-when-inserting-into-mongodb-pymongo