I would like to perform a small operation on all entities of a specific kind and rewrite them to the datastore. I currently have 20,000 entities of this kind but would like a solution that would scale to any amount.
What are my options?
Use a mapper - this is part of the MapReduce framework, but you only want the first component, map, as you don't need the shuffle/reduce step if you're simply mutating datastore entities.
Daniel is correct, but if you don't want to mess up with the mapper, that requires you to add another library to your app you can do it using Task Queues or even simpler using the deferred library that is included since SDK 1.2.3.
20.000 entities it's not that dramatic and I assume that this task is not going be performed in regular basis (but even if it does, it is feasible).
Here is an example using NDB and the deferred library (you can easily do that using DB, but consider switching to NDB anyway if you are not already using it). It's a pretty straight forward way, but without caring much about the timeouts:
def update_model(limit=1000):
more_cursor = None
more = True
while more:
model_dbs, more_cursor, more = Model.query().fetch_page(limit, start_cursor=more_cursor)
for model_db in model_dbs:
model_db.updated = True
ndb.put_multi(model_dbs)
logging.info('### %d entities were updated' % len(model_dbs))
class UpdateModelHandler(webapp2.RequestHandler):
def get(self):
deferred.defer(update_model, _queue='queue')
self.response.headers['Content-Type'] = 'text/html'
self.response.out.write('The task has been started!')
来源:https://stackoverflow.com/questions/11203500/updating-a-large-number-of-entities-in-a-datastore-on-google-app-engine