I'm using the blobstore to back up and recover entities in CSV format. The process works well for all of my smaller models. However, once I start working with models that have more entities, the process runs out of memory.
You'd be better off not doing the batching yourself, but just iterating over the query. The iterator will pick a batch size (probably 20) that should be adequate:
q = model.all()
for entity in q:
    # The iterator fetches results in batches behind the scenes,
    # so only a small window of entities is in memory at a time.
    row = get_dict_for_entity(entity)
    writer.writerow(row)
This avoids re-running the query with an ever-increasing offset, which is slow and causes quadratic behavior in the datastore.
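For contrast, the offset-based batching being discouraged looks roughly like the sketch below (BATCH_SIZE is a placeholder; writer and get_dict_for_entity are the same helpers as above): each fetch() re-executes the query and skips offset results before returning anything, so the total datastore work grows quadratically with the number of entities.

# Anti-pattern sketch: manual batching with an ever-growing offset.
# Each fetch() re-runs the query and discards `offset` results before
# returning a batch, so the total work is roughly O(n^2).
BATCH_SIZE = 100
offset = 0
while True:
    batch = model.all().fetch(BATCH_SIZE, offset=offset)
    if not batch:
        break
    for entity in batch:
        writer.writerow(get_dict_for_entity(entity))
    offset += BATCH_SIZE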
An oft-overlooked fact about memory usage is that the in-memory representation of an entity can use 30-50 times as much RAM as its serialized form; e.g. an entity that is 3KB on disk might use 100KB in RAM. (The exact blow-up depends on many factors; it's worse if you have lots of properties with long names and small values, and worse still for repeated properties with long names.)
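If you want a rough sense of that blow-up for your own data, you can at least measure the serialized size of an entity and budget RAM from there. A minimal sketch, assuming the old db API's model_to_protobuf helper and an already-fetched entity; the 40x multiplier is just a planning assumption taken from the numbers above:

from google.appengine.ext import db

# Size of the entity's serialized protobuf form, in bytes.
serialized_bytes = len(db.model_to_protobuf(entity).Encode())

# The decoded in-memory object is much larger; Python offers no exact
# way to measure it, so budget roughly 30-50x the serialized size per
# entity kept alive at once (40x used here as a midpoint assumption).
estimated_ram_bytes = serialized_bytes * 40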